solo-io / gloo

The Feature-rich, Kubernetes-native, Next-Generation API Gateway Built on Envoy
https://docs.solo.io/
Apache License 2.0

Apple Silicon Support for local development #5471

Open kevin-shelaga opened 2 years ago

kevin-shelaga commented 2 years ago

Is your feature request related to a problem? Please describe.

Apple Silicon support for local development.

Currently the gateway proxy fails to start and crashloops

[2021-10-08 23:04:50.803][10][critical][assert] [external/envoy/source/common/signal/signal_action.cc:62] assert failure: sigaltstack(&stack, &previous_altstack_) == 0.


chrisgaun commented 2 years ago

K3d does not work on M1 chips

kevin-shelaga commented 2 years ago

Docker Desktop for Apple silicon 4.3.0 seems to fix this. Closing.

kdorosh commented 2 years ago

Reopening, as local devs who are onboarding continue to hit issues here (e.g. building our test assets and running kind load).

the mesh team has already solved a lot of this pain and we can copy a lot of their solution

nfuden commented 2 years ago

Do you think publishing arm builds of glooctl would fall under this umbrella, or is the definition of done here scoped to having workarounds for every common dev task?

kdorosh commented 2 years ago

The immediate concern is to unblock new hires onboarding to the team, but I wouldn't consider this issue done until we handle both cases

EItanya commented 2 years ago

Unfortunately, the issue is a bit different than in mesh because of how gloo edge builds its proxy container. In mesh we were lucky enough to not have to build off of any x86 images. Therefore I think we should probably open an issue in envoy-gloo to also build/release arm builds from there.

kevin-shelaga commented 2 years ago

I'm curious what issues people are running into with M1 Macs. I've been building/running x86 and arm based containers without issue on my M1 Max.

kdorosh commented 2 years ago

I too would appreciate extra clarity here, but I believe it mostly pertains to building release assets locally and using them in local testing. Some extra thoughts can be found here https://soloio.slab.com/posts/m-1-local-development-intro-il9oevhq

jackstine commented 2 years ago

When building the docker images in solo-projects for M1 chips please reference changes made in the following solo-projects branch M1-fix-for-docker-image.

jackstine commented 2 years ago

  1. envoy-gloo-ee
    1. Update the image in cloudbuilders.
    2. Update the build image envoyproxy/envoy-build-ubuntu so that it is built for arm64. This image will be used here.
    3. We will need an arm64 build of gcr.io/$PROJECT_ID/envoy-build-ubuntu.
    4. The build path ./linux/amd64/build_envoy_release_stripped/envoy needs to be based on the arch; for arm64 builds it will be arm64. So make this dynamic.
    5. We need an arm64 version of frolvlad/alpine-glibc. None currently exists, so we will have to build one here. This will apply to the following file.
      1. This does throw errors when building, but it finishes.
    6. Update the ENVOY-IMAGE tags in s-p to allow for both arm64 and amd64.
  2. envoy-gloo
    1. Update the build process, similar to the cloudbuild.yaml.
    2. Update the image frolvlad/alpine-glibc.
    3. Make the build path dynamic, similar to envoy-gloo-ee.

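
To illustrate what "make the build path dynamic" could look like, here is a minimal sketch in Python. It is not taken from the envoy-gloo or envoy-gloo-ee build scripts: the function names and the alias table are hypothetical, and the only thing carried over from the list above is the path layout ./linux/<arch>/build_envoy_release_stripped/envoy with arch being amd64 or arm64.

```python
import platform

# Hypothetical alias table (an assumption, not from the repos): maps the
# machine names reported by the OS to the arch directory in the build path.
ARCH_ALIASES = {
    "x86_64": "amd64",
    "amd64": "amd64",
    "arm64": "arm64",
    "aarch64": "arm64",
}


def target_arch(machine=None):
    """Return 'amd64' or 'arm64' for the given (or current) machine name."""
    name = (machine or platform.machine()).lower()
    if name not in ARCH_ALIASES:
        raise ValueError(f"unsupported architecture: {name}")
    return ARCH_ALIASES[name]


def envoy_build_path(machine=None):
    """Build the arch-specific path instead of hard-coding linux/amd64."""
    return f"./linux/{target_arch(machine)}/build_envoy_release_stripped/envoy"


if __name__ == "__main__":
    # Prints the path for the machine this script runs on.
    print(envoy_build_path())
```

The same lookup could presumably drive the choice of image tag for arm64 versus amd64, but how the tags are named is not specified in this thread.
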
EItanya commented 2 years ago

Where are we going to actually build the binaries? That will determine the majority of the work. Does GCP support ARM now?

jackstine commented 2 years ago

No, it does not look like it. None of the compute options listed here are ARM, and I haven't found any from GCP. They might announce it soon; Google I/O is in May?

Also, building the base image frolvlad/alpine-glibc causes problems. Aaron and I have found a workaround using ubuntu as the base image for now.

jackstine commented 2 years ago

Here are a few things to do for this epic

jackstine commented 2 years ago

Here is a list of outstanding failures that occur in s-p


dlp tests xslt transformer [It] will transform xml -> json

 Message: "admission webhook \"gloo.gloo-system.svc\" denied the request: resource incompatible with current Gloo snapshot: [Validating v1.VirtualService failed: validating *v1.VirtualService name:\"vs\" namespace:\"gloo-system\": failed to validate Proxy with Gloo validation server: VirtualHost Error: ProcessingError. Reason: invalid virtual host [gloo-system_vs]: envoy validation mode output: Caught Segmentation fault, suspect faulting address 0xd0\nBacktrace (use tools/stack_decode.py to get line numbers):\nEnvoy version: e81851c7ba191e99ad4a9e13dfea1f7af42b7323/1.21.1/Distribution/RELEASE/BoringSSL\n#0: [0x4005f6c930]\n, error: signal: segmentation fault]",

resource for BoringSSL

jackstine commented 2 years ago

Here is a list of outstanding failures that occur in gloo.

E0505 13:45:50.362259   96093 portforward.go:406] an error occurred forwarding 60906 -> 9091: error forwarding port 9091 to pod 0ee8e81cc58c36b97883e10adb77cb18ab102a48c326134ee01eb0e57d69a50b, uid : failed to execute portforward in network namespace "/var/run/netns/cni-1a6947dd-c169-dc99-5e6c-29460769424f": failed to connect to localhost:9091 inside namespace "0ee8e81cc58c36b97883e10adb77cb18ab102a48c326134ee01eb0e57d69a50b", IPv4: dial tcp4 127.0.0.1:9091: connect: connection refused IPv6 dial tcp6 [::1]:9091: connect: connection refused 
E0505 13:45:50.362646   96093 portforward.go:234] lost connection to pod

fabioaraujopt commented 2 years ago

  1. envoy-gloo-ee
    1. Update the image in cloudbuilders.
    2. Update the build image envoyproxy/envoy-build-ubuntu so that it is built for arm64. This image will be used here.
    3. We will need an arm64 build of gcr.io/$PROJECT_ID/envoy-build-ubuntu.
    4. The build path ./linux/amd64/build_envoy_release_stripped/envoy needs to be based on the arch; for arm64 builds it will be arm64. So make this dynamic.
    5. We need an arm64 version of frolvlad/alpine-glibc. None currently exists, so we will have to build one here. This will apply to the following file.
      1. This does throw errors when building, but it finishes.
    6. Update the ENVOY-IMAGE tags in s-p to allow for both arm64 and amd64.
  2. envoy-gloo
    1. Update the build process, similar to the cloudbuild.yaml.
    2. Update the image frolvlad/alpine-glibc.
    3. Make the build path dynamic, similar to envoy-gloo-ee.

Your links are all broken.

jackstine commented 2 years ago

Adding outstanding work on this. This has popped up in recent weeks.

ianmacclancy commented 2 years ago

Team meeting to summarize current issues - https://docs.google.com/document/d/16HZRkq-y3sq7olz8WCoYc2phdUMjyTraTh2R5fBq2Bo/edit?usp=sharing

sam-heilbron commented 1 year ago

The linked PRs have merged into the main branches on Gloo OSS and Gloo Enterprise. Now developers should be able to follow the same local development process, regardless of their machine. I'm moving this back to backlog, as the additional Apple Silicon support for all images will require further effort but is not something that is needed right now.

github-actions[bot] commented 3 months ago

This issue has been marked as stale because of no activity in the last 180 days. It will be closed in the next 180 days unless it is tagged "no stalebot" or other activity occurs.