aws / eks-anywhere

Run Amazon EKS on your own infrastructure 🚀
https://anywhere.eks.amazonaws.com
Apache License 2.0

Unable to install EKS Anywhere with the kindnetd CNI because image tags in public.ecr.aws have been changed. #7264

Open eugenejen opened 8 months ago

eugenejen commented 8 months ago

What happened: kindnetd image tag has been changed on public.ecr.aws from v0.18.0 to v0.18.0-eks-a-45

What you expected to happen: This causes the EKS Anywhere installation on Docker to fail when the Cilium CNI has trouble starting.

How to reproduce it (as minimally and precisely as possible): On a clean Docker setup (reset to factory settings), run:

    CLUSTER_NAME=mgmt
    eksctl anywhere generate clusterconfig $CLUSTER_NAME \
      --provider docker > $CLUSTER_NAME.yaml

Then update mgmt.yaml to use kindnetd instead of cilium, and run:

    eksctl anywhere create cluster -f $CLUSTER_NAME.yaml

Then use docker ps to inspect the hung installation; you will see that the CNI image could not be pulled.
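
The manual edit of the cluster spec in the steps above could be scripted. This is a minimal sketch, assuming the generated spec declares the CNI as a line of the form `cilium: {}` under `spec.clusterNetwork.cniConfig` (specs generated by other releases may be laid out differently). The helper name `switch_cni_to_kindnetd` is hypothetical and is not part of eksctl.

```shell
# Hypothetical helper: switch the CNI entry in a generated cluster spec.
# Assumption: the spec contains a line "cilium: {}" (with any indentation).
switch_cni_to_kindnetd() {
  spec_file="$1"
  tmp_file="$(mktemp)" || return 1
  # Keep the original indentation; only the key name changes.
  if sed 's/^\([[:space:]]*\)cilium: {}[[:space:]]*$/\1kindnetd: {}/' \
      "$spec_file" > "$tmp_file"; then
    # Write back in place so the spec file keeps its permissions.
    cat "$tmp_file" > "$spec_file"
  fi
  rm -f "$tmp_file"
}
```

It would be called as `switch_cni_to_kindnetd "$CLUSTER_NAME.yaml"` between the generate and create steps. If the spec carries extra Cilium options instead of an empty `{}`, the pattern does not match and the file is left unchanged.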

Anything else we need to know?:

Environment:

robertlcx commented 8 months ago

I'm also seeing the exact same issue. I'm trying to use Kindnetd because I cannot bootstrap the cluster with Cilium networking on my M1 Mac.

jaxesn commented 8 months ago

I'll take a look at this; I can repro it as well.

@robertlcx you should be able to create clusters with cilium on M1 Macs, I usually work on M1 mac as well.

eugenejen commented 8 months ago

@jaxesn Just curious: I am using Docker for Mac on Intel, but there seems to be an issue running eBPF on Docker for Mac? The reason we use the kindnetd CNI is that I can get kind running, but I am having trouble running Cilium on Docker itself and also in EKS Anywhere (Cilium just crashes).

Is there any setting we need to adjust in Docker Desktop for Mac to enable eBPF?

jaxesn commented 8 months ago

Oh, that's odd; I don't believe so. I will try today on my M1 with Cilium to make sure that still works as expected. I'll see if someone can try on an Intel Mac to confirm as well.

What version of docker do you have installed?

eugenejen commented 8 months ago

@jaxesn I am using Docker Desktop for Mac (Intel):

Version: 4.26.0 (130397)
Engine: 24.0.7
Compose: v2.23.3-desktop.2
Credential Helper: v0.7.0
Kubernetes: v1.28.2

jaxesn commented 8 months ago

I am seeing the same BPF issue on my Mac as well. This is "newish"; we've def seen this work in the past. I'll do a little poking around to see if there is a workaround.

@abhay-krishna also fixed the kindnetd manifest for our 0.18.x releases so you should be able to create docker clusters using kindnetd now.

jaxesn commented 8 months ago

I think this is the same issue: https://github.com/kubernetes/minikube/issues/17780

Try downgrading to the 4.25.x release of docker for mac.
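
To tell whether a machine is already on the 4.25.x line before retrying, the installed version string can be checked. This is a rough sketch: the two function names are hypothetical, and it assumes the platform string has the form `Docker Desktop 4.26.0 (130397)`, matching the version reported earlier in this thread.

```shell
# Extract "major.minor" from a platform string such as
# "Docker Desktop 4.26.0 (130397)". Prints nothing if the format differs.
desktop_minor_version() {
  printf '%s\n' "$1" |
    sed -n 's/^Docker Desktop \([0-9][0-9]*\.[0-9][0-9]*\)\..*$/\1/p'
}

# Succeeds only for the 4.25.x line suggested as a workaround in this thread.
is_suggested_desktop_version() {
  [ "$(desktop_minor_version "$1")" = "4.25" ]
}
```

On a machine with Docker Desktop, the string could likely be obtained with `docker version --format '{{.Server.Platform.Name}}'` (an assumption about what that field contains on Desktop installs) and passed to `is_suggested_desktop_version`.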

abhay-krishna commented 8 months ago

@eugenejen @robertlcx were you able to get further in your cluster creation?

eugenejen commented 8 months ago

@abhay-krishna I downgraded to 4.25.x, but now I am hitting another issue, https://github.com/aws/eks-anywhere/issues/6678, and am unable to resolve it.

abhay-krishna commented 8 months ago

@eugenejen did you try the workarounds suggested in that issue, particularly switching from VirtioFS to gRPC Fuse? You will find this option under the General tab in the Docker Desktop Settings menu.

eugenejen commented 8 months ago

@abhay-krishna I have verified that after switching to gRPC Fuse the issue is resolved and Cilium is running.

robertlcx commented 6 months ago

@eugenejen @abhay-krishna managed to fix this by downgrading to 4.25.x, switching back to cilium, and using gRPC Fuse instead.

Up until a couple of days ago, my fix was running an older version of eksctl and eks-anywhere, but some of the older images have now been yanked from their Docker registry, so that no longer works.