aws / amazon-vpc-cni-k8s

Networking plugin repository for pod networking in Kubernetes using Elastic Network Interfaces on AWS
Apache License 2.0

[ARM64 EKS Graviton] calico fails to deploy. #1047

Closed manjo-git closed 4 years ago

manjo-git commented 4 years ago

Deploying Calico on EKS (Graviton m6g ARM64 servers) fails because calico.yaml references images that are not multi-arch.

$ kubectl logs -n kube-system calico-node-2bk59
standard_init_linux.go:211: exec user process caused "exec format error"
$ kubectl logs -n kube-system calico-typha-6587dd4b4c-692xn
standard_init_linux.go:211: exec user process caused "exec format error"

The AWS doc for Calico needs fixing: https://docs.aws.amazon.com/eks/latest/userguide/calico.html

The following changes to calico.yaml are needed for it to deploy on Graviton-based EKS without errors.

--- calico_old.yaml 2020-06-22 12:22:52.992354791 -0500
+++ calico.yaml 2020-06-22 12:14:51.418249199 -0500
@@ -32,7 +32,7 @@ spec:
         # container programs network policy and routes on each
         # host.
         - name: calico-node
-          image: quay.io/calico/node:v3.13.0
+          image: calico/node:v3.13.0
           env:
             # Use Kubernetes API as the backing datastore.
             - name: DATASTORE_TYPE
@@ -546,7 +546,7 @@ spec:
       securityContext:
         fsGroup: 65534
       containers:
-        - image: quay.io/calico/typha:v3.13.0
+        - image: calico/typha:v3.13.0
           name: calico-typha
           ports:
             - containerPort: 5473
@@ -683,7 +683,7 @@ spec:
       nodeSelector:
         beta.kubernetes.io/os: linux
       containers:
-        - image: k8s.gcr.io/cluster-proportional-autoscaler-amd64:1.7.1
+        - image: k8s.gcr.io/cluster-proportional-autoscaler-arm64:1.7.1
           name: autoscaler
           command:
             - /cluster-proportional-autoscaler
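
As a minimal sketch, the three image substitutions in the diff above could also be applied with sed instead of editing the file by hand. The function name `patch_calico_images` is hypothetical, and the patterns assume the manifest references the images exactly as shown in the diff.

```shell
# Hypothetical helper: rewrite the three image references from the diff above.
# Reads a manifest on stdin and writes the patched manifest to stdout.
patch_calico_images() {
  sed -e 's#quay\.io/calico/node:#calico/node:#' \
      -e 's#quay\.io/calico/typha:#calico/typha:#' \
      -e 's#cluster-proportional-autoscaler-amd64:#cluster-proportional-autoscaler-arm64:#'
}
```

For example: `patch_calico_images < calico_old.yaml > calico.yaml`. Note that the autoscaler substitution pins the manifest to arm64, so the result would not suit a cluster with amd64 nodes.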

After applying these changes to calico.yaml, I am able to deploy Calico successfully.

NAMESPACE     NAME                                                  READY   STATUS    RESTARTS   AGE
kube-system   aws-node-4lcgh                                        1/1     Running   0          14m
kube-system   aws-node-crzzm                                        1/1     Running   0          14m
kube-system   aws-node-dmcrj                                        1/1     Running   0          14m
kube-system   calico-node-874n8                                     1/1     Running   0          26s
kube-system   calico-node-lxc56                                     1/1     Running   0          26s
kube-system   calico-node-zs4qf                                     1/1     Running   0          26s
kube-system   calico-typha-7b7d9b477d-p9dpv                         1/1     Running   0          24s
kube-system   calico-typha-horizontal-autoscaler-6d5fc6fd94-s9b7t   1/1     Running   0          24s
kube-system   coredns-59dd559b88-47c8b                              1/1     Running   0          29m
kube-system   coredns-59dd559b88-982pt                              1/1     Running   0          29m
kube-system   kube-proxy-dn6xx                                      1/1     Running   0          14m
kube-system   kube-proxy-dzhnd                                      1/1     Running   0          14m
kube-system   kube-proxy-k6b7h                                      1/1     Running   0          14m

manjo-git commented 4 years ago

I believe similar issues exist in the Stars policy demo section of the same doc, which could use similar fixes.

kubectl apply -f https://docs.projectcalico.org/v3.3/getting-started/kubernetes/tutorials/stars-policy/manifests/00-namespace.yaml
kubectl apply -f https://docs.projectcalico.org/v3.3/getting-started/kubernetes/tutorials/stars-policy/manifests/01-management-ui.yaml
kubectl apply -f https://docs.projectcalico.org/v3.3/getting-started/kubernetes/tutorials/stars-policy/manifests/02-backend.yaml
kubectl apply -f https://docs.projectcalico.org/v3.3/getting-started/kubernetes/tutorials/stars-policy/manifests/03-frontend.yaml
kubectl apply -f https://docs.projectcalico.org/v3.3/getting-started/kubernetes/tutorials/stars-policy/manifests/04-client.yaml

It looks like the demo images are missing arm64 support.

$ kubectl get pods --all-namespaces
NAMESPACE       NAME                                                  READY   STATUS             RESTARTS   AGE
client          client-jxvth                                          0/1     CrashLoopBackOff   4          2m12s
kube-system     aws-node-4lcgh                                        1/1     Running            0          44m
kube-system     aws-node-crzzm                                        1/1     Running            0          44m
kube-system     aws-node-dmcrj                                        1/1     Running            0          44m
kube-system     calico-node-874n8                                     1/1     Running            0          30m
kube-system     calico-node-lxc56                                     1/1     Running            0          30m
kube-system     calico-node-zs4qf                                     1/1     Running            0          30m
kube-system     calico-typha-7b7d9b477d-p9dpv                         1/1     Running            0          30m
kube-system     calico-typha-horizontal-autoscaler-6d5fc6fd94-s9b7t   1/1     Running            0          30m
kube-system     coredns-59dd559b88-47c8b                              1/1     Running            0          60m
kube-system     coredns-59dd559b88-982pt                              1/1     Running            0          60m
kube-system     kube-proxy-dn6xx                                      1/1     Running            0          44m
kube-system     kube-proxy-dzhnd                                      1/1     Running            0          44m
kube-system     kube-proxy-k6b7h                                      1/1     Running            0          44m
management-ui   management-ui-p2jbw                                   0/1     CrashLoopBackOff   4          2m33s
stars           backend-m5tjn                                         0/1     CrashLoopBackOff   4          2m31s
stars           frontend-9h954                                        0/1     CrashLoopBackOff   4          2m29s
$ kubectl logs -n management-ui management-ui-p2jbw
standard_init_linux.go:211: exec user process caused "exec format error"
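
To find which images a manifest pulls, so that each one can be checked for arm64 support, the image references could be extracted along these lines. This is a rough sketch: the function name `list_images` is hypothetical, and it assumes each `image:` key sits on its own line with nothing after the image reference.

```shell
# Hypothetical helper: print the unique container image references found in
# Kubernetes manifest text read from stdin.
list_images() {
  # Keep only "image:" lines (with or without a leading "- "), strip the key,
  # drop any surrounding quotes, and de-duplicate.
  sed -n 's/^[[:space:]]*-\{0,1\}[[:space:]]*image:[[:space:]]*//p' | tr -d "\"'" | sort -u
}
```

For example, `list_images < calico.yaml`, or pipe in one of the Stars demo manifests downloaded with curl. Each listed image can then be inspected for an arm64 entry, for instance with `docker manifest inspect` where that command is available.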

anguslees commented 4 years ago

(cluster-proportional-autoscaler multi-arch image is tracked in https://github.com/kubernetes-sigs/cluster-proportional-autoscaler/issues/89 btw)

jayanthvn commented 4 years ago

Thanks @manjo-git. For the Stars policy demo changes, I have raised an issue with Project Calico.

Ref - https://github.com/projectcalico/calico/issues/3717

jayanthvn commented 4 years ago

The arm64 Calico changes have been merged into aws/containers-roadmap (https://github.com/aws/containers-roadmap/pull/953). Closing the issue.