cybozu-go / coil

CNI plugin for Kubernetes designed for scalability and extensibility
Apache License 2.0

Bump code base to CNI 1.0.0 #181

Closed masap closed 2 years ago

masap commented 2 years ago

This PR does not update the version of coil itself. It only updates the code base.

  1. IPConfig.Version has been removed, so a trick is used to determine the IP address version.
  2. ip.SetupVethWithName() can now receive the container MAC address.
  3. The type of protocolId is changed from int to netlink.RouteProtocol.
  4. Pin golang.org/x/sys to version v0.0.0-20210608053332-aa57babbf139 to avoid a failure of make generate.
  5. Run go mod tidy -compat=1.17 in coil-migrator/.
  6. Set cniVersion in the result to fix the coredns pod boot failure. The e2e test fails with this message and
    
    ------------------------------
    Coil
    should allow pods on different nodes to communicate
    /home/honma/git/coil/v2/e2e/coil_test.go:75
    [It] should allow pods on different nodes to communicate
    /home/honma/git/coil/v2/e2e/coil_test.go:75
    STEP: creating the default pool
    STEP: creating pods

    • Failure [301.185 seconds]
    Coil
    /home/honma/git/coil/v2/e2e/coil_test.go:21
      should allow pods on different nodes to communicate [It]
      /home/honma/git/coil/v2/e2e/coil_test.go:75

      Timed out after 300.001s.
      Expected success, but got an error:
          <*errors.errorString | 0xc000125040>: {
              s: "pod is not ready: httpd",
          }
          pod is not ready: httpd

      /home/honma/git/coil/v2/e2e/coil_test.go:119

then coredns fails with the message `result type supports [1.0.0] but unmarshalled CNIVersion is ""`:

    $ kubectl -n kube-system describe pod/coredns-f9fd979d6-5rf8h
    Name:                 coredns-f9fd979d6-5rf8h
    Namespace:            kube-system
    Priority:             2000000000
    Priority Class Name:  system-cluster-critical
    Node:                 coil-worker3/172.18.0.5
    Start Time:           Thu, 07 Oct 2021 17:14:29 +0900
    Labels:               k8s-app=kube-dns
                          pod-template-hash=f9fd979d6
    Annotations:
    Status:               Pending
    IP:
    IPs:
    Controlled By:        ReplicaSet/coredns-f9fd979d6
    Containers:
      coredns:
        Container ID:
        Image:          k8s.gcr.io/coredns:1.7.0
        Image ID:
        Ports:          53/UDP, 53/TCP, 9153/TCP
        Host Ports:     0/UDP, 0/TCP, 0/TCP
        Args:
          -conf
          /etc/coredns/Corefile
        State:          Waiting
          Reason:       ContainerCreating
        Ready:          False
        Restart Count:  0
        Limits:
          memory:  170Mi
        Requests:
          cpu:     100m
          memory:  70Mi
        Liveness:     http-get http://:8080/health delay=60s timeout=5s period=10s #success=1 #failure=5
        Readiness:    http-get http://:8181/ready delay=0s timeout=1s period=10s #success=1 #failure=3
        Environment:
        Mounts:
          /etc/coredns from config-volume (ro)
          /var/run/secrets/kubernetes.io/serviceaccount from coredns-token-w4848 (ro)
    Conditions:
      Type              Status
      Initialized       True
      Ready             False
      ContainersReady   False
      PodScheduled      True
    Volumes:
      config-volume:
        Type:      ConfigMap (a volume populated by a ConfigMap)
        Name:      coredns
        Optional:  false
      coredns-token-w4848:
        Type:        Secret (a volume populated by a Secret)
        SecretName:  coredns-token-w4848
        Optional:    false
    QoS Class:       Burstable
    Node-Selectors:  kubernetes.io/os=linux
    Tolerations:     CriticalAddonsOnly op=Exists
                     node-role.kubernetes.io/master:NoSchedule
                     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                     node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
    Events:
      Type     Reason                  Age                    From               Message
      Warning  FailedScheduling        2m39s (x2 over 2m39s)  default-scheduler  0/1 nodes are available: 1 node(s) had taint {node.kubernetes.io/not-ready: }, that the pod didn't tolerate.
      Warning  FailedScheduling        2m23s                  default-scheduler  0/2 nodes are available: 2 node(s) had taint {node.kubernetes.io/not-ready: }, that the pod didn't tolerate.
      Warning  FailedScheduling        2m23s (x2 over 2m23s)  default-scheduler  0/4 nodes are available: 4 node(s) had taint {node.kubernetes.io/not-ready: }, that the pod didn't tolerate.
      Warning  FailedScheduling        2m1s (x2 over 2m12s)   default-scheduler  0/4 nodes are available: 2 node(s) had taint {node.kubernetes.io/not-ready: }, that the pod didn't tolerate, 2 node(s) had taint {test: }, that the pod didn't tolerate.
      Normal   Scheduled               110s                   default-scheduler  Successfully assigned kube-system/coredns-f9fd979d6-5rf8h to coil-worker3
      Warning  FailedCreatePodSandBox  100s                   kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "8a9207920dad5b3f576abe408f447b320fffc64c6f226f5d5ca95b79a6556053": failed to allocate address; aborting new block request: context deadline exceeded
      Warning  FailedCreatePodSandBox  75s                    kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "111fd4bff6752593254587ac47f20b58f698121392fad974f4c8b25cb55fc99d": failed to allocate address; aborting new block request: context deadline exceeded
      Warning  FailedCreatePodSandBox  63s                    kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "54ee329e6f4bd8c49990181bcc73329399398b696dc0129fd625cb8dab87b47a": failed to unmarshal result; result type supports [1.0.0] but unmarshalled CNIVersion is ""
      Warning  FailedCreatePodSandBox  49s                    kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "7eb17249a5655ded341da4d20bd1f89f1c417cb3b0b406084e21b4a15f52b351": failed to unmarshal result; result type supports [1.0.0] but unmarshalled CNIVersion is ""
      Warning  FailedCreatePodSandBox  35s                    kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "e8d17f2b6ef85ba079556c998fd52280a71bdffc789751f02b60617ed0d1474e": failed to unmarshal result; result type supports [1.0.0] but unmarshalled CNIVersion is ""
      Warning  FailedCreatePodSandBox  22s                    kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "2fec26ff441c64c380ab49c1514725b5c9440975ef539fec3a20938a9aa32221": failed to unmarshal result; result type supports [1.0.0] but unmarshalled CNIVersion is ""
      Warning  FailedCreatePodSandBox  8s                     kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "5a847a73b904ea22ce897a04f71e21702c6e53c983a359165d1bceda36e54a90": failed to unmarshal result; result type supports [1.0.0] but unmarshalled CNIVersion is ""



Signed-off-by: Masashi Honma <masashi.honma@gmail.com>

masap commented 2 years ago

Hello! I am new to this project. I have already run make test. Should I run any other tests?

ymmt2005 commented 2 years ago

Thank you! I have allowed this PR to run CI, so please see the results.

masap commented 2 years ago

OK. Thanks!

masap commented 2 years ago

The failure of make generate seems to be caused by this issue: https://githubmemory.com/repo/kubernetes-sigs/controller-tools/issues/613 . So I pinned golang.org/x/sys to version v0.0.0-20210608053332-aa57babbf139.

ysksuzuki commented 2 years ago

Thank you for your contribution! This PR is for #173, isn't it? I will take a look.

masap commented 2 years ago

> Is this PR for #173?

Yes, but just a part of it.

ysksuzuki commented 2 years ago

One more thing, we need to add 1.0.0 to support CNI v1.0.0 here

masap commented 2 years ago

Yes, it will also be fixed at 1.0.0.

masap commented 2 years ago

I mean in the next step.

ysksuzuki commented 2 years ago

Please resolve conflicts. We merged a security fix and another high-priority PR first. Sorry for bothering you.

masap commented 2 years ago

Rebased.

masap commented 2 years ago

The situation is a little bit complicated. Here is the summary.

  1. Adding the CNI version to the Result is needed to avoid this warning from the coredns pod: `result type supports [1.0.0] but unmarshalled CNIVersion is ""`.

  2. Remaining at 0.4.0 is needed to avoid another warning from the coredns pod: `unsupported CNI result version "1.0.0"`.
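
The two points above can be illustrated with a small sketch of the kind of check a runtime might apply to a plugin's JSON result. Everything here is an assumption for illustration: checkResultVersion, supportedResultVersions, and the error texts are hypothetical and are not the code of the CNI library or of any container runtime.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// supportedResultVersions is an assumed list for a runtime that does not
// yet understand 1.0.0 results.
var supportedResultVersions = []string{"0.3.0", "0.3.1", "0.4.0"}

// checkResultVersion is a hypothetical helper. It rejects a result whose
// cniVersion field is missing (point 1) or not in the supported list (point 2).
func checkResultVersion(raw []byte, supported []string) error {
	var r struct {
		CNIVersion string `json:"cniVersion"`
	}
	if err := json.Unmarshal(raw, &r); err != nil {
		return err
	}
	if r.CNIVersion == "" {
		return fmt.Errorf("result carries no cniVersion")
	}
	for _, v := range supported {
		if v == r.CNIVersion {
			return nil
		}
	}
	return fmt.Errorf("result version %q is not supported", r.CNIVersion)
}

func main() {
	results := []string{
		`{"ips": []}`,                        // no cniVersion set
		`{"cniVersion": "1.0.0", "ips": []}`, // newer than the runtime supports
		`{"cniVersion": "0.4.0", "ips": []}`, // accepted
	}
	for _, raw := range results {
		fmt.Println(raw, "->", checkResultVersion([]byte(raw), supportedResultVersions))
	}
}
```

Under these assumptions, only the third result passes, which matches the approach described in the summary: set the version in the Result, but keep it at 0.4.0 for now.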

ysksuzuki commented 2 years ago

Looks like this PR is ready to be merged. Do you still have any concerns?

masap commented 2 years ago

No, I do not.

ysksuzuki commented 2 years ago

Thank you for your contribution!