k8snetworkplumbingwg / multus-cni

A CNI meta-plugin for multi-homed pods in Kubernetes
Apache License 2.0

multus upgrade from 3.x to 4.x (thin plugin) causes pods start up issues #1218

Closed rgaduput closed 5 months ago

rgaduput commented 8 months ago

What happened: After upgrading multus from v3.9 to the latest v4.0.2, all pods fail to start in the "Initialize" phase. This happens only when the plugin is upgraded; a fresh installation of multus v4.0.2 works fine. The plugin was upgraded to the thin version by applying the multus manifest file. (An upgrade from v3.9 to v3.9.3 also shows no issues.)

What you expected to happen: After the multus upgrade, pods should start normally without any issues.

How to reproduce it (as minimally and precisely as possible): Install K8S, Calico, Istio with CNI enabled, and multus v3.9, then test the pod creation below. Upgrade multus from 3.9 -> 4.0.2 (thin plugin) and try to create pods again.

kubectl create namespace test
kubectl label namespace test istio-injection=enabled
kubectl -n test create deployment nginx --image=nginx
#pod will fail to start on init

Anything else we need to know?: Please note that the environment uses the Istio mesh, version 1.17.x, with the Istio CNI feature instead of the init sidecar container.

Environment:

Exceptions from the pod description: attached as pod-description.txt

dougbtv commented 8 months ago

Looks like some pretty lengthy istio errors in the pod-description.txt

Command error output: xtables other problem: line 2 failed\"}\n{\"level\":\"error\",\"time\":\"2024-01-30T10:49:42.993601Z\",\"msg\":\"Failed to execute: iptables-restore --noflush /tmp/iptables-rules-1706611782988461011.txt2479629019, exit status 1\"}\n"
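
The excerpt above is JSON-escaped, which makes the individual log records hard to read. A minimal sketch for unescaping such output, assuming only that the text uses literal `\n` and `\"` sequences as shown; `unescape_log` is a hypothetical helper name, not part of multus, Istio, or kubectl:

```shell
#!/bin/sh
# unescape_log: read escaped log text on stdin and print it with
# literal \n turned into real newlines and \" turned into plain quotes.
# Hypothetical helper for illustration only.
unescape_log() {
    awk '{ gsub(/\\n/, "\n"); gsub(/\\"/, "\""); print }'
}

# Possible usage against the failing pod (pod name is a placeholder):
# kubectl -n test describe pod <pod-name> | unescape_log
```

With the records split onto separate lines, the `iptables-restore` failure and the rules file it names are easier to pick out.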

rgaduput commented 8 months ago

@dougbtv True, but I have checked the iptables file referenced in the exception and found no issues with it. Moreover, what we are trying to understand is why, even though the Istio version and config, the k8s cluster version and config, etc. remain the same, issues appear only when multus is upgraded from 3.9.x to 4.0.x. All other scenarios work fine. So we are trying to find out whether we missed config changes or anything else in this major multus upgrade.

multus upgrade 3.9 -> 3.9.3: Works
fresh installation of multus 4.0: Works
multus upgrade 3.9 -> 4.0.x: Fails

Because of this, I am not sure that it is actually an Istio issue.
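
Since only the in-place upgrade fails, one way to look for a missed config change is to fingerprint the CNI configuration directory on an upgraded node and on a freshly installed one, then diff the two results. A minimal sketch, assuming the default directory /etc/cni/net.d (pass another path if the cluster uses one); `cni_conf_fingerprint` is a hypothetical helper name, and nothing here is confirmed to be the cause of the failure:

```shell
#!/bin/sh
# cni_conf_fingerprint: print "<file name> <checksum> <size>" for every
# regular file directly inside a CNI config directory, in glob (sorted) order.
# Hypothetical helper; the default path is an assumption about the node.
cni_conf_fingerprint() {
    dir="${1:-/etc/cni/net.d}"
    for f in "$dir"/*; do
        # Skip subdirectories and an unmatched glob.
        [ -f "$f" ] || continue
        sum=$(cksum < "$f")
        printf '%s %s\n' "$(basename "$f")" "$sum"
    done
}

# Possible usage on each node, then compare the two files with diff:
# cni_conf_fingerprint > /tmp/cni-fingerprint.txt
```

A file that exists only on the upgraded node, or whose checksum differs from the fresh install, would be a reasonable first thing to inspect.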

github-actions[bot] commented 5 months ago

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 7 days.