k8snetworkplumbingwg / multus-cni

A CNI meta-plugin for multi-homed pods in Kubernetes
Apache License 2.0

Thin plugin readiness check does not apply to config generation. #1306

Open yifeng-cerebras opened 2 months ago

yifeng-cerebras commented 2 months ago

What happened: After a node reboot, the multus thin plugin failed to detect the master CNI (cilium), even though we specified the readiness file for cilium. We are using the "auto" policy, so it picks the first conf it finds while cilium is not yet ready.

The thick plugin handles this correctly: a missing readiness file also blocks config generation. Something similar could probably be done for the thin plugin.

What you expected to happen: The readiness file check should also apply to config generation for the thin plugin with the "auto" policy.

How to reproduce it (as minimally and precisely as possible): Delete the main CNI plugin and restart the thin multus pod.

Anything else we need to know?: We chose the thin plugin because we discovered a more critical issue with the thick plugin. If the multus daemon pod is force-killed while pods are initializing, those pods stay stuck at the init stage forever, even after the multus daemon pod recovers. This is probably worth a separate bug report. cc: @michaely-cb


dougbtv commented 1 month ago

> We chose to use thin plugin as we discovered a more critical issue for thick.

We'd like to know more about this as well.

dougbtv commented 1 month ago

If you can provide a procedure that we can use to both reproduce and test this, that would be very helpful.