What happened:
I work on the Calico team; a customer of ours was using the "thick" plugin deployment style and hit an outage. We traced the problem down to the fact that the thick plugin YAML sets a memory limit that is only big enough for multus itself. Calico's CNI plugin needs roughly another 50MB on top of that.
What you expected to happen:
Perhaps the thick plugin YAML should ship with no memory limit, or the docs should note that you need to increase the limit to make room for the delegate plugin.
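For illustration, a rough sketch of what raising the limit in the thick plugin daemonset could look like; the container name, request values, and the 200Mi figure are assumptions for the example, not tested recommendations:

```yaml
# Hypothetical excerpt from the thick plugin daemonset spec.
# Container name and numbers are illustrative, not the shipped values.
containers:
  - name: kube-multus
    resources:
      requests:
        cpu: "100m"
        memory: "50Mi"
      limits:
        cpu: "100m"
        # Leave headroom for the delegate plugin; Calico's CNI plugin
        # needs roughly another 50MB on top of multus itself.
        memory: "200Mi"
```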
How to reproduce it (as minimally and precisely as possible):
Use the thick plugin deployment, then run a delegate plugin that uses a lot of RAM. The multus daemon pod gets OOM-killed.
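As a hint for anyone reproducing this, the failure shows up in the pod status (from kubectl get pod <podname> -o yaml) as the container being terminated with reason OOMKilled; an illustrative fragment, with the container name assumed:

```yaml
# Illustrative pod status fragment once the memory limit is exceeded.
status:
  containerStatuses:
    - name: kube-multus
      lastState:
        terminated:
          exitCode: 137
          reason: OOMKilled
```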
Anything else we need to know?:
Proposed a change to the docs here: https://github.com/k8snetworkplumbingwg/multus-cni/pull/1115
Environment:
- Kubernetes version (use kubectl version):
- NetworkAttachment info (use kubectl get net-attach-def -o yaml):
- Target pod yaml info (use kubectl get pod <podname> -o yaml):