Closed Cellebyte closed 2 years ago
@Cellebyte
We try to reproduce the issue following procedure with the environment of coil e2e test. We referenced these documents. https://linkerd.io/2.11/features/cni/#using-the-cli. https://linkerd.io/2.11/getting-started/
$ make start
$ make install-coil
$ kubectl apply -f manifests/default_pool.yaml
$ linkerd install-cni | kubectl apply -f -
$ linkerd install --linkerd-cni-enabled | kubectl apply -f -
$ curl --proto '=https' --tlsv1.2 -sSfL https://run.linkerd.io/emojivoto.yml | kubectl apply -f -
$ kubectl get -n emojivoto deploy -o yaml | linkerd inject - | kubectl apply -f -
$ kubectl rollout restart daemonset -n kube-system coild
$ linkerd check
$ linkerd -n emojivoto check --proxy
How can we confirm “linkerd-cni not working anymore” status?
Coil should check if the cni config got updated and only change the parts which are referenced in the initial configmap. It should also subscribe to file changes and do the check accordingly. It should also imply to be the first plugin in the list and only update the parts from the initial configmap.
We don’t intend to add any more complex function about cni config. The coil-installer can generate from the edited cni file. So you can add the setttings for linkerd to cni file, and let coil-installer generate it. ref: https://github.com/cybozu-go/coil/blob/14c0b64b7ae7f713e62634ca35db156a9666f3ee/docs/setup.md#edit-netconfjson
Also add an option not to use coil-controller if required. And you can generate cni file with linkerd.
@yamatcha I think both ways make sense either adding a disable flag for coil or a disable flag for linkerd 👍 .
@yamatcha I think we need to start also testing again on our side to provide more feedback to you. The general issue was that coil stopped working after restart of the pods when linkerd was installed and that can be fixed by letting one of the two cnis applying the cni config as you already suggested.
@Cellebyte The option of not using a coil controller has not yet been implemented. If you need it, let us know and we will try to implement it.
Describe the bug When I install an additional cni e.g. linkderd-cni. To accomplish more complex Network functionality.
I get race conditions between the 2 cnis. Either both work or the other cni is broken.
Environments
To Reproduce Steps to reproduce the behavior:
Expected behavior Both CNI plugins should coexist. Coil should check if the cni config got updated and only change the parts which are referenced in the initial configmap. It should also subscribe to file changes and do the check accordingly. It should also imply to be the first plugin in the list and only update the parts from the initial configmap.
Additional context For this to be possible there need to be a fundamental change in how the cni configuration gets applied on initialization. So that coil is not bricking other cnis.
We currently need it to accomplish cross region encrypted connectivity between clusters and don't want to use the sidecar approach for it so we looked into a cni approach for a servicemesh and we discovered this race condition bug.