networkop / meshnet-cni

a (K8s) CNI plugin to create arbitrary virtual network topologies
BSD 3-Clause "New" or "Revised" License

Flannel as default CNI: 00-meshnet conf file has empty delegate #8

Closed tahir24434 closed 5 years ago

tahir24434 commented 5 years ago

The plugin does not create the correct 00-meshnet.conf file on the worker and master nodes of an EC2-based cluster that uses Flannel as its CNI (deployed with kops). It generates the file below:

{ "cniVersion": "0.2.0", "name": "meshnet_network", "type": "meshnet", "delegate": {} }

The expected file is something like the following:

{ "cniVersion": "0.2.0", "name": "meshnet_network", "type": "meshnet", "delegate": { "name": "cbr0", "type": "flannel", "forceAddress": true, "isDefaultGateway": true, "hairpinMode": true } }

networkop commented 5 years ago

This must be caused by some bug in the entrypoint script somewhere here. There's some jq magic involved in getting the current CNI plugin config and inserting it into meshnet's delegate field, which is where it probably fails. Can you paste in the output of cat /etc/cni/net.d/* from any one of your nodes?

tahir24434 commented 5 years ago

00-meshnet.conf:
{ "cniVersion": "0.2.0", "name": "meshnet_network", "type": "meshnet", "delegate": { "name": "cbr0", "type": "flannel", "forceAddress": true, "isDefaultGateway": true, "hairpinMode": true } }

10-flannel.conf:
{ "name": "cbr0", "type": "flannel", "delegate": { "forceAddress": true, "isDefaultGateway": true, "hairpinMode": true } }

meshnet.conf:
{ "cniVersion": "0.2.0", "name": "meshnet_network", "type": "meshnet", "delegate": { "name": "dind0", "bridge": "dind0", "type": "bridge", "isDefaultGateway": true, "ipMasq": true, "ipam": { "type": "host-local", "subnet": "10.244.1.0/24", "gateway": "10.244.1.1" } } }

Please note that the original 00-meshnet.conf was as follows:

00-meshnet.conf:
{ "cniVersion": "0.2.0", "name": "meshnet_network", "type": "meshnet", "delegate": {} }

tahir24434 commented 5 years ago

The issue, in the case of Flannel, is that its conf file does not have a "plugins" key, so the lookup in the command below always returns empty:

jq -s '.[1].delegate = (.[0].plugins[0])' /etc/cni/net.d/$existing /etc/cni/net.d/meshnet.conf | jq .[1] > /etc/cni/net.d/00-meshnet.conf

Replacing the above with the following fixes the problem with Flannel:

sudo jq -s '.[1].delegate = (.[0])' /etc/cni/net.d/$existing /etc/cni/net.d/meshnet.conf | jq .[1]

I don't have configuration files for the rest of the plugins (weave|bridge|calico|contiv|cilium|cni|kindnet), so I am not sure whether the above solution will work across the board. If you have those conf files handy, please paste them here. I'll test them and push the code.
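To illustrate the diagnosis above, here is a minimal Python sketch of what the two jq lookups select from Flannel's config. It is only an analogue of the jq filters, not the entrypoint script itself; the helper names `first_plugin` and `whole_config` are hypothetical and exist only for this example.

```python
import json

# 10-flannel.conf as pasted earlier in this thread: a single-plugin config
# with no top-level "plugins" list.
FLANNEL_CONF = json.loads("""
{ "name": "cbr0", "type": "flannel",
  "delegate": { "forceAddress": true, "isDefaultGateway": true, "hairpinMode": true } }
""")


def first_plugin(conf):
    """Rough analogue of jq's `.plugins[0]`: None when the key is absent."""
    plugins = conf.get("plugins") or []
    return plugins[0] if plugins else None


def whole_config(conf):
    """Rough analogue of jq's `.`: the existing config, unchanged."""
    return conf


# The original lookup finds nothing to delegate to in Flannel's file.
print(first_plugin(FLANNEL_CONF))
# The replacement lookup hands over the Flannel config itself.
print(whole_config(FLANNEL_CONF)["type"])
```

The first lookup has nothing to return because the key it relies on is missing, which matches the empty delegate reported in this issue.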

networkop commented 5 years ago

Yeah, I think the problem here is that some plugins define a list of plugins in their CNI conf, and only in that case will the file have this plugins keyword. I'll try to come up with some jq workaround for now, but will need to fix this properly in the future.
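One way to handle both shapes can be sketched in Python as follows. This is an illustration of the idea under stated assumptions, not the fix that ended up in meshnet: `pick_delegate` and `with_delegate` are hypothetical names, and the list-form config below is invented for the example.

```python
import json


def pick_delegate(existing):
    """Return the config meshnet should delegate to.

    A network configuration list carries its plugins under "plugins";
    a single-plugin config is itself the plugin config.
    """
    plugins = existing.get("plugins")
    if isinstance(plugins, list) and plugins:
        return plugins[0]
    return existing


def with_delegate(meshnet, existing):
    """Return a copy of the meshnet config with "delegate" filled in."""
    merged = dict(meshnet)
    merged["delegate"] = pick_delegate(existing)
    return merged


MESHNET = {"cniVersion": "0.2.0", "name": "meshnet_network",
           "type": "meshnet", "delegate": {}}
# Single-plugin form, as in the 10-flannel.conf pasted above.
SINGLE = {"name": "cbr0", "type": "flannel",
          "delegate": {"isDefaultGateway": True}}
# Hypothetical list form, invented here for illustration only.
LISTED = {"name": "cbr0", "plugins": [{"type": "flannel"}, {"type": "portmap"}]}

for existing in (SINGLE, LISTED):
    print(json.dumps(with_delegate(MESHNET, existing)))
```

Checking for the list explicitly, rather than always indexing into it, keeps the single-plugin case from silently producing an empty delegate.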

networkop commented 5 years ago

Similar bug in Multus for future reference https://github.com/intel/multus-cni/issues/29

networkop commented 5 years ago

CNI spec reference https://github.com/containernetworking/cni/blob/master/SPEC.md#network-configuration-lists

networkop commented 5 years ago

@tahir24434 I've implemented the fix in the meshnet:0.2.0 Docker image. You can pull it from Docker Hub now.

tahir24434 commented 5 years ago

Ah thanks. I just saw your message now.

networkop commented 5 years ago

That's cool. I like your changes better. I'll merge them into master. The master branch should build the meshnet:latest container image.

tahir24434 commented 5 years ago

Thanks for the quick resolution.