squat / kilo

Kilo is a multi-cloud network overlay built on WireGuard and designed for Kubernetes (k8s + wg = kg)
https://kilo.squat.ai
Apache License 2.0

microk8s compatibility #53

Open carlosrmendes opened 4 years ago

carlosrmendes commented 4 years ago

Is microk8s compatibility on the roadmap? Microk8s uses flannel as its CNI by default; I've tested Kilo on it but with no success. The Kilo pods start well and no errors are printed in the logs (w/ log-level=all), but the sudo wg command doesn't show anything, no public key nor endpoint... and on the node the kilo.squat.ai/wireguard-ip annotation shows no IP.

Can you please take a look at microk8s? I think it is interesting now that the microk8s stable version has the clustering option. Thanks in advance.

squat commented 4 years ago

I would love for kilo to work on microk8s :) the fact that no error is printed but the node is never fully configured suggests to me that kilo cannot find correct IPs assigned to the node's interfaces so the node is never ready as far as kilo is concerned. Can you post the annotations on the node? I don't have an Ubuntu box to test microk8s on but I'll see if I can spin one up this week and test myself :)
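
For anyone debugging the same symptom, a minimal sketch of how that information could be collected, assuming a standard kubectl context; "NODE" is a placeholder for the node name:

# Dump the kilo.squat.ai annotations that should appear once the node is configured:
kubectl get node NODE -o jsonpath='{.metadata.annotations}' | tr ',' '\n' | grep kilo.squat.ai

# WireGuard state on the host; a working setup lists a public key, a listening port, and peers:
sudo wg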

carlosrmendes commented 4 years ago

annotations on the node in region "cloud" (first) and another node (second), on another NAT'ed network: (screenshot)

wg and ip a output on both nodes (left: cloud; right: NAT'ed): (screenshot)

kilo logs on both nodes (left: cloud; right: NAT'ed): (screenshot)

What does the received incomplete node message in the logs mean?

squat commented 4 years ago

hi @carlosrmendes thanks a lot for that info, it's super helpful :)

The received incomplete node message means that when the kilo agent listed the nodes from the API, a node was missing some data and so it was not considered ready. The completeness check (https://github.com/squat/kilo/blob/master/pkg/mesh/mesh.go#L90-L94) looks for the following data:

  • endpoint
  • internal IP
  • public key
  • recent heartbeat
  • pod subnet

From your screenshots, it seems like the first four are definitely present in the annotations. The pod subnet is taken from the node's spec. Can you verify that all of the nodes have been allocated a pod subnet? Please share the output of:

kubectl get nodes -o=jsonpath="{.items[*]['spec.podCIDR']}"

carlosrmendes commented 4 years ago

yes, that is the problem... the nodes on microk8s don't have the podCIDR in their spec... :/

carlosrmendes commented 4 years ago

I think the podCIDR of the node is present in the /var/snap/microk8s/common/run/flannel/subnet.env file:

FLANNEL_NETWORK=10.1.0.0/16
FLANNEL_SUBNET=10.1.8.1/24
FLANNEL_MTU=8951
FLANNEL_IPMASQ=false

carlosrmendes commented 4 years ago

I manually set the podCIDR on the nodes' spec and kilo starts working. :) Is it possible to set the pod subnet as a kg argument or env variable?
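
A minimal sketch of one way that manual step could be scripted, assuming the subnet.env file shown above, a /24 per-node subnet, node names that match hostnames, and that node.spec.podCIDR is still empty (the field can only be set once):

# Read flannel's per-node subnet, e.g. FLANNEL_SUBNET=10.1.8.1/24:
source /var/snap/microk8s/common/run/flannel/subnet.env

# Turn the gateway address 10.1.8.1/24 into the network 10.1.8.0/24 and patch the node spec
# (on microk8s, kubectl may be microk8s.kubectl):
kubectl patch node "$(hostname)" -p "{\"spec\":{\"podCIDR\":\"${FLANNEL_SUBNET%.*}.0/24\"}}"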

But it is not getting the correct persistent-keepalive value... (screenshot)

carlosrmendes commented 4 years ago

@squat take a look at: https://github.com/kubernetes/kubernetes/issues/57130

squat commented 4 years ago

@carlosrmendes thanks for posting about the persistent-keepalive! That was indeed a bug. It's now fixed in master: https://github.com/squat/kilo/commit/e4829832c509f13f45f13f5bb0ef2131394b49bf

carlosrmendes commented 4 years ago

Perfect! Thanks @squat 👌 And about the pod subnet discovery? Can it only work by reading the .spec.podCIDR from the node?

squat commented 4 years ago

Yes, K8s still supports this today. I'm not sure what microk8s is doing, but pod CIDR allocation is turned on in the controller-manager by default on most Kubernetes distributions. Take a look at the controller-manager flags to enable this on microk8s: --allocate-node-cidrs https://kubernetes.io/docs/reference/command-line-tools-reference/kube-controller-manager/#options
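
A hedged sketch of what enabling that could look like on a snap-installed microk8s, assuming service flags live under /var/snap/microk8s/current/args/ as in the configuring-services doc linked further down; the cluster CIDR below simply reuses the FLANNEL_NETWORK value from earlier:

# Let the controller-manager allocate per-node pod CIDRs from the flannel network:
echo '--allocate-node-cidrs=true' | sudo tee -a /var/snap/microk8s/current/args/kube-controller-manager
echo '--cluster-cidr=10.1.0.0/16' | sudo tee -a /var/snap/microk8s/current/args/kube-controller-manager

# Restart microk8s so the flags take effect, then check the allocations:
sudo snap restart microk8s
kubectl get nodes -o=jsonpath="{.items[*]['spec.podCIDR']}"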

carlosrmendes commented 4 years ago

I already tested that flag on the controller-manager and yes, the podCIDR was set, but it is different from the flannel SUBNET assigned to the node 😥

squat commented 4 years ago

That is quite weird. By default, Flannel actually requires the node.spec.podCIDR field to be set as well https://github.com/coreos/flannel/blob/master/subnet/kube/kube.go#L233-L235

squat commented 4 years ago

It's possible that microk8s has configured flannel to use etcd as the data store instead of kubernetes, in which case the pod cidr will not be used or taken from the node object but rather saved in etcd. It looks like that configuration info can be found on disk: https://microk8s.io/docs/configuring-services#snapmicrok8sdaemon-flanneld

carlosrmendes commented 4 years ago

Yes, in microk8s flannel uses etcd. I tried with the --kube-subnet-mgr flag, but with that, flannel somehow needs authentication to make calls to the API server, because it is not running as pods (which can use service accounts).
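
One possible, untested way around that authentication problem: flanneld accepts a --kubeconfig-file flag when the kube subnet manager is enabled, so it could be pointed at the admin kubeconfig the microk8s snap keeps on disk. Both paths below are assumptions about the snap layout, not something verified in this thread:

# Switch flanneld from etcd to the kube subnet manager and give it credentials:
echo '--kube-subnet-mgr=true' | sudo tee -a /var/snap/microk8s/current/args/flanneld
echo '--kubeconfig-file=/var/snap/microk8s/current/credentials/client.config' | sudo tee -a /var/snap/microk8s/current/args/flanneld
sudo systemctl restart snap.microk8s.daemon-flanneld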

JulienVdG commented 3 years ago

Hello @carlosrmendes, I don't know if this can help; I remember playing with flannel and kubeadm some time ago.

I needed to give --pod-network-cidr=192.168.128.0/17 to kubeadm init and use the same value in flannel's net-conf.json for the Network key.

  net-conf.json: |
    {
      "Network": "192.168.128.0/17",
      "Backend": {
        "Type": "host-gw"
      }
    }

Maybe you need a similar config in microk8s so that flannel and the controller-manager use the same CIDR.
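
For illustration, the two sides of that pairing, using the same CIDR the comment above mentions:

# kubeadm side: the cluster-wide pod CIDR handed to the controller-manager...
kubeadm init --pod-network-cidr=192.168.128.0/17
# ...must match the "Network" value in flannel's net-conf.json shown above.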

kampsv commented 3 years ago

Was this ever resolved?

facutk commented 2 years ago

Is this still in the roadmap?

SFxLabs commented 1 year ago

Is there a way to adapt the getting started guide offered on the website to microk8s? Does microk8s still use flannel?