google / gvisor-containerd-shim

containerd shim for gVisor
https://gvisor.dev
Apache License 2.0

RuntimeHandler "runsc" not supported in Kubernetes #46

Closed: alexcpn closed this issue 4 years ago

alexcpn commented 4 years ago

Installed gvisor/runsc, containerd, gvisor-containerd-shim and containerd-shim-runsc-v1 in all nodes of kubernetes cluster.

In all nodes I am able to start the nginx container in the sandbox as is documented in [1] with runsc.

Created the RuntimeClass gvisor using runsc as described in [1], and created a Pod with the gVisor RuntimeClass [1].
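For reference, the manifests the quickstart [1] describes look roughly like this (node.k8s.io/v1beta1 was the current RuntimeClass API as of Kubernetes 1.17):

apiVersion: node.k8s.io/v1beta1
kind: RuntimeClass
metadata:
  name: gvisor
handler: runsc
---
apiVersion: v1
kind: Pod
metadata:
  name: nginx-gvisor
spec:
  runtimeClassName: gvisor
  containers:
  - name: nginx
    image: nginx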

However, the pod is not able to start; this is the event:

3m47s Warning FailedCreatePodSandBox pod/nginx-gvisor Failed to create pod sandbox: rpc error: code = Unknown desc = RuntimeHandler "runsc" not supported

Kubernetes version below; the cluster was installed via kubeadm.

Client Version: version.Info{Major:"1", Minor:"17", GitVersion:"v1.17.0", GitCommit:"70132b0f130acc0bed193d9ba59dd186f0e634cf", GitTreeState:"clean", BuildDate:"2019-12-07T21:20:10Z", GoVersion:"go1.13.4", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"17", GitVersion:"v1.17.0", GitCommit:"70132b0f130acc0bed193d9ba59dd186f0e634cf", GitTreeState:"clean", BuildDate:"2019-12-07T21:12:17Z", GoVersion:"go1.13.4", Compiler:"gc", Platform:"linux/amd64"}

[1] https://github.com/google/gvisor-containerd-shim/blob/master/docs/runtime-handler-shim-v2-quickstart.md

jonbudd commented 4 years ago

@alexcpn Just to confirm, did you update your containerd config.toml to include the entry for runsc, and restart containerd? At a glance, it seems like this file may not be properly configured, since that file is containerd's source of truth for which runtime handlers are valid.
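For comparison, the shim v2 quickstart [1] has you add an entry along these lines to /etc/containerd/config.toml and then restart containerd; the exact section path can differ between containerd versions:

[plugins.cri.containerd.runtimes.runsc]
  runtime_type = "io.containerd.runsc.v1"

sudo systemctl restart containerd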

ianlewis commented 4 years ago

We also had a similar issue come up in the comments on #32, though I don't think there was a resolution.

If you were able to create an nginx sandbox using crictl then it should work. Please verify that the container is running in a sandbox:

sudo crictl exec ${CONTAINER_ID} dmesg | grep -i gvisor

Also, please verify that your Kubernetes cluster is using the containerd that you've set up to use runsc. I'm not sure offhand what kubeadm uses by default, but you may need to pass the kubelet a couple of parameters, like --container-runtime=remote and --container-runtime-endpoint=unix:///var/run/containerd/containerd.sock, in order for it to use your containerd.

See: https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/kubelet-integration/#providing-instance-specific-configuration-details
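On kubeadm-installed nodes, one place to put those flags is the KUBELET_EXTRA_ARGS environment file described on that page; a minimal sketch (the file is /etc/default/kubelet on DEB-based systems and /etc/sysconfig/kubelet on RPM-based ones):

KUBELET_EXTRA_ARGS=--container-runtime=remote --container-runtime-endpoint=unix:///var/run/containerd/containerd.sock

Then restart the kubelet with sudo systemctl restart kubelet.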

ianlewis commented 4 years ago

I looked around a bit more, and it looks like if docker.sock exists then Docker takes precedence: https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/install-kubeadm/#installing-runtime

The container runtime options I wrote above should fix it, but just to be sure, could you check if Docker is installed? Specifically, does the /var/run/docker.sock file exist?
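A quick way to check, assuming a standard Docker install:

ls -l /var/run/docker.sock
systemctl status docker --no-pager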

ianlewis commented 4 years ago

My bad, those options above are the options for the kubelet. For kubeadm you want to use --cri-socket /var/run/containerd/containerd.sock
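For example, when bringing up a node from scratch (placeholder values; both kubeadm init and kubeadm join accept the flag):

kubeadm init --cri-socket /var/run/containerd/containerd.sock
kubeadm join <control-plane-host>:<port> --token <token> --discovery-token-ca-cert-hash <hash> --cri-socket /var/run/containerd/containerd.sock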

alexcpn commented 4 years ago

@alexcpn Just to confirm, did you update your containerd config.toml to include the entry for runsc, and restart containerd? At a glance, it seems like this file may not be properly configured, since that file is containerd's source of truth for which runtime handlers are valid.

Yes, this is done; I have followed all the documentation and also tested via crictl.

[root@azuretest-2 ~]# systemctl status kubelet
● kubelet.service - kubelet: The Kubernetes Node Agent
   Loaded: loaded (/usr/lib/systemd/system/kubelet.service; enabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/kubelet.service.d
           └─0-containerd.conf
        /usr/lib/systemd/system/kubelet.service.d
           └─10-kubeadm.conf
   Active: active (running) since Thu 2020-01-30 14:42:37 IST; 2min 12s ago
     Docs: https://kubernetes.io/docs/
 Main PID: 3058 (kubelet)
   CGroup: /system.slice/kubelet.service
           └─3058 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib...

[root@azuretest-2 ~]# cat  /proc/3058/net/unix 
Num       RefCount Protocol Flags    Type St Inode Path
ffff8801f5d19000: 00000002 00000000 00010000 0001 01 19515 private/verify
...
/var/run/docker/libcontainerd/docker-containerd.sock
/run/containerd/containerd.sock.ttrpc
....
ffff8800ba161800: 00000002 00000000 00010000 0001 01  8997 /run/systemd/journal/stdout
ffff880199435800: 00000002 00000000 00010000 0001 01 110119 /run/containerd/containerd.sock

alexcpn commented 4 years ago

My bad, those options above are the options for the kubelet. For kubeadm you want to use --cri-socket /var/run/containerd/containerd.sock

My use case was to enable it in an already existing Kubernetes cluster. Yes, Docker is installed. However, I disabled the docker service and started the kubelet. Are you suggesting that I need to create the cluster afresh with kubeadm init ...? We already have a cluster in production and wanted to run trusted workloads there along with untrusted ones, i.e. use both runc and runsc via RuntimeClass per Pod.

Update / correction:

Yes, Docker is installed. However, I disabled the docker service and started the kubelet.

I had not disabled it; I had only stopped it. Now I have disabled it, and the kubelet is not starting up:

5.758328     749 client.go:75] Connecting to docker on unix:///var/run/docker.sock
Jan 31 11:09:25 azuretest-2.novalocal kubelet[749]: I0131 11:09:25.767230     749 client.go:104] Start docker client with request timeout=2m0s
Jan 31 11:09:25 azuretest-2.novalocal kubelet[749]: F0131 11:09:25.793103     749 server.go:273] failed to run Kubelet: failed to create kubelet: failed to get
Jan 31 11:09:25 azuretest-2.novalocal systemd[1]: kubelet.service: main process exited, code=exited, status=255/n/a

I guess earlier it was starting the service; I will correct this and revert.
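(For the record, the stop/disable distinction as a minimal sketch: stopping ends the running service now, while disabling only keeps it from starting again at boot, so both are usually wanted here.)

sudo systemctl stop docker      # stops the currently running service
sudo systemctl disable docker   # prevents it from starting at the next boot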

ianlewis commented 4 years ago

Yeah, the kubelet is probably connecting to docker via docker's socket file.

Assuming you installed kubeadm via the apt or rpm package, try editing the /var/lib/kubelet/kubeadm-flags.env file. That is where kubeadm adds its own options for the kubelet. I'm guessing you don't have the --container-runtime or --container-runtime-endpoint flags in there, or they're pointing to the docker.sock file. Add them so the file looks something like the following, and then restart the kubelet.

sudo cat /var/lib/kubelet/kubeadm-flags.env
KUBELET_KUBEADM_ARGS="--container-runtime=remote --container-runtime-endpoint=/var/run/containerd/containerd.sock --resolv-conf=/run/systemd/resolve/resolv.conf"
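Then restart the kubelet so it picks up the new flags:

sudo systemctl restart kubelet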

After that I think it should work.

ianlewis commented 4 years ago

@alexcpn Oh, and before you do that I recommend making sure there aren't any running workloads on the node: drain it using kubectl drain and make sure no pods (from StatefulSets etc.) are still running on it.
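A minimal sketch of that (substitute your own node name):

kubectl drain <node-name> --ignore-daemonsets
# ...apply the kubelet changes, then allow scheduling again:
kubectl uncordon <node-name>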

Otherwise I think Kubernetes will get confused about the containers running on the node.

alexcpn commented 4 years ago

Working! Thanks for the help.

There was first an error on my CentOS box when starting the kubelet:

Warning FailedCreatePodSandBox pod/nginx-gvisor Failed to create pod sandbox: open /run/systemd/resolve/resolv.conf: no such file or directory

Found a related bug? I added a symlink to /etc/resolv.conf, and then the kubelet started with containerd.
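A sketch of that workaround, assuming the path is missing because systemd-resolved is not running (typical on CentOS):

sudo mkdir -p /run/systemd/resolve
sudo ln -s /etc/resolv.conf /run/systemd/resolve/resolv.conf

As noted further down, the --resolv-conf flag could also simply be left at its previous value.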

Deployed the sample nginx-gvisor from the quickstart, found the container ID, and checked as per the runtime-handler shim v2 quickstart guide; it does seem to run in runsc:

[root@azuretest-2 ~]# CONTAINER_ID=a56c197fd2a6fcba9fc02d75f3552a81b4ed9af01b575ab7e2a8fbbbb6b12e85
[root@azuretest-2 ~]# sudo crictl exec ${CONTAINER_ID} dmesg | grep -i gvisor
[ 0.000000] Starting gVisor...
[root@azuretest-2 ~]# sudo crictl exec ${CONTAINER_ID} dmesg
[ 0.000000] Starting gVisor...
[ 0.490852] Letting the watchdogs out...
[ 0.601648] Constructing home...
[ 0.706648] Searching for socket adapter...
[ 0.935407] Feeding the init monster...
[ 1.072809] Creating process schedule...
[ 1.115258] Synthesizing system calls...
[ 1.474947] Daemonizing children...
[ 1.551481] Moving files to filing cabinet...
[ 1.631282] Mounting deweydecimalfs...
[ 2.120791] Reading process obituaries...
[ 2.613084] Ready!

Also verified via systemd-cgls that it is indeed running under runsc, while the other containers run under runc (screenshot attached).
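For anyone repeating that check, a minimal sketch; the exact cgroup layout varies by distro, but the runsc shim and sandbox processes should show up by name:

sudo systemd-cgls --no-pager | grep -B 2 -A 4 runsc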

I guess if the document could be updated with the steps to enable containerd with the kubelet, it would be easier to follow.

ianlewis commented 4 years ago

Oh, right. The --resolv-conf=/run/systemd/resolve/resolv.conf option in /var/lib/kubelet/kubeadm-flags.env that I wrote above wasn't important; it was just what I had when I was testing. You could leave it as whatever it was set to before.

I'll add some info to the doc later for folks who are using kubeadm. I think it's pretty common for folks to install a cluster using kubeadm with Docker installed, and that makes it difficult to use RuntimeClass and CRI runtime handlers.

Anyway, glad that you got it working!