Mirantis / virtlet

Kubernetes CRI implementation for running VM workloads
Apache License 2.0
739 stars 128 forks

tapfdmanager socket connection problem #887

Closed rbauduin closed 5 years ago

rbauduin commented 5 years ago

When running Virtlet with Kubernetes (v1.14 and v1.15; other versions not tested) and Calico, we encounter a problem starting a VM. At the RunPodSandbox step, it seems that the dial to the unix socket /var/lib/virtlet/tapfdmanager.sock fails to connect. This results in the following error:

Jun 20 11:16:22 kube-node.novalocal criproxy[27960]: I0620 11:16:22.786353 27960 proxy.go:110] FAIL: /runtime.v1alpha2.RuntimeService/RunPodSandbox(): "/run/virtlet.sock": rpc error: code = 2 desc = Error adding pod cirros-vm (fc36edbe-256d-4d7b-98fe-d7661ef90917) to CNI network: not connected

We have confirmed the unix socket is present at the correct path. Starting the example nginx container works, and virtletctl validation is fine:

/usr/local/bin/virtletctl validate
Nodes with Virtlet: kube-node.novalocal
Creating syscheck pod on the node "kube-node.novalocal"
SysCheck pods on all the Virtlet nodes are running
Validation successful.

rbauduin commented 5 years ago

It seems the dial to the unix socket started working this morning. We haven't touched the setup since I posted this issue, so we don't know what might have caused the failure. We'll report back if it occurs again (it did not occur in a new setup built from scratch).