kubevirt / kubevirt.github.io

KubeVirt website repo, documentation at https://kubevirt.io/user-guide/
https://kubevirt.io

Critical network error: Couldn't configure ip nat rules #845

Closed · crasu closed this 1 year ago

crasu commented 2 years ago

Is this a BUG REPORT or FEATURE REQUEST?:

/kind bug

What happened: I tried to set up KubeVirt with minikube by following this tutorial on Ubuntu (https://kubevirt.io/quickstart_minikube/). This results in the virt-launcher pod going into CrashLoopBackOff:

$ minikube version
minikube version: v1.25.2
commit: 362d5fdc0a3dbee389b3d3f1034e8023e72bd3a7
$ minikube start --driver=kvm2 --kubernetes-version='v1.20.2' --cni=flannel --memory=8gb --cpus=8
$ export VERSION=v0.51.0 # newer versions fail to start the operator with k8s v1.20.2, but the newest k8s version + newest KubeVirt version fails as well
$ kubectl create -f https://github.com/kubevirt/kubevirt/releases/download/${VERSION}/kubevirt-operator.yaml
$ kubectl create -f https://github.com/kubevirt/kubevirt/releases/download/${VERSION}/kubevirt-cr.yaml
--- wait until everything is up and running ---
$ kubectl apply -f https://kubevirt.io/labs/manifests/vm.yaml
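
To start the VM and watch it fail, I used roughly the following (a sketch; virtctl availability and the virt-handler pod label assume a default install):

$ virtctl start testvm
$ kubectl get vmis                                      # testvm goes to Failed
$ kubectl logs -n kubevirt -l kubevirt.io=virt-handler  # errors shown below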

The handler stops the VM immediately, and the following error messages show up in the virt-handler log:

{"component":"virt-handler","kind":"","level":"info","msg":"migration is block migration because of cloudinitdisk volume","name":"testvm","namespace":"default","pos":"vm.go:2215","timestamp":"2022-05-31T16:03:33.110140Z","uid":"c4b5ff00-6881-446c-9d25-a6106538f82a"}
{"component":"virt-handler","level":"error","msg":"virt-launcher crashed due to a network error. Updating VMI testvm status to Failed","pos":"vm.go:1126","timestamp":"2022-05-31T16:03:33.122787Z"}
{"component":"virt-handler","level":"info","msg":"re-enqueuing VirtualMachineInstance default/testvm","pos":"vm.go:1344","reason":"failed to configure vmi network: setup failed, err: failed plugging phase1 at nic 'eth0': Critical network error: Couldn't configure ip nat rules","timestamp":"2022-05-31T16:03:33.139158Z"}
{"component":"virt-handler","kind":"","level":"info","msg":"VMI is in phase: Failed | Domain does not exist","name":"testvm","namespace":"default","pos":"vm.go:1553","timestamp":"2022-05-31T16:03:33.143312Z","uid":"c4b5ff00-6881-446c-9d25-a6106538f82a"}

What you expected to happen: The VM starts.


shannonmitchell commented 2 years ago

Seeing the exact same thing using minikube v1.26.0 on Fedora 36. I'm running KubeVirt v0.55.0 via the minikube addon. Looking through the virt-handler logs showed some issues with the handler finding /var/run/libvirt/libvirt-* and /dev/mem, along with the same network issues you mention.

Looks like a possible permissions issue. It works fine when running as root via 'minikube start --force', so that is my current workaround, though it is not ideal. Adding the user to the libvirt and kvm groups doesn't seem to help either.
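
For reference, the workaround in command form (a sketch; the extra flags just mirror the reproduction above and may need adjusting for your setup):

$ # run minikube as root; --force skips the non-root safety check
$ sudo minikube start --force --driver=kvm2 --cni=flannel --memory=8gb --cpus=8
$ # what did NOT help here: adding the user to the libvirt and kvm groups
$ sudo usermod -aG libvirt,kvm $USER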

stu-gott commented 1 year ago

/cc @phoracek

maiqueb commented 1 year ago

You're using an unsupported K8s version: v1.20.2

The KubeVirt project only supports the previous three Kubernetes releases, currently 1.22 / 1.23 / 1.24.

I recommend you retry with a newer version.

Furthermore, I find it suspicious that the error points specifically at configuring iptables rules, not at creating the tap device / bridges, which are also privileged operations and occur before any attempt to write the NAT rules.

crasu commented 1 year ago

At the time, it failed with the newest version as well (see the original report). I just tried 1.24; it fails with the same message.

minikube start --kubernetes-version='v1.24.0' --driver=kvm2 --cni=flannel --memory=8gb --cpus=8

lucasgonze commented 1 year ago

What are your thoughts on the iptables clue, @maiqueb? What does it imply to you?

maiqueb commented 1 year ago

> What are your thoughts on the iptables clue, @maiqueb? What does it imply to you?

We've seen issues in the past with nodes having old kernel versions. What is the kernel version of your host?

I am thinking along the lines of this open issue, where iptables (which is deprecated in KubeVirt) was being used instead of nftables. In this issue there's a twist: it seems the handler can use neither nftables nor iptables, which throws the Couldn't configure ip nat rules error.

... having said that, I do not know why iptables / nftables are failing for lack of permissions.
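
If you get a chance, a quick check along these lines could narrow down which backend is failing inside the minikube node (a sketch; whether the nft and iptables binaries exist in the node image is an assumption):

$ uname -r   # host kernel version
$ minikube ssh -- "sudo nft list ruleset > /dev/null && echo nftables ok || echo nftables unavailable"
$ minikube ssh -- "sudo iptables -t nat -L > /dev/null && echo iptables ok || echo iptables unavailable"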

I'll try to reproduce this when I can spare some time.

kubevirt-bot commented 1 year ago

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

/lifecycle stale

kubevirt-bot commented 1 year ago

Stale issues rot after 30d of inactivity. Mark the issue as fresh with /remove-lifecycle rotten. Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

/lifecycle rotten

kubevirt-bot commented 1 year ago

Rotten issues close after 30d of inactivity. Reopen the issue with /reopen. Mark the issue as fresh with /remove-lifecycle rotten.

/close

kubevirt-bot commented 1 year ago

@kubevirt-bot: Closing this issue.

In response to [this](https://github.com/kubevirt/kubevirt.github.io/issues/845#issuecomment-1381687083):

> Rotten issues close after 30d of inactivity.
> Reopen the issue with `/reopen`.
> Mark the issue as fresh with `/remove-lifecycle rotten`.
>
> /close

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.