kubevirt / kubevirt.github.io

KubeVirt website repo, documentation at https://kubevirt.io/user-guide/
https://kubevirt.io

Unable to schedule VM on kind #859

Closed: KIRY4 closed this issue 11 months ago

KIRY4 commented 1 year ago

Is this a BUG REPORT or FEATURE REQUEST?:

Uncomment only one, leave it on its own line:

/kind bug

What happened: Hello! I went through the following steps: https://kubevirt.io/quickstart_kind/. I successfully installed kind, the KubeVirt operator, and the CRDs. Now I'm trying to deploy a VM following this manual: https://kubevirt.io/labs/kubernetes/lab1, and the VM is stuck in the following state.

k get vm -n kubevirt 
NAME     AGE     STATUS               READY
testvm   8m44s   ErrorUnschedulable   False

k get vmi -n kubevirt
NAME     AGE    PHASE        IP    NODENAME   READY
testvm   8m4s   Scheduling                    False

k get vmis -n kubevirt
NAME     AGE    PHASE        IP    NODENAME   READY
testvm   8m7s   Scheduling                    False
k describe vm testvm -n kubevirt

    Message:               0/1 nodes are available: 1 Insufficient devices.kubevirt.io/kvm. preemption: 0/1 nodes are available: 1 No preemption victims found for incoming pod.

Events:
  Type    Reason            Age   From                       Message
  ----    ------            ----  ----                       -------
  Normal  SuccessfulCreate  10m   virtualmachine-controller  Started the virtual machine by creating the new virtual machine instance testvm

What you expected to happen: The VM should be scheduled successfully.

Anything else we need to know?:

URL where the problem can be found ... If the issue is with a lab, please provide information about your environment, platform, versions, ...

Environment: MacBook Pro (2019), macOS Monterey 12.3.1, kind v0.14.0, kubectl v1.24.3, kind/k8s v1.24.0
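
A quick way to see whether the node actually advertises the devices.kubevirt.io/kvm resource the scheduler is asking for (a rough diagnostic sketch):

# A healthy node lists nonzero Allocatable values for devices.kubevirt.io/kvm,
# devices.kubevirt.io/tun and devices.kubevirt.io/vhost-net; if the lines are
# missing, virt-handler's device plugins never registered with the kubelet.
$ kubectl describe nodes | grep -i devices.kubevirt.io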

cwilkers commented 1 year ago

Did you enable nested virtualization per the instructions in the lab?

cwilkers commented 1 year ago

So it turns out the instructions for enabling nested virtualization were deprecated, and no longer work as intended. Check https://github.com/kubevirt/kubevirt.github.io/pull/869 for the latest instructions.
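
For anyone else landing here: on a bare-metal Intel Linux host, checking and enabling nested virtualization generally looks something like the sketch below (use kvm_amd on AMD hosts; the authoritative steps are in the PR linked above):

# Is nested virtualization already enabled?
$ cat /sys/module/kvm_intel/parameters/nested
Y

# If it reports N (or 0), enable it persistently and reload the module
$ echo "options kvm_intel nested=1" | sudo tee /etc/modprobe.d/kvm-nested.conf
$ sudo modprobe -r kvm_intel && sudo modprobe kvm_intel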

angelalukic commented 1 year ago

I am also encountering a similar issue on both the kind and minikube quickstarts, and I'm pretty sure I have nested virtualisation enabled, as per the command output below.

$ cat /sys/module/kvm_intel/parameters/nested
Y

Running kubectl -n kubevirt patch kubevirt kubevirt --type=merge --patch '{"spec":{"configuration":{"developerConfiguration":{"useEmulation":true}}}}' does not fix the issue.
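
In case it helps others, the setting can be sanity-checked with something like this (a sketch, assuming the patch command above with its closing quote):

$ kubectl -n kubevirt get kubevirt kubevirt -o jsonpath='{.spec.configuration.developerConfiguration.useEmulation}'
true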

$ kubectl get vms
NAME     AGE   STATUS               READY
testvm   71m   ErrorUnschedulable   False

$ kubectl get vmis
NAME     AGE     PHASE        IP    NODENAME   READY
testvm   4m26s   Scheduling                    False
Status:
  Conditions:
    Last Probe Time:       2022-09-21T15:29:20Z
    Last Transition Time:  2022-09-21T15:29:20Z
    Message:               Guest VM is not reported as running
    Reason:                GuestNotRunning
    Status:                False
    Type:                  Ready
    Last Probe Time:       <nil>
    Last Transition Time:  2022-09-21T15:29:20Z
    Message:               0/1 nodes are available: 1 Insufficient devices.kubevirt.io/tun, 1 node(s) didn't match Pod's node affinity/selector. preemption: 0/1 nodes are available: 1 Preemption is not helpful for scheduling.
    Reason:                Unschedulable
    Status:                False
    Type:                  PodScheduled
  Created:                 true
  Printable Status:        ErrorUnschedulable
cwilkers commented 1 year ago

I am getting the same error trying the latest minikube and kubevirt versions. My minikube is using the kvm plugin, and even with the useEmulation developerConfiguration patched to true, I get a similar failure.

The minikube node "minikube" has the following status.capacity:

    capacity:
      cpu: "2"
      ephemeral-storage: 17784752Ki
      hugepages-2Mi: "0"
      memory: 3814152Ki
      pods: "110"

The virt-handler pod shows the following errors:

{"component":"virt-handler","hostname":"minikube","level":"info","pos":"virt-handler.go:197","timestamp":"2022-10-14T22:32:58.916744Z"}
W1014 22:32:58.918370    5755 client_config.go:617] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
{"component":"virt-handler","level":"info","msg":"set verbosity to 2","pos":"virt-handler.go:470","timestamp":"2022-10-14T22:32:58.932468Z"}
{"component":"virt-handler","level":"info","msg":"setting rate limiter to 5 QPS and 10 Burst","pos":"virt-handler.go:479","timestamp":"2022-10-14T22:32:58.938258Z"}
{"component":"virt-handler","level":"info","msg":"set verbosity to 2","pos":"virt-handler.go:470","timestamp":"2022-10-14T22:32:58.939378Z"}
{"component":"virt-handler","level":"warning","msg":"host model mode is expected to contain only one model","pos":"cpu_plugin.go:98","timestamp":"2022-10-14T22:32:58.944054Z"}
{"component":"virt-handler","level":"info","msg":"Starting domain stats collector: node name=minikube","pos":"prometheus.go:531","timestamp":"2022-10-14T22:32:58.945526Z"}
{"component":"virt-handler","level":"info","msg":"STARTING informer vmiInformer-sources","pos":"virtinformers.go:330","timestamp":"2022-10-14T22:32:58.945596Z"}
{"component":"virt-handler","level":"info","msg":"STARTING informer vmiInformer-targets","pos":"virtinformers.go:330","timestamp":"2022-10-14T22:32:58.945616Z"}
{"component":"virt-handler","level":"info","msg":"STARTING informer extensionsKubeVirtCAConfigMapInformer","pos":"virtinformers.go:330","timestamp":"2022-10-14T22:32:58.945624Z"}
{"component":"virt-handler","level":"info","msg":"STARTING informer CRDInformer","pos":"virtinformers.go:330","timestamp":"2022-10-14T22:32:58.945633Z"}
{"component":"virt-handler","level":"info","msg":"STARTING informer kubeVirtInformer","pos":"virtinformers.go:330","timestamp":"2022-10-14T22:32:58.945643Z"}
{"component":"virt-handler","level":"info","msg":"certificate with common name 'kubevirt.io:system:node:virt-handler' retrieved.","pos":"cert-manager.go:198","timestamp":"2022-10-14T22:32:58.945971Z"}
{"component":"virt-handler","level":"info","msg":"node-labeller is running","pos":"node_labeller.go:110","timestamp":"2022-10-14T22:32:58.946004Z"}
{"component":"virt-handler","level":"info","msg":"metrics: max concurrent requests=3","pos":"virt-handler.go:494","timestamp":"2022-10-14T22:32:58.946151Z"}
{"component":"virt-handler","level":"info","msg":"certificate with common name 'kubevirt.io:system:client:virt-handler' retrieved.","pos":"cert-manager.go:198","timestamp":"2022-10-14T22:32:58.947170Z"}
{"component":"virt-handler","level":"info","msg":"set verbosity to 2","pos":"virt-handler.go:470","timestamp":"2022-10-14T22:32:58.958817Z"}
{"component":"virt-handler","level":"info","msg":"setting rate limiter to 5 QPS and 10 Burst","pos":"virt-handler.go:479","timestamp":"2022-10-14T22:32:58.958865Z"}
{"component":"virt-handler","level":"info","msg":"Updating cluster config from KubeVirt to resource version '814'","pos":"configuration.go:320","timestamp":"2022-10-14T22:32:58.978027Z"}
{"component":"virt-handler","level":"info","msg":"set verbosity to 2","pos":"virt-handler.go:470","timestamp":"2022-10-14T22:32:58.978066Z"}
{"component":"virt-handler","level":"info","msg":"setting rate limiter to 5 QPS and 10 Burst","pos":"virt-handler.go:479","timestamp":"2022-10-14T22:32:58.978081Z"}
{"component":"virt-handler","level":"info","msg":"Starting virt-handler controller.","pos":"vm.go:1380","timestamp":"2022-10-14T22:32:59.136478Z"}
{"component":"virt-handler","level":"info","msg":"Starting a device plugin for device: kvm","pos":"device_controller.go:57","timestamp":"2022-10-14T22:32:59.136780Z"}
{"component":"virt-handler","level":"info","msg":"Starting a device plugin for device: tun","pos":"device_controller.go:57","timestamp":"2022-10-14T22:32:59.136883Z"}
{"component":"virt-handler","level":"info","msg":"Starting a device plugin for device: vhost-net","pos":"device_controller.go:57","timestamp":"2022-10-14T22:32:59.136951Z"}
{"component":"virt-handler","level":"info","msg":"refreshed device plugins for permitted/forbidden host devices","pos":"device_controller.go:320","timestamp":"2022-10-14T22:32:59.137047Z"}
{"component":"virt-handler","level":"info","msg":"enabled device-plugins for: []","pos":"device_controller.go:321","timestamp":"2022-10-14T22:32:59.137113Z"}
{"component":"virt-handler","level":"info","msg":"disabled device-plugins for: []","pos":"device_controller.go:322","timestamp":"2022-10-14T22:32:59.137181Z"}
{"component":"virt-handler","level":"info","msg":"refreshed device plugins for permitted/forbidden host devices","pos":"device_controller.go:320","timestamp":"2022-10-14T22:32:59.137255Z"}
{"component":"virt-handler","level":"info","msg":"enabled device-plugins for: []","pos":"device_controller.go:321","timestamp":"2022-10-14T22:32:59.137327Z"}
{"component":"virt-handler","level":"info","msg":"disabled device-plugins for: []","pos":"device_controller.go:322","timestamp":"2022-10-14T22:32:59.137397Z"}
{"component":"virt-handler","level":"info","msg":"set verbosity to 2","pos":"virt-handler.go:470","timestamp":"2022-10-14T22:32:59.139196Z"}
{"component":"virt-handler","level":"info","msg":"setting rate limiter to 5 QPS and 10 Burst","pos":"virt-handler.go:479","timestamp":"2022-10-14T22:32:59.140062Z"}
{"component":"virt-handler","level":"info","msg":"set verbosity to 2","pos":"virt-handler.go:470","timestamp":"2022-10-14T22:32:59.141161Z"}
{"component":"virt-handler","level":"info","msg":"setting rate limiter to 5 QPS and 10 Burst","pos":"virt-handler.go:479","timestamp":"2022-10-14T22:32:59.141451Z"}
{"component":"virt-handler","level":"error","msg":"Error starting kvm device plugin","pos":"device_controller.go:69","reason":"error registering with device plugin manager: rpc error: code = Unknown desc = failed to dial device plugin: context deadline exceeded","timestamp":"2022-10-14T22:33:10.142031Z"}
{"component":"virt-handler","level":"error","msg":"Error starting tun device plugin","pos":"device_controller.go:69","reason":"error registering with device plugin manager: rpc error: code = Unknown desc = failed to dial device plugin: context deadline exceeded","timestamp":"2022-10-14T22:33:10.145136Z"}
{"component":"virt-handler","level":"error","msg":"Error starting vhost-net device plugin","pos":"device_controller.go:69","reason":"error registering with device plugin manager: rpc error: code = Unknown desc = failed to dial device plugin: context deadline exceeded","timestamp":"2022-10-14T22:33:10.159337Z"}
cwilkers commented 1 year ago

Hello @KIRY4 @angelalukic ,

I was able to get KubeVirt and the most recent minikube working, but there is definitely something wrong here that may well come down to my own workstation's OS version. Previous development of the minikube lab was done under Fedora 35 with minikube around 1.25 and an older KubeVirt in the 0.49 to 0.55 range.

I didn't go back that far in my testing, but on Fedora 36 with minikube 1.27.1, I could not get any of the recent releases of KubeVirt to work, even going back to 0.55.1 (using the kvm2 driver).

I did get the podman driver working with F36 and minikube 1.27.1 and the latest (0.58.0) KubeVirt.

mhenriks commented 1 year ago

Testing with Fedora 36, kvm2, latest stable minikube, and the referenced yaml from the original post (https://kubevirt.io/labs/manifests/vm.yaml).

The issue seems to be related to the masquerade network setting. Changing to bridge fixes the issue.

This appears to be the relevant error in virt-handler log when using masquerade:

{"component":"virt-handler","kind":"","level":"error","msg":"Synchronizing the VirtualMachineInstance failed.","name":"testvm","namespace":"default","pos":"vm.go:1781","reason":"failed to configure vmi network: setup failed, err: failed plugging phase1 at nic 'eth0': Critical network error: Couldn't configure ip nat rules","timestamp":"2022-10-19T15:54:06.792991Z","uid":"3f6205d8-8f9c-45de-bbd5-8e0040b29e1e"}
cwilkers commented 1 year ago

I can confirm that setting to bridge gets me up and running.

Others have mentioned that https://github.com/kubevirt/kubevirt/pull/8451 may fix KubeVirt's device initialization in the 1.25 version of Kubernetes we find in the latest Minikube.

kubevirt-bot commented 1 year ago

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

/lifecycle stale

kubevirt-bot commented 1 year ago

Stale issues rot after 30d of inactivity. Mark the issue as fresh with /remove-lifecycle rotten. Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

/lifecycle rotten

cwilkers commented 1 year ago

I found an additional work-around in minikube that allows bridge and/or masquerade VMs to work. If you add a CNI when installing minikube, e.g. flannel, then masquerade works fine.

minikube start --cni=flannel

I haven't yet researched whether there is an equivalent one-flag option for kind to add a CNI, but I'll create a PR for our minikube quickstart detailing the CNI.
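
(Purely as an untested sketch of what the kind equivalent might look like: there is no single --cni flag, but the default CNI can be disabled in the cluster config and a CNI such as flannel installed afterwards; the flannel manifest location may change between releases.)

# kind-config.yaml
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
networking:
  disableDefaultCNI: true

$ kind create cluster --config kind-config.yaml
$ kubectl apply -f https://github.com/flannel-io/flannel/releases/latest/download/kube-flannel.yml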

/remove-lifecycle rotten

kubevirt-bot commented 1 year ago

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

/lifecycle stale

kubevirt-bot commented 1 year ago

Stale issues rot after 30d of inactivity. Mark the issue as fresh with /remove-lifecycle rotten. Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

/lifecycle rotten

kubevirt-bot commented 11 months ago

Rotten issues close after 30d of inactivity. Reopen the issue with /reopen. Mark the issue as fresh with /remove-lifecycle rotten.

/close

kubevirt-bot commented 11 months ago

@kubevirt-bot: Closing this issue.

In response to [this](https://github.com/kubevirt/kubevirt.github.io/issues/859#issuecomment-1654510338):

> Rotten issues close after 30d of inactivity.
> Reopen the issue with `/reopen`.
> Mark the issue as fresh with `/remove-lifecycle rotten`.
>
> /close

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.