kubevirt / kubevirt

Kubernetes Virtualization API and runtime in order to define and manage virtual machines.
https://kubevirt.io
Apache License 2.0
5.57k stars 1.33k forks

qemu-kvm: -blockdev permission denied while opening the pvc device #11002

Closed: snimje closed this issue 9 months ago

snimje commented 9 months ago

What happened: Environment: KubeVirt 0.59.2, Kubernetes 1.25, Ubuntu 22.04, CSI: rook-ceph

While deploying the VM, the following errors appear in the virt-launcher pod log:

{"component":"virt-launcher","level":"error","msg":"Direct IO check failed for /dev/datavolumevolume3","pos":"converter.go:427","reason":"open /dev/datavolumevolume3: permission denied","timestamp":"2024-01-11T08:23:29.283725Z"}

{"component":"virt-launcher","level":"error","msg":"internal error: qemu unexpectedly closed the monitor: 2024-01-11T08:23:30.351153Z qemu-kvm: -blockdev {\"driver\":\"host_device\",\"filename\":\"/dev/datavolumevolume3\",\"node-name\":\"libvirt-2-storage\",\"cache\":{\"direct\":false,\"no-flush\":false},\"auto-read-only\":true,\"discard\":\"unmap\"}: Could not open '/dev/datavolumevolume3': Permission denied","pos":"qemuProcessReportLogError:1971","subcomponent":"libvirt","thread":"82","timestamp":"2024-01-11T08:23:30.402000Z"}

What you expected to happen: The VM should reach the Running state. Instead, it goes into CrashLoopBackOff.

How to reproduce it (as minimally and precisely as possible): On Kubernetes 1.25.15 with KubeVirt 0.59.2, deploy the rook-ceph CSI driver, then create the block pool and StorageClass:

apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: replicapool
  namespace: rook-ceph
spec:
  failureDomain: host
  replicated:
    size: 3
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-ceph-block
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: rook-ceph.rbd.csi.ceph.com
parameters:
  clusterID: rook-ceph
  pool: replicapool
  imageFormat: "2"
  imageFeatures: layering,fast-diff,object-map,deep-flatten,exclusive-lock
  csi.storage.k8s.io/provisioner-secret-name: rook-csi-rbd-provisioner
  csi.storage.k8s.io/provisioner-secret-namespace: rook-ceph
  csi.storage.k8s.io/controller-expand-secret-name: rook-csi-rbd-provisioner
  csi.storage.k8s.io/controller-expand-secret-namespace: rook-ceph
  csi.storage.k8s.io/node-stage-secret-name: rook-csi-rbd-node
  csi.storage.k8s.io/node-stage-secret-namespace: rook-ceph
  csi.storage.k8s.io/fstype: ext4
allowVolumeExpansion: true
reclaimPolicy: Delete

Deploy the VM with the following manifest:

apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  labels:
    kubevirt.io/vm: vm-cloudinit-3
  name: vm-cloudinit-3
  annotations:
    kubevirt.io/libvirt-log-filters: "2:qemu.qemu_monitor 3:*"
spec:
  dataVolumeTemplates:
  - metadata:
      name: vm-node-3-dv
    spec:
      pvc:
        accessModes:
        - ReadWriteMany
        resources:
          requests:
            storage: 15G
        storageClassName: rook-ceph-block
        volumeMode: Block
      source:
        http:
          url: http://cloud.centos.org/centos/7/images/CentOS-7-x86_64-GenericCloud-2111.qcow2
  running: true
  template:
    metadata:
      labels:
        kubevirt.io/vm: vm-cloudinit-3
    spec:
      domain:
        cpu:
          cores: 1
        devices:
          disks:
          - disk:
              bus: virtio
            name: datavolumevolume3
          - disk:
              bus: virtio
            name: cloudinit
          interfaces:
          - masquerade: {}
            name: default
        machine:
          type: ""
        resources:
          requests:
            memory: 2G
      terminationGracePeriodSeconds: 0
      volumes:
      - dataVolume:
          name: vm-node-3-dv
        name: datavolumevolume3
      - cloudInitNoCloud:
          userData: |-
            #cloud-config
            password: YourPassword
            ssh_pwauth: True
            manage_etc_hosts: true
            chpasswd: { expire: False }
        name: cloudinit
      networks:
      - name: default
        pod: {}

Additional context: The VM reaches the Running state when the PVC's accessMode is set to ReadWriteOnce and volumeMode is not Block.
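
For comparison, the working variant described above would correspond to a dataVolumeTemplates entry along these lines. This is a sketch, not the exact manifest that was tested: only accessModes and volumeMode differ from the failing manifest, and volumeMode: Filesystem is an assumption based on it being the only alternative to Block.

```yaml
dataVolumeTemplates:
- metadata:
    name: vm-node-3-dv
  spec:
    pvc:
      accessModes:
      - ReadWriteOnce          # ReadWriteMany in the failing manifest
      resources:
        requests:
          storage: 15G
      storageClassName: rook-ceph-block
      volumeMode: Filesystem   # assumed; Block in the failing manifest
    source:
      http:
        url: http://cloud.centos.org/centos/7/images/CentOS-7-x86_64-GenericCloud-2111.qcow2
```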


vasiliy-ul commented 9 months ago

Hi @snimje, what is your Kubernetes distro, and which container runtime are you using? You can try following this doc: https://github.com/kubevirt/containerized-data-importer/blob/main/doc/block_cri_ownership_config.md

Also, since it is an Ubuntu host, it's worth checking whether AppArmor is blocking something.

matthewei commented 9 months ago

I think you can run the virt-launcher pod as root.

vasiliy-ul commented 9 months ago

Running VMs as root is not recommended in production. Besides, support for root VMs will be dropped at some point.

snimje commented 9 months ago

I had commented out a line in my NAD (NetworkAttachmentDefinition) manifest before applying it. The manifest applied without complaint, but the hash character of the comment caused trouble when the pod was created, and I had overlooked that in the logs. I recreated the NAD after removing the commented line, and the pod has come online. Thank you for chiming in and sharing your thoughts about my issue.
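
The NAD manifest itself was not posted, so the following is only a hypothetical sketch of one way a commented-out line can cause this kind of trouble. The name `example-net`, the bridge plugin, and the bridge names are all made up for illustration. In a NetworkAttachmentDefinition, `spec.config` is a string holding JSON. Inside a YAML block scalar a leading `#` is literal text, not a YAML comment, so the line stays in the string and makes the JSON invalid. Such a manifest may still apply cleanly and only fail later, when the CNI configuration is parsed during pod creation.

```yaml
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: example-net            # hypothetical name
spec:
  # A '#' at this level is a real YAML comment and is dropped by the parser.
  config: |-
    {
      "cniVersion": "0.3.1",
      "type": "bridge",
      # "bridge": "br0",
      "bridge": "br1"
    }
# The '# "bridge": "br0",' line above is part of the config string,
# so the resulting JSON is invalid. Deleting that line fixes the config.
```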