kubevirt / containerized-data-importer

Data Import Service for kubernetes, designed with kubevirt in mind.
Apache License 2.0

Importer pod cannot create /data/disk.img because the size is too large for file format raw #2419

Closed Huimintai closed 2 years ago

Huimintai commented 2 years ago

What happened: When I create a VM with a 2T PV as its data volume, the importer pod does not run successfully and reports these errors:

[root@TENCENT64 ~]# kubectl  logs  importer-datavolume1.demo2tb 
I0906 10:39:29.174894       1 importer.go:77] Starting importer
I0906 10:39:29.175383       1 importer.go:279] Space adjusted for filesystem overhead: 3044058071040.
I0906 10:39:29.175400       1 qemu.go:270] creating raw image with size 3044058071040, preallocation false
E0906 10:39:29.190205       1 prlimit.go:174] qemu-img failed output is:
E0906 10:39:29.190220       1 prlimit.go:175] Formatting '/data/disk.img', fmt=raw size=3044058071040

E0906 10:39:29.190224       1 prlimit.go:176] qemu-img: /data/disk.img: The image size is too large for file format 'raw'

E0906 10:39:29.190561       1 importer.go:287] exit status 1
qemu-img execution failed
kubevirt.io/containerized-data-importer/pkg/system.executeWithLimits
        pkg/system/prlimit.go:178
kubevirt.io/containerized-data-importer/pkg/system.ExecWithLimits
        pkg/system/prlimit.go:111
kubevirt.io/containerized-data-importer/pkg/image.(*qemuOperations).CreateBlankImage
        pkg/image/qemu.go:282
kubevirt.io/containerized-data-importer/pkg/image.CreateBlankImage
        pkg/image/qemu.go:271
main.createBlankImage
        cmd/cdi-importer/importer.go:280
main.handleEmptyImage
        cmd/cdi-importer/importer.go:127
main.main
        cmd/cdi-importer/importer.go:109
runtime.main
        GOROOT/src/runtime/proc.go:255
runtime.goexit
        GOROOT/src/runtime/asm_amd64.s:1581
could not create raw image with size 3044058071040 in /data/disk.img
kubevirt.io/containerized-data-importer/pkg/image.(*qemuOperations).CreateBlankImage
        pkg/image/qemu.go:285
kubevirt.io/containerized-data-importer/pkg/image.CreateBlankImage
        pkg/image/qemu.go:271
main.createBlankImage
        cmd/cdi-importer/importer.go:280
main.handleEmptyImage
        cmd/cdi-importer/importer.go:127
main.main
        cmd/cdi-importer/importer.go:109
runtime.main
        GOROOT/src/runtime/proc.go:255
runtime.goexit
        GOROOT/src/runtime/asm_amd64.s:1581

What you expected to happen: Does the data volume have a size limit? I am not sure whether the importer runs this command in the pod:

qemu-img create -f raw /data/disk.img 2T

brybacki commented 2 years ago

Thanks for reporting an issue.

As you can see from the logs, the importer first adjusted the requested size for filesystem overhead: I0906 10:39:29.175383 1 importer.go:279] Space adjusted for filesystem overhead: 3044058071040. It then failed with this error message from qemu-img: E0906 10:39:29.190224 1 prlimit.go:176] qemu-img: /data/disk.img: The image size is too large for file format 'raw'

But to know what is going on, we would need to have some more information.

  1. Versions of CDI and Kubernetes (for CDI, the image name from kubectl describe pod <cdi-deployment-pod> is enough).
  2. Resource YAMLs for the DV and PVC.

brybacki commented 2 years ago

Additional question: why do you create a Filesystem PVC and not a Block PVC?

mhenriks commented 2 years ago

@Huimintai, my guess is that you are running up against filesystem limitations. Is it a shared filesystem like nfs, cephfs, etc? I'd try getting CDI out of the picture and creating a pod with pvc in same storageclass to experiment by creating files with qemu-img and/or truncate.
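
A minimal sketch of that experiment, assuming a throwaway pod that mounts a PVC from the same storage class at /data and an image that ships coreutils truncate. The mount path, the helper name probe_size, and the probe file name are illustrative and do not come from this thread:

```shell
# probe_size DIR SIZE
# Try to create a sparse file of SIZE (e.g. 512G, 1T, 2T) in DIR, report
# whether the filesystem accepted it, and remove the probe file afterwards.
probe_size() {
    dir=$1
    size=$2
    probe="$dir/size-probe.img"   # hypothetical file name
    if truncate -s "$size" "$probe" 2>/dev/null; then
        echo "ok: $size"
        rc=0
    else
        echo "failed: $size"
        rc=1
    fi
    rm -f "$probe"
    return "$rc"
}
```

Running probe_size /data 1T and then probe_size /data 2T inside the pod would show whether the size limit comes from the filesystem itself rather than from CDI or qemu-img.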

Huimintai commented 2 years ago

Thanks for your quick reply. Here are my pod and PVC specifications:

kubectl -n vm-manager describe pods cdi-deployment-647b9db8f5-pnp96
Name:                 cdi-deployment-647b9db8f5-pnp96
Namespace:            vm-manager
Priority:             1000000000
Priority Class Name:  kubevirt-cluster-critical
Node:                 30.64.217.62/30.64.217.62
Start Time:           Tue, 06 Sep 2022 14:52:25 +0800
Labels:               app=containerized-data-importer
                      app.kubernetes.io/component=storage
                      app.kubernetes.io/managed-by=cdi-operator
                      app.kubernetes.io/part-of=vm-manager
                      app.kubernetes.io/version=0.50.x
                      cdi.kubevirt.io=
                      operator.cdi.kubevirt.io/createVersion=v1.45.0
                      pod-template-hash=647b9db8f5
                      prometheus.cdi.kubevirt.io=true
Annotations:          tke.cloud.tencent.com/networks-status:
                        [{
                            "name": "cilium",
                            "interface": "eth0",
                            "ips": [
                                "172.16.0.70"
                            ],
                            "mac": "1e:d9:4c:37:34:6d",
                            "default": true,
                            "dns": {}
                        }]
Status:               Running
IP:                   172.16.0.70
IPs:
  IP:           172.16.0.70
Controlled By:  ReplicaSet/cdi-deployment-647b9db8f5
Containers:
  cdi-controller:
    Container ID:  containerd://17bdec8a3dc3b038e858af7366e0155fafc0bf6b9e2c07f4f1b254c11562f33e
    Image:         registry.tke.com/library/cdi-controller:v1.45.0
    Image ID:      registry.tke.com/library/cdi-controller@sha256:89d33ede0b48a43ad84536edd9a0ba19ca5e85a481d78a55d9d5fe66bb5d1ba4
    Port:          8080/TCP
    Host Port:     0/TCP
    Args:
      -v=1
    State:          Running
      Started:      Tue, 06 Sep 2022 14:52:40 +0800
    Ready:          True
    Restart Count:  0
    Requests:
      cpu:      10m
      memory:   150Mi
    Readiness:  exec [cat /tmp/ready] delay=2s timeout=1s period=5s #success=1 #failure=3
    Environment:
      IMPORTER_IMAGE:           registry.tke.com/library/cdi-importer:v1.45.0
      CLONER_IMAGE:             registry.tke.com/library/cdi-cloner:v1.45.0
      UPLOADSERVER_IMAGE:       registry.tke.com/library/cdi-uploadserver:v1.45.0
      UPLOADPROXY_SERVICE:      cdi-uploadproxy
      PULL_POLICY:              IfNotPresent
      INSTALLER_PART_OF_LABEL:   (v1:metadata.labels['app.kubernetes.io/part-of'])
      INSTALLER_VERSION_LABEL:   (v1:metadata.labels['app.kubernetes.io/version'])
    Mounts:
      /var/run/ca-bundle/cdi-uploadserver-client-signer-bundle from uploadserver-client-ca-bundle (rw)
      /var/run/ca-bundle/cdi-uploadserver-signer-bundle from uploadserver-ca-bundle (rw)
      /var/run/cdi/token/keys from cdi-api-signing-key (rw)
      /var/run/certs/cdi-uploadserver-client-signer from uploadserver-client-ca-cert (rw)
      /var/run/certs/cdi-uploadserver-signer from uploadserver-ca-cert (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-4spjg (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             True 
  ContainersReady   True 
  PodScheduled      True 
Volumes:
  cdi-api-signing-key:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  cdi-api-signing-key
    Optional:    false
  uploadserver-ca-cert:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  cdi-uploadserver-signer
    Optional:    false
  uploadserver-client-ca-cert:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  cdi-uploadserver-client-signer
    Optional:    false
  uploadserver-ca-bundle:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      cdi-uploadserver-signer-bundle
    Optional:  false
  uploadserver-client-ca-bundle:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      cdi-uploadserver-client-signer-bundle
    Optional:  false
  kube-api-access-4spjg:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 60s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 60s
Events:                      <none>

The PV and PVC:

kubectl describe pv pvc-91c6662a-7865-4527-85f6-217f93fbb6c8 
Name:            pvc-91c6662a-7865-4527-85f6-217f93fbb6c8
Labels:          <none>
Annotations:     pv.kubernetes.io/provisioned-by: cephfs.csi.ceph.com
Finalizers:      [kubernetes.io/pv-protection]
StorageClass:    csi-cephfs-sc
Status:          Bound
Claim:           default/datavolume1.demo2tb
Reclaim Policy:  Delete
Access Modes:    RWX
VolumeMode:      Filesystem
Capacity:        3000Gi
Node Affinity:   <none>
Message:         
Source:
    Type:              CSI (a Container Storage Interface (CSI) volume source)
    Driver:            cephfs.csi.ceph.com
    FSType:            
    VolumeHandle:      0001-0024-beb5e502-a2f5-433c-9dba-bbc92293077e-0000000000000001-7bbd70ad-1f9e-11ed-b6a1-de432fedff83
    ReadOnly:          false
    VolumeAttributes:      clusterID=beb5e502-a2f5-433c-9dba-bbc92293077e
                           fsName=00000001-fs
                           mounter=fuse
                           storage.kubernetes.io/csiProvisionerIdentity=1659757852602-8081-cephfs.csi.ceph.com
                           subvolumeName=csi-vol-7bbd70ad-1f9e-11ed-b6a1-de432fedff83
                           subvolumePath=/volumes/csi/csi-vol-7bbd70ad-1f9e-11ed-b6a1-de432fedff83/b3edd070-590a-4f70-8804-5d19577ef379
Events:                <none>

kubectl describe pvc datavolume1.demo2tb
Name:          datavolume1.demo2tb
Namespace:     default
StorageClass:  csi-cephfs-sc
Status:        Bound
Volume:        pvc-91c6662a-7865-4527-85f6-217f93fbb6c8
Labels:        alerts.k8s.io/KubePersistentVolumeFillingUp=disabled
               app=containerized-data-importer
               app.kubernetes.io/component=storage
               app.kubernetes.io/managed-by=cdi-controller
               app.kubernetes.io/part-of=vm-manager
               app.kubernetes.io/version=0.50.x
Annotations:   cdi.kubevirt.io/storage.condition.running: false
               cdi.kubevirt.io/storage.condition.running.message:
                 Unable to create blank image: exit status 1 qemu-img execution failed kubevirt.io/containerized-data-importer/pkg/system.executeWithLimits...
               cdi.kubevirt.io/storage.condition.running.reason: Error
               cdi.kubevirt.io/storage.contentType: kubevirt
               cdi.kubevirt.io/storage.import.importPodName: importer-datavolume1.demo2tb
               cdi.kubevirt.io/storage.import.source: none
               cdi.kubevirt.io/storage.pod.phase: Running
               cdi.kubevirt.io/storage.pod.restarts: 3937
               cdi.kubevirt.io/storage.preallocation.requested: false
               pv.kubernetes.io/bind-completed: yes
               pv.kubernetes.io/bound-by-controller: yes
               volume.beta.kubernetes.io/storage-provisioner: cephfs.csi.ceph.com
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:      3000Gi
Access Modes:  RWX
VolumeMode:    Filesystem
Used By:       importer-datavolume1.demo2tb
Events:
  Type     Reason           Age                      From               Message
  ----     ------           ----                     ----               -------
  Warning  ErrImportFailed  3m34s (x38935 over 21h)  import-controller  Unable to create blank image: exit status 1 qemu-img execution failed kubevirt.io/containerized-data-importer/pkg/system.executeWithLimits   pkg/system/prlimit.go:178 kubevirt.io/containerized-data-importer/pkg/system.ExecWithLimits   pkg/system/prlimit.go:111 kubevirt.io/containerized-data-importer/pkg/image.(*qemuOperations).CreateBlankImage   pkg/image/qemu.go:282 kubevirt.io/containerized-data-importer/pkg/image.CreateBlankImage   pkg/image/qemu.go:271 main.createBlankImage   cmd/cdi-importer/importer.go:280 main.handleEmptyImage   cmd/cdi-importer/importer.go:127 main.main   cmd/cdi-importer/importer.go:109 runtime.main   GOROOT/src/runtime/proc.go:255 runtime.goexit   GOROOT/src/runtime/asm_amd64.s:1581 could not create raw image with size 3044058071040 in /data/disk.img kubevirt.io/containerized-data-importer/pkg/image.(*qemuOperations).CreateBlankImage   pkg/image/qemu.go:285 kubevirt.io/containerized-data-importer/pkg/image.CreateBlankImage   pkg/image/qemu.go:271 main.createBlankImage   cmd/cdi-importer/importer.go:280 main.handleEmptyImage   cmd/cdi-importer/importer.go:127 main.main   cmd/cdi-importer/importer.go:109 runtime.main   GOROOT/src/runtime/proc.go:255 runtime.goexit   GOROOT/src/runtime/asm_amd64.s:1581
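
The 3044058071040 bytes in the importer log line up with the 3000Gi PVC capacity above if 5.5% of the volume is reserved for filesystem overhead. The 5.5% figure is an assumption here (it is not stated in this thread), and adjusted_size is an illustrative helper, not CDI code:

```shell
# adjusted_size BYTES
# Usable image size after reserving an assumed 5.5% for filesystem overhead.
# Integer arithmetic: multiply by 945 and divide by 1000.
adjusted_size() {
    echo $(( $1 * 945 / 1000 ))
}

pvc_bytes=$(( 3000 * 1024 * 1024 * 1024 ))   # 3000Gi from the PVC
adjusted_size "$pvc_bytes"                   # prints 3044058071040
```

The result is roughly 2.77 TiB, so the blank image the importer tries to create is well above 2T.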
Huimintai commented 2 years ago

We use CephFS to provide the storage. I tried creating an nginx pod and then used truncate to create a big file, and finally got this error:

[image: screenshot of the truncate error]

Huimintai commented 2 years ago

@mhenriks @brybacki Thanks for all your help. I have found the root cause: our CephFS is configured so that a single file cannot exceed 1T.
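
If the limit in question is the CephFS max_file_size setting (an assumption; the thread only says that a single file cannot exceed 1T), it can be inspected and raised per filesystem with the ceph CLI. A sketch, with illustrative helper names that are not part of any tool:

```shell
# tib_to_bytes N: convert N TiB to bytes.
tib_to_bytes() {
    echo $(( $1 * 1024 * 1024 * 1024 * 1024 ))
}

# raise_max_file_size FS_NAME TIB
# Show the current max_file_size of a CephFS filesystem, then set it to
# TIB tebibytes. Requires a ceph client with admin access to the cluster.
raise_max_file_size() {
    fs_name=$1
    limit_bytes=$(tib_to_bytes "$2")
    ceph fs get "$fs_name" | grep max_file_size
    ceph fs set "$fs_name" max_file_size "$limit_bytes"
}
```

For example, raise_max_file_size 00000001-fs 4 (using the fsName shown in the PV description) would allow files up to 4 TiB, which is enough for the 3044058071040-byte image from the importer log.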

brybacki commented 2 years ago

@Huimintai thanks for the information.

Closing the issue.

Huimintai commented 2 years ago

@brybacki @mhenriks Hello, I cannot create the VMI successfully with a 2T data volume. It fails with this error:

[image: screenshot of the error]

Could you help take a look at this? Thanks a lot.