kubevirt / containerized-data-importer

Data Import Service for kubernetes, designed with kubevirt in mind.
Apache License 2.0

nbdkit: curl[1]: error: pread: HTTP response code said error: The requested URL returned error: 416 #3543

Open Icedroid opened 5 days ago

Icedroid commented 5 days ago

What happened: CDI uses nbdkit + curl to download a large image file (6.6G), and the import fails with an error.

I1123 14:00:03.965068       7 importer.go:107] Starting importer
I1123 14:00:03.965718       7 importer.go:182] begin import process
I1123 14:00:03.965760       7 http-datasource.go:422] Attempting to HEAD "http://image-cache-old.ecf-production/ecx/public/windows_server_2019_x64_cn.qcow2" via http client
I1123 14:00:03.968045       7 http-datasource.go:434] GO CLIENT: key: Etag, value: ["ca465e790dfe115dc9b76789f47bf54b-7"]
I1123 14:00:03.968056       7 http-datasource.go:434] GO CLIENT: key: Last-Modified, value: [Wed, 17 Aug 2022 06:37:59 GMT]
I1123 14:00:03.968059       7 http-datasource.go:434] GO CLIENT: key: Connection, value: [keep-alive]
I1123 14:00:03.968062       7 http-datasource.go:434] GO CLIENT: key: Accept-Ranges, value: [bytes bytes]
I1123 14:00:03.968065       7 http-datasource.go:434] GO CLIENT: key: Server, value: [nginx/1.22.1]
I1123 14:00:03.968068       7 http-datasource.go:434] GO CLIENT: key: Date, value: [Sat, 23 Nov 2024 14:00:03 GMT]
I1123 14:00:03.968070       7 http-datasource.go:434] GO CLIENT: key: Content-Type, value: [application/octet-stream]
I1123 14:00:03.968072       7 http-datasource.go:434] GO CLIENT: key: Content-Length, value: [7003701248]
I1123 14:00:03.968076       7 http-datasource.go:454] Content length: 7003701248
I1123 14:00:03.968087       7 http-datasource.go:347] Attempting to get object "http://image-cache-old.ecf-production/ecx/public/windows_server_2019_x64_cn.qcow2" via http client
I1123 14:00:03.968300       7 data-processor.go:348] Calculating available size
I1123 14:00:03.968793       7 data-processor.go:356] Checking out block volume size.
I1123 14:00:03.968804       7 data-processor.go:373] Target size 107374182400.
I1123 14:00:03.968821       7 format-readers.go:99] constructReaders: checking compression and archive formats
I1123 14:00:03.968856       7 format-readers.go:108] found header of type "qcow2"
I1123 14:00:03.968874       7 nbdkit.go:288] Start nbdkit with: ['--foreground' '--readonly' '--exit-with-parent' '-U' '/tmp/nbdkit.sock' '--pidfile' '/tmp/nbdkit.pid' '--filter=readahead' '--filter=retry' '-r' 'curl' 'header=User-Agent: cdi-nbdkit-importer' 'retry-exponential=no' 'url=http://image-cache.test/public/windows_server_2019_x64_cn.qcow2']
I1123 14:00:03.969080       7 nbdkit.go:348] Waiting for nbdkit PID.
I1123 14:00:04.469695       7 nbdkit.go:369] nbdkit ready.
I1123 14:00:04.469714       7 data-processor.go:247] New phase: Convert
I1123 14:00:04.469724       7 data-processor.go:253] Validating image
I1123 14:00:04.469738       7 prlimit.go:129] Setting CPU limit to 30
I1123 14:00:04.469970       7 prlimit.go:163] Setting Address space limit to 1073741824
I1123 14:00:04.475115       7 nbdkit.go:332] Log line from nbdkit: nbdkit: curl[1]: error: readahead: warning: underlying plugin does not support NBD_CMD_CACHE or PARALLEL thread model, so the filter won't do anything
I1123 14:00:04.558585       7 nbdkit.go:332] Log line from nbdkit: nbdkit: curl[1]: error: pread: HTTP response code said error: The requested URL returned error: 416
I1123 14:00:06.598542       7 nbdkit.go:332] Log line from nbdkit: nbdkit: curl[1]: error: pread: HTTP response code said error: The requested URL returned error: 416
I1123 14:00:08.638301       7 nbdkit.go:332] Log line from nbdkit: nbdkit: curl[1]: error: pread: HTTP response code said error: The requested URL returned error: 416
I1123 14:00:10.756889       7 nbdkit.go:332] Log line from nbdkit: nbdkit: curl[1]: error: pread: HTTP response code said error: The requested URL returned error: 416
I1123 14:00:12.835941       7 nbdkit.go:332] Log line from nbdkit: nbdkit: curl[1]: error: pread: HTTP response code said error: The requested URL returned error: 416
I1123 14:00:14.956989       7 nbdkit.go:332] Log line from nbdkit: nbdkit: curl[1]: error: pread: HTTP response code said error: The requested URL returned error: 416
E1123 14:00:14.957559       7 prlimit.go:178] qemu-img failed output is:
E1123 14:00:14.957570       7 prlimit.go:179] 
E1123 14:00:14.957576       7 prlimit.go:180] qemu-img: Could not open 'nbd+unix:///?socket=/tmp/nbdkit.sock': Could not read L1 table: Input/output error

E1123 14:00:14.957593       7 prlimit.go:156] failed to kill the process; os: process already finished
E1123 14:00:14.957653       7 data-processor.go:243] qemu-img: Could not open 'nbd+unix:///?socket=/tmp/nbdkit.sock': Could not read L1 table: Input/output error
, qemu-img execution failed: exit status 1 Log line from nbdkit: nbdkit: curl[1]: error: readahead: warning: underlying plugin does not support NBD_CMD_CACHE or PARALLEL thread model, so the filter won't do anythingLog line from nbdkit: nbdkit: curl[1]: error: pread: HTTP response code said error: The requested URL returned error: 416Log line from nbdkit: nbdkit: curl[1]: error: pread: HTTP response code said error: The requested URL returned error: 416Log line from nbdkit: nbdkit: curl[1]: error: pread: HTTP response code said error: The requested URL returned error: 416Log line from nbdkit: nbdkit: curl[1]: error: pread: HTTP response code said error: The requested URL returned error: 416Log line from nbdkit: nbdkit: curl[1]: error: pread: HTTP response code said error: The requested URL returned error: 416Log line from nbdkit: nbdkit: curl[1]: error: pread: HTTP response code said error: The requested URL returned error: 416
Unable to convert source data to target format
kubevirt.io/containerized-data-importer/pkg/importer.(*DataProcessor).initDefaultPhases.func6
        pkg/importer/data-processor.go:207
kubevirt.io/containerized-data-importer/pkg/importer.(*DataProcessor).ProcessDataWithPause
        pkg/importer/data-processor.go:240
kubevirt.io/containerized-data-importer/pkg/importer.(*DataProcessor).ProcessData
        pkg/importer/data-processor.go:149
main.handleImport
        cmd/cdi-importer/importer.go:188
main.main
        cmd/cdi-importer/importer.go:148
runtime.main
        GOROOT/src/runtime/proc.go:271
runtime.goexit
        src/runtime/asm_amd64.s:1695

I exec'd into the importer-dv pod and enabled the nbdkit debug log:

nbdkit: curl[1]: debug: handshake complete, processing requests serially
nbdkit: curl[1]: debug: curl: pread count=512 offset=0
nbdkit: curl[1]: debug: curl: pread count=112 offset=0
nbdkit: curl[1]: debug: curl: pread count=192 offset=196608
nbdkit: curl[1]: error: pread: curl_easy_perform: HTTP response code said error: The requested URL returned error: 416 Requested Range Not Satisfiable
nbdkit: curl[1]: debug: sending error reply: Input/output error
nbdkit: curl[1]: debug: client sent NBD_CMD_DISC, closing connection
nbdkit: curl[1]: debug: curl: finalize
nbdkit: curl[1]: debug: curl: close
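
For reference, the reads in the debug log above can be translated into the byte ranges they would cover. This is a minimal sketch, assuming the plugin maps a pread of `count` bytes at `offset` to an inclusive HTTP byte range (the exact Range header nbdkit sends is not shown in the log); the helper names are hypothetical. Under HTTP range semantics a range is unsatisfiable only when its first byte lies at or beyond the end of the resource, so with the Content-Length of 7003701248 reported by the HEAD request, a server holding the full file would not be expected to answer any of these reads with 416.

```python
# Hypothetical helpers; the offset/count pairs and the content length are
# taken from the logs in this report.

def pread_to_range(offset: int, count: int) -> str:
    """Return an inclusive HTTP Range header value for a read (assumed mapping)."""
    if offset < 0 or count <= 0:
        raise ValueError("offset must be >= 0 and count must be > 0")
    return f"bytes={offset}-{offset + count - 1}"


def is_satisfiable(offset: int, content_length: int) -> bool:
    """A byte range is satisfiable if its first byte is inside the resource."""
    return 0 <= offset < content_length


CONTENT_LENGTH = 7003701248  # from the HEAD response in the importer log
PREADS = [(0, 512), (0, 112), (196608, 192)]  # from the nbdkit debug log

if __name__ == "__main__":
    for offset, count in PREADS:
        print(pread_to_range(offset, count), is_satisfiable(offset, CONTENT_LENGTH))
```

If the origin file really is 7003701248 bytes long, a 416 for the third read would point at something between the client and the file (for example the caching layer) rather than at the file size itself; that is an inference from the numbers, not something confirmed in this thread.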

What you expected to happen: CDI should be able to download the image file and convert it to raw without nbdkit errors.

curl -I http://image-cache.test/public/windows_server_2019_x64_cn.qcow2 returns 200, and the response headers include Accept-Ranges. curl -o http://image-cache.test/public/windows_server_2019_x64_cn.qcow2 downloads the file successfully. curl -H "Range: bytes=0-1024" -I http://image-cache.test/public/windows_server_2019_x64_cn.qcow2 also returns a successful response.
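
The checks above use HEAD requests or a range starting at byte 0, whereas the failing nbdkit read is a GET at offset 196608. The following is a minimal sketch of a probe closer to that read, using only Python's standard library. The function names are hypothetical, the URL is the one from this report, and the inclusive range mapping is an assumption, since the debug log does not show the exact header nbdkit sends.

```python
import urllib.error
import urllib.request

# URL from this report; replace with the endpoint under test.
URL = "http://image-cache.test/public/windows_server_2019_x64_cn.qcow2"


def build_range_request(url, offset, count):
    """Build a GET request for `count` bytes starting at `offset`."""
    req = urllib.request.Request(url, method="GET")
    # Inclusive byte range (assumed mapping of the nbdkit pread).
    req.add_header("Range", f"bytes={offset}-{offset + count - 1}")
    return req


def probe(url, offset, count, timeout=10.0):
    """Return (status, Content-Range header, number of body bytes read)."""
    req = build_range_request(url, offset, count)
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            # Read at most `count` bytes so that a server ignoring the
            # Range header does not stream the whole image.
            body = resp.read(count)
            return resp.status, resp.headers.get("Content-Range"), len(body)
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses, including 416, are reported here.
        return err.code, err.headers.get("Content-Range"), 0
```

Calling `probe(URL, 196608, 192)` from a pod that can reach the service should show whether the server answers that read with 206 and a matching Content-Range, or with 416 as nbdkit reports. Repeating the call is worthwhile because the first and later responses may differ once the cache is populated.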

The k8s service image-cache.test is an nginx instance that caches the qcow2 file. Its nginx.conf is as follows:

    proxy_cache_path /home/share/public_cache levels=1:2 keys_zone=public_cache:512m max_size=100g inactive=365d use_temp_path=off;

    resolver 114.114.114.114 valid=60s;

    server {
      listen 80;
      server_name image-cache.test.svc;

      proxy_set_header X-Real-IP $remote_addr;
      proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
      proxy_set_header X-Nginx-Proxy true;
      add_header Nginx-Cache "$upstream_cache_status";
      proxy_read_timeout 600s;
      proxy_send_timeout 600s;

      location ^~ /public/ {
        proxy_pass http://127.0.0.1:8081/public/;
        proxy_cache public_cache;
        proxy_cache_key  $uri$http_range;
        proxy_cache_valid 206 200 365d;

        proxy_force_ranges on;
        proxy_http_version 1.1;
        proxy_set_header Connection "";

        proxy_set_header Range $http_range;
        proxy_set_header If-Range $http_if_range;

        add_header Accept-Ranges bytes always;
        add_header Content-Length $upstream_http_content_length always;

        proxy_hide_header Content-Range;
        add_header Content-Range $upstream_http_content_range always;
      }
    }

How should the nginx server be configured so that nbdkit + curl can stream the qcow2 file while it is converted and written out as raw?

How to reproduce it (as minimally and precisely as possible): Ceph block storage (PVC in block mode), CDI v1.60.4

---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: dv-test
  namespace: default
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100Gi
  storageClassName: ceph-rbd-block
  volumeMode: Block  
---
apiVersion: v1
kind: Pod
metadata:
  name: importer-dv-test
  namespace: default
  labels:
    app: containerized-data-importer
    app.kubernetes.io/component: storage
    app.kubernetes.io/managed-by: cdi-controller
    cdi.kubevirt.io: importer
    prometheus.cdi.kubevirt.io: 'true'
spec:
  volumes:
    - name: cdi-data-vol
      persistentVolumeClaim:
        claimName: dv-test
  containers:
    - name: importer
      image: quay.io/kubevirt/cdi-importer:v1.60.4
      args: ["-v=3"]
      ports:
        - name: metrics
          containerPort: 8443
          protocol: TCP
      env:
        - name: IMPORTER_SOURCE
          value: http
        - name: IMPORTER_ENDPOINT
          value: >-
            http://image-cache.test/public/windows_server_2019_x64_cn.qcow2
        - name: IMPORTER_CONTENTTYPE
          value: kubevirt
        - name: IMPORTER_IMAGE_SIZE
          value: 12Gi
        - name: OWNER_UID
          value: 4cb27c62-4413-4fed-aa91-392eecd12a3e
        - name: FILESYSTEM_OVERHEAD
          value: '0'
        - name: INSECURE_TLS
          value: 'false'
        - name: IMPORTER_DISK_ID
        - name: IMPORTER_UUID
        - name: IMPORTER_READY_FILE
        - name: IMPORTER_DONE_FILE
        - name: IMPORTER_BACKING_FILE
        - name: IMPORTER_THUMBPRINT
        - name: HTTP_PROXY
        - name: HTTPS_PROXY
        - name: NO_PROXY
        - name: IMPORTER_CURRENT_CHECKPOINT
        - name: IMPORTER_PREVIOUS_CHECKPOINT
        - name: IMPORTER_FINAL_CHECKPOINT
        - name: PREALLOCATION
          value: 'false'
        - name: IMPORTER_PULL_METHOD
          value: node
      resources:
        limits:
          cpu: '16'
          memory: 16Gi
        requests:
          cpu: 100m
          memory: 60M
      volumeDevices:
        - name: cdi-data-vol
          devicePath: /dev/cdi-block-volume
      terminationMessagePath: /dev/termination-log
      terminationMessagePolicy: File
      imagePullPolicy: Always
  restartPolicy: OnFailure
  terminationGracePeriodSeconds: 30
  dnsPolicy: ClusterFirst
  nodeSelector:
    kubernetes.io/os: linux
    kubernetes.io/hostname: 192.168.212.17
  serviceAccountName: default
  serviceAccount: default
  securityContext:
    runAsUser: 0
    fsGroup: 107
  schedulerName: default-scheduler
  tolerations:
    - key: node.kubernetes.io/not-ready
      operator: Exists
      effect: NoExecute
      tolerationSeconds: 300
    - key: node.kubernetes.io/unreachable
      operator: Exists
      effect: NoExecute
      tolerationSeconds: 300
  priority: 0
  enableServiceLinks: true

Additional context: It appears using nbdkit to curl the file could be causing issues.

Suggestion: If I don't use nbdkit with curl, I can't limit the write bandwidth to the Ceph block cluster during the qcow2-to-raw conversion. The high write bandwidth has affected other Ceph block volumes.

Environment:

CDI version (use kubectl get deployments cdi-deployment -o yaml): v1.60.4
Kubernetes version (use kubectl version): v1.24.0 or v1.18.0
DV specification: N/A
Cloud provider or hardware configuration: N/A
OS (e.g. from /etc/os-release): CentOS 7
Kernel (e.g. uname -a): 4.18
Install tools: N/A
Others: ceph:v16.2.7, rook/ceph:v1.9.2

akalenyu commented 5 days ago

So I am a little confused about what the DV request looks like. I see that the IMPORTER_PULL_METHOD is node, but in that case the endpoint should not be

name: IMPORTER_ENDPOINT
          value: >-
            http://image-cache.test/public/windows_server_2019_x64_cn.qcow2

Could you please attach the DataVolume? (node pull does an import from a second container in the same pod, so http://localhost:8100/disk.img)

Icedroid commented 4 days ago

I created the importer pod manually. I want to use nbdkit + curl for qemu-img info and qemu-img convert. I found that the new version requires IMPORTER_PULL_METHOD to be node, so I added the IMPORTER_PULL_METHOD env variable. How should I configure the nginx server to test nbdkit + curl?

akalenyu commented 2 days ago

We were hitting quite a few issues with nbdkit against external mirrors, so the only path that still uses it is the node pull registry import (delegate import to container runtime), where we control the http server that's serving the image.

We don't really support creating the pod manually, there are just too many moving parts. Why do you have to use nbdkit?