kubevirt / containerized-data-importer

Data Import Service for kubernetes, designed with kubevirt in mind.
Apache License 2.0

CDI import from URL is significantly slower than a manual wget + virtctl image-upload #2809

Closed cloudymax closed 1 year ago

cloudymax commented 1 year ago

What happened: There is a massive performance difference between CDI image import methods, which makes the declarative workflow for PVCs/DataVolumes unusable. I'm REALLY trying to make KubeVirt work as a replacement for QEMU wrapped in bash, but the slowness of disk creation is making it impossible to get buy-in from others.

You can find my full setup here: https://github.com/cloudymax/argocd-nvidia-lab/tree/main/kubevirt

Disk configs are specifically here: https://github.com/small-hack/argocd/tree/main/kubevirt/disks

CDI configuration is here: https://github.com/small-hack/argocd/tree/main/kubevirt/cdi

This has been tested across multiple hardware types:

Method: wget + virtctl, Storage Class: local-path, Time: 22 seconds
Method: wget + virtctl, Storage Class: longhorn, Time: 56 seconds

Command used:

export VOLUME_NAME=debian12-pvc
export NAMESPACE="default"
export STORAGE_CLASS="longhorn"
export ACCESS_MODE="ReadWriteMany"
export IMAGE_URL="https://cloud.debian.org/images/cloud/bookworm/daily/latest/debian-12-generic-amd64-daily.qcow2"
export IMAGE_PATH=debian-12-generic-amd64-daily.qcow2
export VOLUME_TYPE=pvc
export SIZE=120Gi
export PROXY_ADDRESS=$(kubectl get svc cdi-uploadproxy-loadbalancer -n cdi -o json | jq --raw-output '.spec.clusterIP')
# alternatively: $(kubectl get svc cdi-uploadproxy -n cdi -o json | jq --raw-output '.spec.clusterIP')

time wget -O $IMAGE_PATH $IMAGE_URL && \
time virtctl image-upload $VOLUME_TYPE $VOLUME_NAME \
    --size=$SIZE \
    --image-path=$IMAGE_PATH \
    --uploadproxy-url=https://$PROXY_ADDRESS:443 \
    --namespace=$NAMESPACE \
    --storage-class=$STORAGE_CLASS \
    --access-mode=$ACCESS_MODE \
    --insecure --force-bind

When using the declarative method, the same import takes over 20 minutes.

Logs attached: Explore-logs-2023-07-15 10 36 08.txt

---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: "debian"
  labels:
    app: containerized-data-importer
  annotations:
    cdi.kubevirt.io/storage.bind.immediate.requested: "true"
    cdi.kubevirt.io/storage.import.endpoint: "https://cloud.debian.org/images/cloud/bookworm/daily/latest/debian-12-generic-amd64-daily.qcow2"
spec:
  storageClassName: local-path
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 120Gi
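
For completeness, the same import can also be expressed as a DataVolume. The manifest below is only a sketch (the name is illustrative); it should go through the same CDI import path as the annotated PVC above:

---
apiVersion: cdi.kubevirt.io/v1beta1
kind: DataVolume
metadata:
  name: debian-dv
  annotations:
    cdi.kubevirt.io/storage.bind.immediate.requested: "true"
spec:
  source:
    http:
      url: "https://cloud.debian.org/images/cloud/bookworm/daily/latest/debian-12-generic-amd64-daily.qcow2"
  storage:
    storageClassName: local-path
    accessModes:
      - ReadWriteOnce
    resources:
      requests:
        storage: 120Gi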

What you expected to happen:

I would expect this process to be MUCH faster. For example, when performing the same task with plain QEMU + Bash it takes only a couple of seconds, and the VM is booted and ready to log in within ~30 seconds total on the same hardware mentioned above.

Source: https://github.com/cloudymax/Scrap-Metal/blob/main/virtual-machines/vm.sh

> bash vm.sh create-cloud-vm test
[2023-07-15 10:55:18] 📞 Setting networking options.
[2023-07-15 10:55:18]  - Static IP selected.
bridge exists
tap0 exists.
[2023-07-15 10:55:18] 🖥 Set graphics options based on gpu presence.
[2023-07-15 10:55:18]  - GPU attached
[2023-07-15 10:55:18] 📂 Creating VM directory.
[2023-07-15 10:55:18]  - Done!
[2023-07-15 10:55:18] 🔐 Create an SSH key for the VM admin user
[2023-07-15 10:55:20]  - Done.
[2023-07-15 10:55:20] ⬇️ Downloading cloud image...
debian-12-generic-amd64 100%[===============================>] 361.08M  67.0MB/s    in 8.3s
[2023-07-15 10:55:29] 📈 Expanding image
[2023-07-15 10:55:29]  - Done!
[2023-07-15 10:55:29] 👤 Generating user data
[2023-07-15 08:55:29] 🔎 Checking for required utilities...
[2023-07-15 08:55:29]  - All required utilities are installed.
[2023-07-15 08:55:29] 📝 Creating user-data file
[2023-07-15 08:55:29] 📝 Checking against the cloud-init schema...
[2023-07-15 08:55:29] Valid cloud-config: user-data.yaml
[2023-07-15 08:55:29]  - Done.
[2023-07-15 10:55:30] 🌱 Generating seed iso containing user-data
[2023-07-15 10:55:30]  - Done!
[2023-07-15 10:55:30] 🌥 Creating cloud-image based VM
[2023-07-15 10:55:30] Watching progress:
[2023-07-15 10:55:54]  - Cloud-init complete.. 22.4 -----+ 3:ce:10:8c:52:c8:eb:9
download_cloud_image(){
  log "⬇️ Downloading cloud image..."
    wget -c -O "$CLOUD_IMAGE_NAME" "$CLOUD_IMAGE_URL" -q --show-progress
}

# Expand the size of the disk image 
expand_cloud_image(){
  log "📈 Expanding image"

  export CLOUD_IMAGE_FILE_TYPE=$(echo "${CLOUD_IMAGE_NAME#*.}")

  case $CLOUD_IMAGE_FILE_TYPE in
    "img")
      echo "img"
      qemu-img create -b ${CLOUD_IMAGE_NAME} -f qcow2 \
          -F qcow2 disk.qcow2 \
          "$DISK_SIZE" 1> /dev/null
      ;;
    "qcow2")
      echo "qcow2"
      qemu-img create -b ${CLOUD_IMAGE_NAME} -f qcow2 \
          -F qcow2 disk.qcow2 \
          "$DISK_SIZE" 1> /dev/null
      ;;
    *)
      echo "error"
      exit
  esac

  log " - Done!"
}
...

# start the cloud-init backed VM
create_ubuntu_cloud_vm(){
  log "🌥 Creating cloud-image based VM"
  if tmux has-session -t "${VM_NAME}_session" 2>/dev/null; then
    echo "session exists"
  else
    tmux new-session -d -s "${VM_NAME}_session"
    tmux send-keys -t "${VM_NAME}_session" "sudo qemu-system-x86_64  \
      -machine accel=kvm,type=q35 \
      -cpu host,kvm="off",hv_vendor_id="null" \
      -smp $SMP,sockets="$SOCKETS",cores="$PHYSICAL_CORES",threads="$THREADS",maxcpus=$SMP \
      -m "$MEMORY" \
      $VGA_OPT \
      $PCI_GPU \
      $NETDEV \
      $DEVICE \
      -object iothread,id=io1 \
      -device virtio-blk-pci,drive=disk0,iothread=io1 \
      -drive if=none,id=disk0,cache=none,format=qcow2,aio=threads,file=disk.qcow2 \
      -drive if=virtio,format=raw,file=seed.img,index=0,media=disk  \
      -bios /usr/share/ovmf/OVMF.fd \
      -usbdevice tablet \
      -vnc $HOST_ADDRESS:$VNC_PORT \
      $@" ENTER
      watch_progress
  fi
}

Environment:

rwmjones commented 1 year ago

I think you are right about qemu-img unfortunately (but see below).

Using https://southfront.mm.fcix.net/fedora/linux/releases/38/Cloud/x86_64/images/Fedora-Cloud-Base-38-1.6.x86_64.qcow2, from a machine in Beaker, I get:

wget: 14.3
nbdkit (old, nbdcopy): 55.3
nbdkit (new, nbdcopy, connections=16): 15.5
nbdkit (new, nbdcopy, connections=32): 9.9
nbdkit (new, nbdcopy, connections=64): 8.2
nbdkit (old, qemu-img): 5m05
nbdkit (new, qemu-img, connections=16): 4m09
nbdkit (new, qemu-img, connections=32): 4m16

(times in seconds unless given in minutes)

Test command:

time nbdkit -r -U - curl 'https://southfront.mm.fcix.net/fedora/linux/releases/38/Cloud/x86_64/images/Fedora-Cloud-Base-38-1.6.x86_64.qcow2' --run 'nbdcopy -p $uri null:' connections=XX

(for qemu tests replace nbdcopy with qemu-img convert ... /tmp/out)

nbdkit has a qcow2 filter now. Using this filter to do the qcow2 decoding, with nbdcopy doing the copy:

nbdkit (new, qcow2dec, nbdcopy, connections=16): 1m25
nbdkit (new, qcow2dec, nbdcopy, connections=32): 41.5
nbdkit (new, qcow2dec, nbdcopy, connections=64): 25.0
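
A sketch of one such invocation (assuming an nbdkit build that ships the qcow2dec filter; the connections value and output path are illustrative):

time nbdkit -r -U - --filter=qcow2dec curl 'https://southfront.mm.fcix.net/fedora/linux/releases/38/Cloud/x86_64/images/Fedora-Cloud-Base-38-1.6.x86_64.qcow2' connections=32 --run 'nbdcopy -p $uri /tmp/out.raw'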

wget + qemu-img convert to raw takes 20 seconds, but I didn't spend any time optimizing the qcow2dec filter.

About the command line ...

I would get rid of --filter=cache --filter=readonly as they are unlikely to be doing anything useful with the new filter.

--filter=retry is fine, but --filter=retry-request might be used instead, although for the new curl plugin it'll make no difference since all requests are now stateless and handled by a background thread.
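
For illustration only (this is not the CDI importer's actual command line), a standalone test with the request-level retry filter might look like:

nbdkit -r -U - --filter=retry-request curl "$IMAGE_URL" --run 'nbdcopy -p $uri null:'   # $IMAGE_URL is a placeholder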

-r and --readonly both do the same thing, but it's not a problem to have both.

-U /tmp/nbdkit.sock --pidfile /tmp/nbdkit.pid - placing these files in well-known locations is risky (insecure temporary file vulnerability). I guess this is in a container so nothing else ought to be running, but maybe defence in depth would be better.
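
One possible defence-in-depth sketch (not what CDI does today) is to create the socket and pidfile in a private directory rather than at fixed paths under /tmp:

tmpdir=$(mktemp -d)          # unpredictable, mode-0700 directory
nbdkit -r curl "$IMAGE_URL" -U "$tmpdir/nbdkit.sock" --pidfile "$tmpdir/nbdkit.pid"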

awels commented 1 year ago

@cloudymax Just released 1.57.0 which should not require the 'cert' hack shown above. It should be similar speed.