I think you are right about qemu-img unfortunately (but see below).
Using https://southfront.mm.fcix.net/fedora/linux/releases/38/Cloud/x86_64/images/Fedora-Cloud-Base-38-1.6.x86_64.qcow2, from a machine in Beaker, I get:
(times in seconds unless given as minutes)
wget: 14.3
nbdkit (old, nbdcopy): 55.3
nbdkit (new, nbdcopy, connections=16): 15.5
nbdkit (new, nbdcopy, connections=32): 9.9
nbdkit (new, nbdcopy, connections=64): 8.2
nbdkit (old, qemu-img): 5m05
nbdkit (new, qemu-img, connections=16): 4m09
nbdkit (new, qemu-img, connections=32): 4m16
Test command:
time nbdkit -r -U - curl 'https://southfront.mm.fcix.net/fedora/linux/releases/38/Cloud/x86_64/images/Fedora-Cloud-Base-38-1.6.x86_64.qcow2' --run 'nbdcopy -p $uri null:' connections=XX
(for the qemu tests, replace nbdcopy with qemu-img convert ... /tmp/out)
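i.e. something along these lines (the -p and -O raw flags and the /tmp/out path are my assumptions standing in for the elided options):
time nbdkit -r -U - curl 'https://southfront.mm.fcix.net/fedora/linux/releases/38/Cloud/x86_64/images/Fedora-Cloud-Base-38-1.6.x86_64.qcow2' --run 'qemu-img convert -p -O raw "$uri" /tmp/out' connections=XX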
nbdkit has a qcow2 filter now. Using this filter to do the qcow2 decoding, with nbdcopy doing the copy:
nbdkit (new, qcow2dec, nbdcopy, connections=16): 1m25
nbdkit (new, qcow2dec, nbdcopy, connections=32): 41.5
nbdkit (new, qcow2dec, nbdcopy, connections=64): 25.0
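The qcow2dec rows correspond to a test command of roughly this shape (my reconstruction using the nbdkit qcow2dec filter, not quoted from the original comment):
time nbdkit -r -U - --filter=qcow2dec curl 'https://southfront.mm.fcix.net/fedora/linux/releases/38/Cloud/x86_64/images/Fedora-Cloud-Base-38-1.6.x86_64.qcow2' --run 'nbdcopy -p $uri null:' connections=XX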
wget + qemu-img convert to raw takes 20 seconds, but I didn't spend any time optimizing the qcow2dec filter.
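For reference, that baseline is just a local download plus conversion, e.g. (output path is illustrative):
wget 'https://southfront.mm.fcix.net/fedora/linux/releases/38/Cloud/x86_64/images/Fedora-Cloud-Base-38-1.6.x86_64.qcow2'
qemu-img convert -p -O raw Fedora-Cloud-Base-38-1.6.x86_64.qcow2 /tmp/out.raw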
About the command line ...
- I would get rid of --filter=cache --filter=readonly as they are unlikely to be doing anything useful with the new filter.
- --filter=retry is fine, but --filter=retry-request might be used instead, although for the new curl filter it'll make no difference since all requests are now stateless and handled by a background thread.
- -r and --readonly both do the same thing, but it's not a problem to have both.
- -U /tmp/nbdkit.sock --pidfile /tmp/nbdkit.pid - placing these files in well-known locations is risky (insecure temporary file vulnerability). I guess this is in a container so nothing else ought to be running, but maybe defence in depth would be better.
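Putting that advice together, a trimmed-down command line might look something like this (IMAGE_URL and the private socket directory are placeholders, not taken from the report):
# keep the socket and pidfile out of well-known /tmp paths
dir=$(mktemp -d)
nbdkit -r --filter=retry-request \
  -U "$dir/nbdkit.sock" --pidfile "$dir/nbdkit.pid" \
  curl "$IMAGE_URL" connections=32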
@cloudymax I've just released 1.57.0, which should not require the 'cert' hack shown above. It should be similar in speed.
What happened:
There is a massive performance variance between CDI image import methods, which makes the declarative workflow for PVCs/DataVolumes unusable. I'm REALLY trying to make KubeVirt work as a replacement for QEMU wrapped in Bash, but the slowness of drive creation is making it impossible to achieve buy-in from others.
You can find my full setup here: https://github.com/cloudymax/argocd-nvidia-lab/tree/main/kubevirt
Disk configs are specifically here: https://github.com/small-hack/argocd/tree/main/kubevirt/disks
CDI configuration is here: https://github.com/small-hack/argocd/tree/main/kubevirt/cdi
This has been tested across multiple hardware types and cluster configurations:
- Hetzner AX102 instance with a Ryzen 9 7950X3D + NVMe (RAID 0)
- Dell XPS with an i7-11700 + NVMe (no RAID)
- Hetzner instance with an i7-7700 + SATA 6Gb/s SSDs (RAID 0)
- Tested on K3s (bare-metal and in a KVM VPS)
- Tested with local-path and Longhorn storage classes
- The CDI has been tested with NodePort and LoadBalancer service types
I have increased the resources available to the CDI importer, but it had no effect; the process will not consume more resources.
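(Importer resources are normally raised through the CDI CR's podResourceRequirements; something like the following, with example values, assuming the CR is named cdi:)
kubectl patch cdi cdi --type merge -p '
{"spec": {"config": {"podResourceRequirements": {
  "requests": {"cpu": "1", "memory": "2Gi"},
  "limits":   {"cpu": "4", "memory": "8Gi"}}}}}'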
Method: wget + virtctl, Storage Class: local-path, Time: 22 seconds
Method: wget + virtctl, Storage Class: longhorn, Time: 56 seconds
Command used:
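A typical invocation of this method looks roughly like the following (the image URL, DV name, size and storage class are illustrative assumptions, not the exact command used):
wget 'https://southfront.mm.fcix.net/fedora/linux/releases/38/Cloud/x86_64/images/Fedora-Cloud-Base-38-1.6.x86_64.qcow2'
virtctl image-upload dv fedora-base \
  --size=10Gi \
  --image-path=Fedora-Cloud-Base-38-1.6.x86_64.qcow2 \
  --storage-class=local-path \
  --insecure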
When using the declarative method, the process takes over 20 minutes.
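For reference, the declarative method here means creating a DataVolume and letting CDI run the import, along these lines (name, size, storage class and URL are illustrative; the real manifests are in the repos linked above):
kubectl apply -f - <<'EOF'
apiVersion: cdi.kubevirt.io/v1beta1
kind: DataVolume
metadata:
  name: fedora-base
spec:
  source:
    http:
      url: "https://southfront.mm.fcix.net/fedora/linux/releases/38/Cloud/x86_64/images/Fedora-Cloud-Base-38-1.6.x86_64.qcow2"
  pvc:
    storageClassName: local-path
    accessModes:
      - ReadWriteOnce
    resources:
      requests:
        storage: 10Gi
EOF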
Logs attached: Explore-logs-2023-07-15 10 36 08.txt
What you expected to happen:
I would expect this process to be MUCH faster. For example, when performing this task using just plain QEMU + Bash it takes only a couple of seconds, and the VM is booted and ready to log in within ~30 seconds total on the same hardware mentioned above.
Source: https://github.com/cloudymax/Scrap-Metal/blob/main/virtual-machines/vm.sh
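The plain-QEMU path is fast largely because it only creates a copy-on-write overlay on top of the already-downloaded cloud image before booting, roughly (a sketch; the linked vm.sh may differ in detail):
# near-instant: the overlay references the cloud image as a backing file
qemu-img create -f qcow2 -F qcow2 -b Fedora-Cloud-Base-38-1.6.x86_64.qcow2 vm-disk.qcow2 20G
# boot directly from the overlay (cloud-init seed etc. omitted)
qemu-system-x86_64 -enable-kvm -m 4096 -cpu host -drive file=vm-disk.qcow2,if=virtio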
Environment:
- CDI version (kubectl get deployments cdi-deployment -o yaml): v1.54.3
- Kubernetes version (kubectl version): v1.27.3+k3s1
- Kernel version (uname -a): 6.1 and 5.15