virtio-win / kvm-guest-drivers-windows

Windows paravirtualized drivers for QEMU\KVM
https://www.linux-kvm.org/page/WindowsGuestDrivers
BSD 3-Clause "New" or "Revised" License

Crash bug in virtiofs for files over 4M triggered by Backblaze personal backup #776

Closed jmbezekiel closed 1 year ago

jmbezekiel commented 2 years ago

Issue description In v0.1.215-1, v0.1.215-2, v0.1.217-1, and v0.1.217-2, the Backblaze client causes the Virtio-FS service to crash on files larger than roughly 4190000 bytes. With v0.1.204 the service survives, but the Backblaze client reports a 'TEMPORARY_OTHER' error for the file, and the file cannot be backed up.

To Reproduce Using the Backblaze personal backup client to back up the attached G0011522.JPG from a virtio-fs directory reliably triggers the failure. The Virtio-FS service crashes, and the Z: drive disappears. The Backblaze client treats the situation as if Z: were a USB drive that had been disconnected, and carries on, ignoring the rest of the Z: drive.

Using Explorer to copy the file into the VM succeeds. Using Backblaze to back up the file from the copy inside the VM also succeeds.

Expected behavior The service should stay up, and the backup client should be able to read the file and back it up. What appears to happen instead is that the client successfully reads the file in 1MB chunks (presumably to scan for blocks that need to be backed up), then comes around for a second pass and loads the entire file at once. On that second pass the read returns 0 bytes instead of 4190259, which appears to trigger a fault.
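The read pattern described above can be approximated without the Backblaze client. The following is a minimal sketch, not the client's actual logic: it assumes the failing sequence is "sequential 1 MiB reads, then one read request for the whole file", and the function names (`read_in_chunks`, `read_in_one_call`, `check`) are made up for illustration. Python's unbuffered file I/O may not issue exactly the same requests on Windows that the client does, so a clean result here does not rule out the bug.

```python
import os
import sys

CHUNK = 1024 * 1024  # 1 MiB, matching the chunk size described in the report


def read_in_chunks(path, chunk_size=CHUNK):
    """Read the file sequentially in fixed-size chunks; return total bytes read."""
    total = 0
    with open(path, "rb", buffering=0) as f:
        while True:
            data = f.read(chunk_size)
            if not data:
                break
            total += len(data)
    return total


def read_in_one_call(path):
    """Request the whole file with a single unbuffered read; return bytes received."""
    size = os.path.getsize(path)
    with open(path, "rb", buffering=0) as f:
        data = f.read(size)
    return len(data)


def check(path):
    """Compare both read strategies against the size reported by the filesystem."""
    size = os.path.getsize(path)
    chunked = read_in_chunks(path)
    single = read_in_one_call(path)
    return {
        "size": size,
        "chunked": chunked,
        "single": single,
        "ok": chunked == size and single == size,
    }


if __name__ == "__main__":
    if len(sys.argv) > 1:
        print(check(sys.argv[1]))
```

Running it inside the guest against the file on the share (for example, passing `Z:\G0011522.JPG` as the argument) and again against the local copy would show whether the single large read comes back short only on the virtio-fs volume.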

Screenshots The attached files:

G0011522.JPG is the file that will cause the failure.
dir.txt is a listing of the directory of the virtio filesystem. BB backs files up from smallest to largest: G0011525.JPG succeeds, at 4181721 bytes. G0011522.JPG fails at 4190259 bytes.
VFS.log is the logging from virtiofs (DebugFlag set to 0xFFFFFFFF).
Dante.xml.txt is the qemu configuration file.

Host:

VM:

Additional context

Dante.xml.txt dir.txt VFS.log G0011522

/usr/bin/qemu-system-x86_64 -name guest=Dante,debug-threads=on -S -object {"qom-type":"secret","id":"masterKey0","format":"raw","file":"/var/lib/libvirt/qemu/domain-6-Dante/master-key.aes"} -blockdev {"driver":"file","filename":"/usr/share/OVMF/OVMF_CODE.fd","node-name":"libvirt-pflash0-storage","auto-read-only":true,"discard":"unmap"} -blockdev {"node-name":"libvirt-pflash0-format","read-only":true,"driver":"raw","file":"libvirt-pflash0-storage"} -blockdev {"driver":"file","filename":"/var/lib/libvirt/qemu/nvram/Dante_VARS.fd","node-name":"libvirt-pflash1-storage","auto-read-only":true,"discard":"unmap"} -blockdev {"node-name":"libvirt-pflash1-format","read-only":false,"driver":"raw","file":"libvirt-pflash1-storage"} -machine pc-q35-4.2,usb=off,vmport=off,dump-guest-core=off,pflash0=libvirt-pflash0-format,pflash1=libvirt-pflash1-format,memory-backend=pc.ram -accel kvm -cpu Cooperlake,ss=on,vmx=on,pdcm=on,hypervisor=on,tsc-adjust=on,sha-ni=on,umip=on,waitpkg=on,gfni=on,vaes=on,vpclmulqdq=on,rdpid=on,movdiri=on,movdir64b=on,fsrm=on,md-clear=on,avx-vnni=on,xsaves=on,ibpb=on,ibrs=on,amd-stibp=on,amd-ssbd=on,hle=off,rtm=off,avx512f=off,avx512dq=off,avx512cd=off,avx512bw=off,avx512vl=off,avx512vnni=off,avx512-bf16=off,taa-no=off,hv-time=on,hv-relaxed=on,hv-vapic=on,hv-spinlocks=0x1fff -m 8192 -object {"qom-type":"memory-backend-file","id":"pc.ram","mem-path":"/var/lib/libvirt/qemu/ram/libvirt/qemu/6-Dante/pc.ram","share":true,"x-use-canonical-path-for-ramblock-id":false,"size":8589934592} -overcommit mem-lock=off -smp 4,sockets=1,dies=1,cores=4,threads=1 -uuid 095d4640-0dec-45bc-bfca-a527554ad2ff -no-user-config -nodefaults -chardev socket,id=charmonitor,fd=34,server=on,wait=off -mon chardev=charmonitor,id=monitor,mode=control -rtc base=localtime,driftfix=slew -global kvm-pit.lost_tick_policy=delay -no-hpet -no-shutdown -global ICH9-LPC.disable_s3=1 -global ICH9-LPC.disable_s4=1 -boot menu=off,strict=on -device 
pcie-root-port,port=16,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,addr=0x2 -device pcie-root-port,port=17,chassis=2,id=pci.2,bus=pcie.0,addr=0x2.0x1 -device pcie-root-port,port=18,chassis=3,id=pci.3,bus=pcie.0,addr=0x2.0x2 -device pcie-root-port,port=19,chassis=4,id=pci.4,bus=pcie.0,addr=0x2.0x3 -device pcie-root-port,port=20,chassis=5,id=pci.5,bus=pcie.0,addr=0x2.0x4 -device pcie-root-port,port=21,chassis=6,id=pci.6,bus=pcie.0,addr=0x2.0x5 -device pcie-root-port,port=22,chassis=7,id=pci.7,bus=pcie.0,addr=0x2.0x6 -device qemu-xhci,p2=15,p3=15,id=usb,bus=pci.2,addr=0x0 -device virtio-serial-pci,id=virtio-serial0,bus=pci.3,addr=0x0 -blockdev {"driver":"host_device","filename":"/dev/ezekiel-00/dante","node-name":"libvirt-3-storage","auto-read-only":true,"discard":"unmap"} -blockdev {"node-name":"libvirt-3-format","read-only":false,"driver":"raw","file":"libvirt-3-storage"} -device virtio-blk-pci,bus=pci.6,addr=0x0,drive=libvirt-3-format,id=virtio-disk0,bootindex=2 -blockdev {"driver":"file","filename":"/mnt/iso/template/iso/Windows21H2.iso","node-name":"libvirt-2-storage","auto-read-only":true,"discard":"unmap"} -blockdev {"node-name":"libvirt-2-format","read-only":true,"driver":"raw","file":"libvirt-2-storage"} -device ide-cd,bus=ide.0,drive=libvirt-2-format,id=sata0-0-0,bootindex=1 -blockdev {"driver":"file","filename":"/mnt/iso/template/iso/virtio-win-0.1.217-2.iso","node-name":"libvirt-1-storage","auto-read-only":true,"discard":"unmap"} -blockdev {"node-name":"libvirt-1-format","read-only":true,"driver":"raw","file":"libvirt-1-storage"} -device ide-cd,bus=ide.1,drive=libvirt-1-format,id=sata0-0-1,bootindex=3 -chardev socket,id=chr-vu-fs0,path=/var/lib/libvirt/qemu/domain-6-Dante/fs0-fs.sock -device vhost-user-fs-pci,ats=on,id=fs0,chardev=chr-vu-fs0,queue-size=1024,tag=Backups,bus=pci.5,addr=0x0 -netdev tap,fd=35,id=hostnet0,vhost=on,vhostfd=37 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:a2:e5:95,bus=pci.1,addr=0x0 -chardev pty,id=charserial0 
-device isa-serial,chardev=charserial0,id=serial0 -chardev spicevmc,id=charchannel0,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.spice.0 -device usb-tablet,id=input0,bus=usb.0,port=1 -audiodev {"id":"audio1","driver":"spice"} -spice port=5901,addr=127.0.0.1,disable-ticketing=on,seamless-migration=on -device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vram64_size_mb=0,vgamem_mb=16,max_outputs=1,bus=pcie.0,addr=0x1 -device ich9-intel-hda,id=sound0,bus=pcie.0,addr=0x1b -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0,audiodev=audio1 -chardev spicevmc,id=charredir0,name=usbredir -device usb-redir,chardev=charredir0,id=redir0,bus=usb.0,port=2 -chardev spicevmc,id=charredir1,name=usbredir -device usb-redir,chardev=charredir1,id=redir1,bus=usb.0,port=3 -device virtio-balloon-pci,id=balloon0,bus=pci.4,addr=0x0 -sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny -msg timestamp=on

jmbezekiel commented 2 years ago

Also.... tried multiple things inside the XML configuration with no effect:

xiagao commented 2 years ago

Tried with virtio-win-prewhql-0.1-217 on a Win10-64 (21H2) guest. I can upload the photo you provided to Backblaze from Z: (the virtiofs shared volume). I also tried to upload and download a 488M file, and it works fine.

host info: qemu-kvm-4.2.0-60.module+el8.5.0+14545 4.18.0-348.23.1.el8_5.x86_64

jmbezekiel commented 2 years ago

Interesting that RH8 didn't have an issue, but Ubuntu 22.04 did. The difference may be as simple as the host kernel. That said, one of the updates in the past couple of months seems to have addressed the issue. I was able to back up the underlying Linux filesystems without a problem. Some of the files were well over 10GB.

Host:

Distro: Ubuntu 22.04
Kernel version: 5.15.0-47-generic    -- updated kernel version
QEMU version: 6.2.0 (6.2+dfsg-2ubuntu6.3)   -- updated
libvirt version: 8.0 (8.0.0-1ubuntu7.1)   -- updated

VM:

Windows version: 10 - 21H2
Which driver has a problem: Virtio-FS
Driver version: 0.1-221

Thanks for listening. I'm not sure which update actually fixed the issue, but I'm not sure it matters any more, either.

YanVugenfirer commented 2 years ago

Hi,

What version of virtiofsd is used on Ubuntu?

jmbezekiel commented 2 years ago

/usr/lib/qemu/virtiofsd --version
virtiofsd version 6.2.0 (Debian 1:6.2+dfsg-2ubuntu6.3)
Copyright (c) 2003-2021 Fabrice Bellard and the QEMU Project developers
using FUSE kernel interface version 7.33

xiagao commented 2 years ago

I would also like to add my host info.

kernel: kernel-4.18.0-348.23.1.el8_5.x86_64
qemu-kvm: qemu-kvm-4.2.0-60.module+el8.5.0+14545+9e40c7b1.2.x86_64
virtiofsd: rpm -qf /usr/libexec/virtiofsd
qemu-kvm-core-4.2.0-60.module+el8.5.0+14545+9e40c7b1.2.x86_64

christophocles commented 1 year ago

I was having a similar issue with Backblaze on a virtiofs share. The Z: drive would disappear shortly after starting the Backblaze indexing and I would have to reboot the Windows guest to be able to access it again. I assumed it was a problem with the virtio-win drivers and I tried all the same things that @jmbezekiel did to no avail.

Then I closely monitored the debug log for virtiofsd on the host and I noticed the following:

[2022-11-20 04:22:35.654885+0000] [ID: 00000004] unique: 160215, opcode: READDIRPLUS (44), nodeid: 947806, insize: 80, pid: 5100
[2022-11-20 04:22:35.654889+0000] [ID: 00000004] unique: 160215, error: -24 (Too many open files), outsize: 16

Apparently it was trying to open too many files. I then tried running virtiofsd with --rlimit-nofile=1000000 and that did not help either. Then I tried --inode-file-handles=mandatory and my version of virtiofsd did not support this:

nox:/home/christophocles # /usr/libexec/virtiofsd --version
virtiofsd version 7.1.0 (openSUSE Tumbleweed)
Copyright (c) 2003-2022 Fabrice Bellard and the QEMU Project developers
using FUSE kernel interface version 7.36

So then I compiled the Rust version of virtiofsd from this link https://gitlab.com/virtio-fs/virtiofsd and ran it with --inode-file-handles=mandatory and this finally solved the problem!!
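For anyone trying to confirm the same diagnosis, the "error: -24" in the log is the negated Linux errno EMFILE, and the daemon's descriptor usage can be compared with its limit through /proc on the host. The sketch below is only an illustration under those assumptions: the helper names (`errno_name`, `fd_usage`) are invented here, the PID has to be looked up separately, and reading another process's /proc/PID/fd usually requires root or the same user.

```python
import errno
import os


def errno_name(fuse_error):
    """Map a negated errno from a virtiofsd log line (e.g. -24) to its symbolic name."""
    return errno.errorcode.get(abs(fuse_error), "UNKNOWN")


def fd_usage(pid="self"):
    """Return (open_fd_count, soft_limit) for a process, read from Linux /proc.

    soft_limit is None when the limit is reported as "unlimited".
    """
    open_fds = len(os.listdir(f"/proc/{pid}/fd"))
    soft_limit = None
    with open(f"/proc/{pid}/limits") as f:
        for line in f:
            if line.startswith("Max open files"):
                # Fields: "Max", "open", "files", soft limit, hard limit, units
                soft = line.split()[3]
                soft_limit = None if soft == "unlimited" else int(soft)
                break
    return open_fds, soft_limit


if __name__ == "__main__":
    print(errno_name(-24))
    # Replace "self" with the PID of the running virtiofsd process on the host.
    print(fd_usage("self"))
```

If the count climbs toward the limit while the backup client is indexing the share, that is consistent with the daemon holding descriptors for the inodes it has looked up, which is the behavior --inode-file-handles is meant to change.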

YanVugenfirer commented 1 year ago

Hi @christophocles,

Thank you for sharing. The Rust version is the one that gets the latest fixes and should be used going forward.

Best regards, Yan.

YanVugenfirer commented 1 year ago

The Rust virtiofsd is now the default version. Closing.