virtio-win / kvm-guest-drivers-windows

Windows paravirtualized drivers for QEMU\KVM
https://www.linux-kvm.org/page/WindowsGuestDrivers
BSD 3-Clause "New" or "Revised" License

Windows 10 NetKVM problems #482

Closed Onepamopa closed 2 years ago

Onepamopa commented 4 years ago

Hello,

I'm experiencing problems with NetKVM - the network is working, however, Windows tray shows "No internet" after some time.

The only "fix" is to "reset" the network adapter, which makes it disappear completely and needs a reboot. After reboot, the network is detected normally again for a few hours, or a day, until it again starts displaying "No internet".

I've tried the following "releases" from https://fedorapeople.org/groups/virt/virtio-win/direct-downloads/archive-virtio/ : 0.1.171, 0.1.173, 0.1.173-9, 0.1.185, 0.1.185-2

The problem appears in all of them.

If you need additional information - I'll be happy to provide it.

ybendito commented 4 years ago

@Onepamopa First of all, what is your host system and what is the exact qemu command line? Note that it does not matter which driver build >= 160 you use.

Onepamopa commented 4 years ago

Host is Proxmox (PVE), here is the command line of the VM in mention:

/usr/bin/kvm -id 100 -name Windows-VM -chardev socket,id=qmp,path=/var/run/qemu-server/100.qmp,server,nowait -mon chardev=qmp,mode=control -chardev socket,id=qmp-event,path=/var/run/qmeventd.sock,reconnect=5 -mon chardev=qmp-event,mode=control -pidfile /var/run/qemu-server/100.pid -daemonize -smbios type=1,uuid=d8dd0364-8f2f-4031-aa2e-bd12e7dbfd24 -drive if=pflash,unit=0,format=raw,readonly,file=/usr/share/pve-edk2-firmware//OVMF_CODE.fd -drive if=pflash,unit=1,format=raw,id=drive-efidisk0,size=131072,file=/dev/NVME/vm-100-disk-0 -smp 16,sockets=1,cores=16,maxcpus=16 -nodefaults -boot menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg -vga none -nographic -no-hpet -cpu host,+aes,+hv-tlbflush,hv_ipi,hv_relaxed,hv_reset,hv_runtime,hv_spinlocks=0x1fff,hv_stimer,hv_synic,hv_time,hv_vapic,hv_vendor_id=proxmox,hv_vpindex,+ibpb,kvm=off,+kvm_pv_eoi,+kvm_pv_unhalt,+pdpe1gb -m 32768 -object iothread,id=iothread-virtio0 -readconfig /usr/share/qemu-server/pve-q35-4.0.cfg -device vmgenid,guid=ce75b6bd-2304-4035-8987-dc11af2a0456 -device nec-usb-xhci,id=xhci,bus=pci.1,addr=0x1b -device usb-tablet,id=tablet,bus=ehci.0,port=1 -device vfio-pci,host=0000:01:00.0,id=hostpci0.0,bus=ich9-pcie-port-1,addr=0x0.0,multifunction=on,romfile=/usr/share/kvm/GP104-1070-KVM.rom -device vfio-pci,host=0000:01:00.1,id=hostpci0.1,bus=ich9-pcie-port-1,addr=0x0.1 -device vfio-pci,host=0000:44:00.1,id=hostpci1,bus=ich9-pcie-port-2,addr=0x0,rombar=0 -device vfio-pci,host=0000:44:00.3,id=hostpci2,bus=ich9-pcie-port-3,addr=0x0,rombar=0 -device usb-host,bus=xhci.0,hostbus=5,hostport=5.3,id=usb1 -chardev socket,path=/var/run/qemu-server/100.qga,server,nowait,id=qga0 -device virtio-serial,id=qga0,bus=pci.0,addr=0x8 -device virtserialport,chardev=qga0,name=org.qemu.guest_agent.0 -object rng-random,filename=/dev/hwrng,id=rng0 -device virtio-rng-pci,rng=rng0,max-bytes=8192,period=500,bus=pci.1,addr=0x1d -iscsi initiator-name=iqn.1993-08.org.debian:01:29961e2a492a -drive 
file=/var/lib/vz/template/iso/virtio-win-0.1.171.iso,if=none,id=drive-ide2,media=cdrom,aio=threads -device ide-cd,bus=ide.1,unit=0,drive=drive-ide2,id=ide2 -device virtio-scsi-pci,id=virtioscsi2,bus=pci.3,addr=0x3 -drive file=/dev/disk/by-id/ata-ST5000DM000-1FK178_W4J04EPK,if=none,id=drive-scsi2,format=raw,cache=none,aio=native,detect-zeroes=on -device scsi-hd,bus=virtioscsi2.0,channel=0,scsi-id=0,lun=2,drive=drive-scsi2,id=scsi2 -drive file=/dev/NVME/vm-100-disk-1,if=none,id=drive-virtio0,format=raw,cache=none,aio=native,detect-zeroes=on -device virtio-blk-pci,drive=drive-virtio0,id=virtio0,bus=pci.0,addr=0xa,iothread=iothread-virtio0,bootindex=101 -netdev type=tap,id=net0,ifname=tap100i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on,queues=8 -device virtio-net-pci,mac=EA:B1:E9:48:A3:DE,netdev=net0,bus=pci.0,addr=0x12,id=net0,vectors=18,mq=on -rtc driftfix=slew,base=localtime -machine type=q35+pve0 -global kvm-pit.lost_tick_policy=discard

ybendito commented 4 years ago

If I'm not mistaken, some problems were reported for proxmox (you can search in closed issues). We never test the driver with it, only with qemu. It looks like at some stage the kernel stops returning transmitted packets, which presents a serious problem for the entire network stack. I suggest trying to turn off vhost for virtio-net.

Onepamopa commented 4 years ago

The problem is .... there is internet, the "kernel" doesn't stop transmitting packets. Windows itself "detects" no internet, when in fact there is a fully-working connection. This makes me think it's a driver issue. I don't know how the drivers are tested, but I'm using a windows VM as a daily driver (meaning it's only restarted when necessary: windows updates, etc.), and this happens sometimes after a few hours, sometimes after a few days, but it always happens and I've no choice but to restart the VM because some applications "think" there is no internet just because windows itself displays that in the tray.

ybendito commented 4 years ago

The only "fix" is to "reset" the network adapter, which makes it disappear completely and needs a reboot

What exactly does this mean? Is it possible to disable the netkvm adapter via device manager? If yes, this is not a problem of returned packets. Does vhost=off make any difference?

Onepamopa commented 4 years ago

Disable/Enable the network adapter --- does nothing, windows tray still shows the "planet" as "no internet", even tho.. there is a working connection, I'm writing this from the VM in mention.

The only "fix" is "reset" the adapter via troubleshooter - which forces the PC to be restarted (the virtio network adapter disappears from "network and sharing center" after reset from troubleshooter).

I have no idea how to set "vhost=off" on the network adapter, this is the PVE configuration for it:

[image: Screenshot_2020-08-05 proxmox - Proxmox Virtual Environment]

[image: virtio]

VM uptime is 2 days, 20 hours 57 minutes, the "no internet" appeared last night, so roughly after 2 days 10 hours uptime. Sometimes it takes longer, sometimes it takes a few hours...

I can also pass-through a physical NIC if you want me to test if this happens on it, but I'm pretty sure it won't.

Onepamopa commented 4 years ago

Okay, managed to get it working again w/o reset/reboot. Will do some more tests before report.

Onepamopa commented 4 years ago

Nope, so far the only "fix" is this: [image: no_inet_fix_reboot]

and a reboot

ybendito commented 4 years ago

@Onepamopa We do not have any expert in proxmox, so I would suggest checking whether the same thing happens with plain qemu. I also suggested turning vhost off (although I have no idea how exactly you do that with proxmox; there is probably some manual for advanced profile editing). Also, it is not certain that this is a proxmox problem. See, for example, https://answers.microsoft.com/en-us/windows/forum/all/windows-shows-no-internet-access-but-my-internet/2e9b593f-c31c-4448-b5d9-6e6b2bd8560c

aderumier commented 4 years ago

@Onepamopa

You can manually disable vhost by editing

/usr/share/perl5/PVE/QemuServer.pm

    if (is_native($arch)) {
        $vhostparam = ',vhost=on' if kernel_has_vhost_net() && $net->{model} eq 'virtio';
    }

(replace on with off)

then "systemctl restart pvedaemon"

and stop/start the vm.

(proxmox kernel is ubuntu kernel 5.4, I don't know if it's a vhost bug or not...)
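
After the stop/start, one way to confirm that the change took effect is to look for the `vhost=` token in the VM's actual command line. The following is a minimal sketch only: `vhost_setting` is a hypothetical helper name, not part of Proxmox or QEMU, and the pidfile path in the usage comment is the one for VM 100 from the command line posted earlier in this thread.

```shell
# Sketch: report the vhost setting found in a QEMU command line.
# vhost_setting is a hypothetical helper, not a Proxmox or QEMU tool.
vhost_setting() {
    # Split the command line on spaces and commas (one token per line),
    # then keep only the vhost=on / vhost=off token, if present.
    printf '%s\n' "$1" | tr ' ,' '\n\n' | grep -E '^vhost=(on|off)$'
}

# Usage against a running VM (assumes the pidfile path for VM 100;
# /proc/PID/cmdline is NUL-separated, so convert NULs to spaces first):
# vhost_setting "$(tr '\0' ' ' < "/proc/$(cat /var/run/qemu-server/100.pid)/cmdline")"
```

If the function prints nothing, no explicit `vhost=` option was passed on that command line.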

Onepamopa commented 4 years ago

Actually, from virtio 0.1.189 everything looks fine (so far).

Onepamopa commented 4 years ago

And @aderumier, as I said - winblows detects "no internet" after X days, but there's fully working connection - I can keep using it w/o reboot, its just that I can't use netflix or weather (some apps check with the windows "internet or not" setting....). This is why I think it's a driver issue. Let's see if virtio 0.1.189 solves that (so far it does).

LacazeThomas commented 4 years ago

I got the same problem as @Onepamopa. I lose internet with the latest driver 0.1.189 and the 3 latest drivers. The only fix is to reboot Windows.

I tried to disable vhost, but the PVE directory doesn't exist in /usr/share/perl5/

I can provide other details if needed, thanks

andaag commented 3 years ago

I've been having this issue too, it's a bit hard to reliably reproduce, but I've been running a vm for a while now without issues. (of course me posting this and thinking I'm done with it - it'll probably break tomorrow)

I think it was related to changing clock setup.

I went from:

  <clock offset='localtime'>
    <timer name='rtc' tickpolicy='catchup'/>
    <timer name='pit' tickpolicy='delay'/>
    <timer name='hpet' present='no'/>
    <timer name='tsc' present='yes' mode='native'/>
    <timer name='hypervclock' present='yes'/>
  </clock>

to:

  <clock offset='localtime'>
    <timer name='hpet' present='yes'/>
    <timer name='tsc' present='yes' mode='native'/>
    <timer name='hypervclock' present='yes'/>
  </clock>

Onepamopa commented 3 years ago

My issue was related to using my own DNS server (adblock-related) on 127.0.0.1 configured as such on the interface, as soon as I switched to something else (anything else, really - I'm using the same ad-blocking DNS but on 192.168.0.100 not 127.0.0.1) - problem disappeared.

YanVugenfirer commented 3 years ago

@vrozenfe Can you please advise on the optimal timer settings for VM?

Thanks, Yan.

vrozenfe commented 3 years ago

It mostly depends on the Windows version. The combination of hv_time + rtc + no-hpet gave the best results for pre-1803 Win10. For 1803 (and I guess for more recent versions) rtc is no longer optimal: https://bugzilla.redhat.com/show_bug.cgi?id=1610461#c39
If possible, you should consider switching to hv_stimer, rather than RTC or HPET.

Best, Vadim.

andaag commented 3 years ago

So for recently updated Windows versions we should be running with

<timer name='hpet' present='no'/>

<synic state='on'/>
<stimer state='on'/>

?

(I'm guessing synic and stimer translate into hv_stimer; I haven't found concretely how the xml maps into the qemu args for every option..)

vrozenfe commented 3 years ago

I'm not a libvirt person, but I guess you are right. You can check it with "ps -aux" after all.

Best, Vadim.
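
The check suggested above can be sketched as a small filter that pulls the Hyper-V enlightenment flags out of a QEMU command line. This is only an illustration: `hv_flags` is a hypothetical helper name, and the `grep` pattern in the usage comment assumes the process command line contains "qemu" (on Proxmox the binary is `/usr/bin/kvm`, so the pattern would need adjusting).

```shell
# Sketch: list the hv_* / hv-* tokens present in a QEMU command line.
# hv_flags is a hypothetical helper, not part of QEMU or libvirt.
hv_flags() {
    # One token per line, drop a leading '+', keep tokens starting with hv_ or hv-.
    printf '%s\n' "$1" | tr ' ,' '\n\n' | sed 's/^+//' | grep -E '^hv[_-]'
}

# Usage against running processes (the '[q]emu' pattern is an assumption):
# hv_flags "$(ps -eo args | grep '[q]emu')"
```

If `<synic state='on'/>` and `<stimer state='on'/>` took effect, the output would be expected to include `hv_synic` and `hv_stimer` (or their `hv-` spellings), as in the `-cpu` argument of the command line posted earlier in this thread.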