virtio-win / kvm-guest-drivers-windows

Windows paravirtualized drivers for QEMU\KVM
https://www.linux-kvm.org/page/WindowsGuestDrivers
BSD 3-Clause "New" or "Revised" License
1.99k stars 382 forks

Virtio scsi windows drivers from flexvdi are ten times faster #428

Open mgiammarco opened 4 years ago

mgiammarco commented 4 years ago

Hello, I am testing Windows VMs on Proxmox (KVM-based). I recently assembled several new clusters with SSD and NVMe disks. Unfortunately, Windows VM disk and virtio SCSI I/O performance seemed quite low to me. I tried the Fedora virtio drivers 0.1.171 and 0.1.173. Then I tried the flexvdi virtio drivers, which seem to be a (32-bit!!!) fork of the 2018 drivers 0.1.160. Testing with CrystalDiskMark, I empirically get a 10x speed improvement. I also tried the original 0.1.160 but did not see any improvement. Is it a bug? Can you help me? Can someone test this behaviour and tell me I am not crazy? Thanks, Mario

vrozenfe commented 4 years ago

Hi Mario, Do you see the same level of difference for both sequential and random reads/writes? Or even better, can you post the performance results for both virtio-win and flexvdi drivers?

Thanks, Vadim.

mgiammarco commented 4 years ago

Hello, all tests are made on:

Windows 10 with flex vdi on nvme:

13813 12864 1850 1690 344 397 18 17

Windows 10 with flex vdi on ssd:

15118 10002 2238 1768 412 335 25 22

Windows 10 with standard virtio on ssd:

2346 2354 1070 1018 19 18 7 6

Windows Server 2019 with standard virtio on ssd:

5803 5206 1038 432 43 38 5 6

Another server with Proxmox and similar options, but Dell-based, with SAS disks, Windows 2012 and old virtio:

8906 1807 3912 1358 328 280 63 58
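To put the "10x" impression next to the posted figures, here is a minimal Python sketch that divides the flexvdi-on-ssd row by the standard-virtio-on-ssd row (both Windows 10). The post does not say what each of the eight positions measures, so the sketch only numbers them and assumes nothing about column names or units.

```python
# Figures copied from the post above, in the order they were given.
# What each position measures is not stated in the post, so positions are
# only numbered here; no column names are assumed.
flexvdi_ssd = [15118, 10002, 2238, 1768, 412, 335, 25, 22]
virtio_ssd = [2346, 2354, 1070, 1018, 19, 18, 7, 6]


def speedups(fast, slow):
    """Return fast/slow for each position, rounded to two decimals."""
    return [round(f / s, 2) for f, s in zip(fast, slow)]


ratios = speedups(flexvdi_ssd, virtio_ssd)
for position, ratio in enumerate(ratios, start=1):
    print(f"value {position}: {ratio}x")
```

On these numbers the ratio is not uniform: it ranges from under 2x at some positions to over 20x at others, so "10x" is best read as a rough average rather than a constant factor.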

mgiammarco commented 4 years ago

It is shocking to me that the Dell server has only gigabit Ethernet for Ceph and still reaches 63 58 in the strictly sequential test (where latency matters most).

mgiammarco commented 4 years ago

I would also like to know if you have some benchmarks of your own to compare.

vrozenfe commented 4 years ago

I usually use IoMeter for heavy performance testing, but decided to give CrystalDiskMark a try this time :)

My testing system is as follows:

Kernel: 5.3.16-300.fc31.x86_64
QEMU: 4.1.50 (v4.1.0-2287-gd0f90e1423)
Guest: Windows 10 Enterprise 32bit, Version: 10.0.18362

QEMU command line:

    sudo $QEMU -cpu host$FLGS -m 2G \
      -smp 4,maxcpus=4,cores=4,threads=1,sockets=1 \
      -usb -device usb-tablet,id=tablet0 \
      -drive file=$IMG,if=none,id=drive-ide0-0-0,cache=off,werror=stop,rerror=stop \
      -device ide-hd,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=1 \
      -netdev tap,id=hostnet0 \
      -device e1000,netdev=hostnet0,mac=52:83:66:77:88:68,id=net0 \
      -boot c -uuid 5b959a71-e33f-4419-97b4-da6fe8fb7062 \
      -rtc driftfix=slew -global kvm-pit.lost_tick_policy=discard \
      -monitor stdio -name TRIM -enable-kvm -vga std $SER \
      -object iothread,id=iothread1 \
      -drive file=$DSK,if=none,media=disk,rerror=stop,werror=stop,readonly=off,cache=none,aio=native,id=drive-hotadd,discard=unmap \
      -device virtio-scsi-pci,id=scsi-hotadd,iothread=iothread1,hotplug=off,packed=off,num_queues=4 \
      -device scsi-disk,drive=drive-hotadd,id=hotadd,bus=scsi-hotadd.0,bootindex=-1,serial=px_data

(Please note that the above command line is not optimal for performance testing, mostly because I was using HPET as the time stamp source at that time. Turning on the Hyper-V time stamp source should give much better results.)

Here are the results:

On top of NVMe disk ([N:0:2:1] disk Samsung SSD 960 EVO 250GB__1 /dev/nvme0n1)


CrystalDiskMark 7.0.0 (C) 2007-2019 hiyohiyo
Crystal Dew World: https://crystalmark.info/

[Read]
Sequential 1MiB (Q= 8, T= 1): 1690.511 MB/s [ 1612.2 IOPS] < 4939.29 us>
Sequential 1MiB (Q= 1, T= 1): 850.461 MB/s [ 811.1 IOPS] < 1212.52 us>
Random 4KiB (Q= 32, T=16): 178.585 MB/s [ 43599.9 IOPS] < 11590.97 us>
Random 4KiB (Q= 1, T= 1): 15.677 MB/s [ 3827.4 IOPS] < 250.32 us>

[Write]
Sequential 1MiB (Q= 8, T= 1): 1490.192 MB/s [ 1421.2 IOPS] < 5593.70 us>
Sequential 1MiB (Q= 1, T= 1): 823.662 MB/s [ 785.5 IOPS] < 1249.83 us>
Random 4KiB (Q= 32, T=16): 173.058 MB/s [ 42250.5 IOPS] < 11793.66 us>
Random 4KiB (Q= 1, T= 1): 21.018 MB/s [ 5131.3 IOPS] < 183.30 us>

Profile: Default
Test: 512 MiB (x5) [Interval: 5 sec]
Date: 2020/01/07 5:20:50
OS: Windows 10 Enterprise [10.0 Build 18362] (x86)
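For anyone comparing reports like the one above, the following Python sketch parses CrystalDiskMark result lines and cross-checks the MB/s column against the IOPS column. It assumes CrystalDiskMark's convention that 1 MB/s is 1,000,000 bytes per second while block sizes are given in MiB/KiB. The regular expression and the helper names (`parse_results`, `mbps_from_iops`) are illustrative and were written against the lines in this thread only, not against every CrystalDiskMark version.

```python
import re

MIB = 1024 * 1024  # bytes in one MiB
KIB = 1024         # bytes in one KiB

# Matches one result entry, whether entries are on separate lines or not.
LINE_RE = re.compile(
    r"(Sequential|Random)\s+(\d+)(MiB|KiB)\s+"
    r"\(Q=\s*(\d+),\s*T=\s*(\d+)\):\s*"
    r"([\d.]+)\s+MB/s\s+\[\s*([\d.]+)\s+IOPS\]\s+<\s*([\d.]+)\s+us>"
)

# The [Read] section of the NVMe report above.
SAMPLE = """
Sequential 1MiB (Q= 8, T= 1): 1690.511 MB/s [ 1612.2 IOPS] < 4939.29 us>
Sequential 1MiB (Q= 1, T= 1): 850.461 MB/s [ 811.1 IOPS] < 1212.52 us>
Random 4KiB (Q= 32, T=16): 178.585 MB/s [ 43599.9 IOPS] < 11590.97 us>
Random 4KiB (Q= 1, T= 1): 15.677 MB/s [ 3827.4 IOPS] < 250.32 us>
"""


def parse_results(text):
    """Return one dict per result entry found in the text."""
    parsed = []
    for m in LINE_RE.finditer(text):
        pattern, size, unit, q, t, mbps, iops, latency = m.groups()
        parsed.append({
            "pattern": pattern,
            "block_bytes": int(size) * (MIB if unit == "MiB" else KIB),
            "queue_depth": int(q),
            "threads": int(t),
            "mbps": float(mbps),
            "iops": float(iops),
            "latency_us": float(latency),
        })
    return parsed


def mbps_from_iops(row):
    """Throughput implied by IOPS and block size, in MB/s (10**6 bytes/s)."""
    return row["iops"] * row["block_bytes"] / 1_000_000


rows = parse_results(SAMPLE)
for row in rows:
    print(row["pattern"], row["mbps"], round(mbps_from_iops(row), 3))
```

The two columns agree closely for every entry, which is a quick sanity check that a pasted report has not been garbled.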

On top of ram disk ([5:0:0:0] disk Linux scsi_debug 0188 /dev/sdd )


CrystalDiskMark 7.0.0 (C) 2007-2019 hiyohiyo
Crystal Dew World: https://crystalmark.info/

[Read]
Sequential 1MiB (Q= 8, T= 1): 5492.204 MB/s [ 5237.8 IOPS] < 1506.27 us>
Sequential 1MiB (Q= 1, T= 1): 1642.436 MB/s [ 1566.3 IOPS] < 622.51 us>
Random 4KiB (Q= 32, T=16): 181.472 MB/s [ 44304.7 IOPS] < 11327.30 us>
Random 4KiB (Q= 1, T= 1): 20.689 MB/s [ 5051.0 IOPS] < 185.60 us>

[Write]
Sequential 1MiB (Q= 8, T= 1): 6170.710 MB/s [ 5884.8 IOPS] < 1337.39 us>
Sequential 1MiB (Q= 1, T= 1): 1591.523 MB/s [ 1517.8 IOPS] < 641.85 us>
Random 4KiB (Q= 32, T=16): 175.151 MB/s [ 42761.5 IOPS] < 11763.70 us>
Random 4KiB (Q= 1, T= 1): 21.155 MB/s [ 5164.8 IOPS] < 181.89 us>

Profile: Default
Test: 512 MiB (x5) [Interval: 5 sec]
Date: 2020/01/07 3:41:30
OS: Windows 10 Enterprise [10.0 Build 18362] (x86)
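The IOPS and latency columns in these reports can be related through Little's law: with N requests kept in flight and an average per-request latency L, throughput is roughly N / L. The Python sketch below applies this to the ram disk [Read] figures. It assumes that CrystalDiskMark keeps Q x T requests outstanding and that the reported latency is an average per request; neither is stated in this thread, so treat the result as a plausibility check rather than an exact identity.

```python
def expected_iops(queue_depth, threads, latency_us):
    """Little's law: outstanding requests divided by average latency in seconds.

    Assumes queue_depth * threads requests are in flight at all times.
    """
    outstanding = queue_depth * threads
    return outstanding / (latency_us * 1e-6)


# (label, Q, T, reported IOPS, reported latency in us), ram disk [Read] above.
ram_disk_read = [
    ("Sequential 1MiB", 8, 1, 5237.8, 1506.27),
    ("Sequential 1MiB", 1, 1, 1566.3, 622.51),
    ("Random 4KiB", 32, 16, 44304.7, 11327.30),
    ("Random 4KiB", 1, 1, 5051.0, 185.60),
]

deviations = []
for label, q, t, reported, latency_us in ram_disk_read:
    estimate = expected_iops(q, t, latency_us)
    deviation = abs(estimate - reported) / reported
    deviations.append(deviation)
    print(f"{label} Q={q} T={t}: reported {reported:.1f}, "
          f"estimated {estimate:.1f} ({deviation:.1%} off)")
```

For these four entries the estimate lands within a few percent of the reported IOPS, which suggests the Random 4KiB (Q=32, T=16) latency of about 11 ms mostly reflects queueing of 512 outstanding requests rather than slow individual I/Os.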

The best results that I got with IoMeter on the same configuration are:

On top of ram disk:
Write: 7316.065516 MB/s (1MB blocks, 4 Workers, Queue Depth 2)
Read: 4967.842813 MB/s (1MB blocks, 4 Workers, Queue Depth 16)

Best, Vadim.

vrozenfe commented 4 years ago

Some more results with the HV time source turned on:

On top of ram disk:

CrystalDiskMark 7.0.0 (C) 2007-2019 hiyohiyo
Crystal Dew World: https://crystalmark.info/

[Read]
Sequential 1MiB (Q= 8, T= 1): 5585.875 MB/s [ 5327.1 IOPS] < 1390.62 us>
Sequential 1MiB (Q= 1, T= 1): 2464.481 MB/s [ 2350.3 IOPS] < 423.48 us>
Random 4KiB (Q= 32, T=16): 170.579 MB/s [ 41645.3 IOPS] < 11882.46 us>
Random 4KiB (Q= 1, T= 1): 19.527 MB/s [ 4767.3 IOPS] < 207.96 us>

[Write]
Sequential 1MiB (Q= 8, T= 1): 6226.033 MB/s [ 5937.6 IOPS] < 1341.79 us>
Sequential 1MiB (Q= 1, T= 1): 2374.059 MB/s [ 2264.1 IOPS] < 439.53 us>
Random 4KiB (Q= 32, T=16): 346.712 MB/s [ 84646.5 IOPS] < 6032.92 us>
Random 4KiB (Q= 1, T= 1): 16.434 MB/s [ 4012.2 IOPS] < 144.81 us>

Profile: Default
Test: 512 MiB (x5) [Interval: 5 sec]
Date: 2020/01/07 8:55:53
OS: Windows 10 Enterprise [10.0 Build 18362] (x86)

On top of NVMe back-end:

CrystalDiskMark 7.0.0 (C) 2007-2019 hiyohiyo
Crystal Dew World: https://crystalmark.info/

[Read]
Sequential 1MiB (Q= 8, T= 1): 1709.249 MB/s [ 1630.1 IOPS] < 4900.59 us>
Sequential 1MiB (Q= 1, T= 1): 1037.930 MB/s [ 989.8 IOPS] < 1007.35 us>
Random 4KiB (Q= 32, T=16): 311.522 MB/s [ 76055.2 IOPS] < 6684.45 us>
Random 4KiB (Q= 1, T= 1): 17.449 MB/s [ 4260.0 IOPS] < 232.72 us>

[Write]
Sequential 1MiB (Q= 8, T= 1): 1493.185 MB/s [ 1424.0 IOPS] < 5597.29 us>
Sequential 1MiB (Q= 1, T= 1): 987.709 MB/s [ 942.0 IOPS] < 1058.33 us>
Random 4KiB (Q= 32, T=16): 395.419 MB/s [ 96537.8 IOPS] < 5283.69 us>
Random 4KiB (Q= 1, T= 1): 26.880 MB/s [ 6562.5 IOPS] < 150.75 us>

Profile: Default
Test: 512 MiB (x5) [Interval: 5 sec]
Date: 2020/01/07 8:47:12
OS: Windows 10 Enterprise [10.0 Build 18362] (x86)
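To see how much the HV time source changed things, the Python sketch below computes the percent change in MB/s between the earlier runs (HPET time source, per the note on the first command line) and the runs in this comment. The MB/s values are copied from the four CrystalDiskMark reports in this thread; the dictionary keys and short test labels are made up for this sketch. Each configuration was run once, so part of any difference may be run-to-run noise.

```python
# Short labels for the four CrystalDiskMark tests, in report order.
TESTS = ["Seq 1MiB Q8T1", "Seq 1MiB Q1T1", "Rnd 4KiB Q32T16", "Rnd 4KiB Q1T1"]

# MB/s with HPET as the time stamp source (first pair of reports).
hpet = {
    "ram disk read": [5492.204, 1642.436, 181.472, 20.689],
    "ram disk write": [6170.710, 1591.523, 175.151, 21.155],
    "nvme read": [1690.511, 850.461, 178.585, 15.677],
    "nvme write": [1490.192, 823.662, 173.058, 21.018],
}

# MB/s with the HV time source turned on (second pair of reports).
hv = {
    "ram disk read": [5585.875, 2464.481, 170.579, 19.527],
    "ram disk write": [6226.033, 2374.059, 346.712, 16.434],
    "nvme read": [1709.249, 1037.930, 311.522, 17.449],
    "nvme write": [1493.185, 987.709, 395.419, 26.880],
}


def pct_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


def compare(before, after):
    """Return {run: [percent change per test]} for runs present in both."""
    return {
        run: [pct_change(b, a) for b, a in zip(before[run], after[run])]
        for run in before
        if run in after
    }


changes = compare(hpet, hv)
for run, values in changes.items():
    for test, value in zip(TESTS, values):
        print(f"{run:15s} {test:16s} {value:+.1f}%")
```

On these figures the Seq 1MiB Q8T1 numbers barely move, Seq 1MiB Q1T1 improves on both back ends (by about half on the ram disk), and the Rnd 4KiB results are mixed, with some values going down.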

Best, Vadim.