64kramsystem / qemu-pinning

My QEMU fork with pinning (affinity) support and a few tweaks.

Problem pinning CPU threads #2

Closed Saren-Arterius closed 7 years ago

Saren-Arterius commented 7 years ago

It seems that I can't pin threads either.

For example, the SMP config is 4c8t with 4 vCPUs pinned. I use Windows 10 and Cinebench R15 to test.

In Windows Task Manager, if the affinity of R15 is set to one of CPUs 0-3, this is reflected in the host's htop: the corresponding CPU core goes to 100% usage. If the affinity is set to one of CPUs 4-7, the CPU usage scatters across all of the host's CPU threads. The guest's CPU threads are not pinned to the host's CPU threads at all.

Multi-socket (NUMA) seems to be an even bigger problem; I don't expect such a simple patch to handle it. Perhaps I should move to libvirt altogether?

64kramsystem commented 7 years ago

Can you post (or email, if privacy is a concern):

Saren-Arterius commented 7 years ago

/proc/cpuinfo

with smp being sockets=1,cores=18,threads=2

and pin being -vcpu vcpunum=0,affinity=0 -vcpu vcpunum=1,affinity=1 -vcpu vcpunum=2,affinity=2 -vcpu vcpunum=3,affinity=3 -vcpu vcpunum=4,affinity=4 -vcpu vcpunum=5,affinity=5 -vcpu vcpunum=6,affinity=6 -vcpu vcpunum=7,affinity=7 -vcpu vcpunum=8,affinity=8 -vcpu vcpunum=9,affinity=9 -vcpu vcpunum=10,affinity=10 -vcpu vcpunum=11,affinity=11 -vcpu vcpunum=12,affinity=12 -vcpu vcpunum=13,affinity=13 -vcpu vcpunum=14,affinity=14 -vcpu vcpunum=15,affinity=15 -vcpu vcpunum=16,affinity=16 -vcpu vcpunum=17,affinity=17

sudo ${AUDIO_OPTS} ${aff} qemu-system-x86_64 \
  -name Windows \
  -enable-kvm \
  -machine type=pc,accel=kvm \
  -cpu host,kvm=off,hv_time,hv_relaxed,hv_spinlocks=0x1fff,hv_vpindex,hv_reset,hv_runtime,hv_crash,hv_vendor_id=fuck-ms \
  -smp ${smp} \
  ${pin} \
  -m ${mem} \
  -mem-path /dev/hugepages \
  -soundhw hda \
  -usb \
  ${lightpack} \
  -monitor stdio \
  -vga none \
  -usbdevice host:04a9:1747 \
  -usbdevice host:1004:633e \
  -usbdevice host:045e:028e \
  -device ich9-usb-uhci3,id=uhci \
  -device usb-ehci,id=ehci \
  -device nec-usb-xhci,id=xhci \
  -device virtio-scsi-pci,id=scsi \
  -netdev tap,id=t0,ifname=tap0,script=no,downscript=no,vhost=on \
  -net nic,model=virtio,netdev=t0,id=nic0,macaddr=52:54:00:00:00:01 \
  -rtc base=localtime \
  -boot order=c \
  -drive ${drive} \
  -drive if=virtio,id=drive1,file=/media/d9375f8f-423f-47a8-988f-dd0990c93109/Documents/raw-images/fallback-drive.raw,format=raw,cache=writeback \
  -drive if=pflash,format=raw,file=/media/d9375f8f-423f-47a8-988f-dd0990c93109/Documents/Windows_ovmf_vars_x64.bin \
  -device vfio-pci,host=03:00.0,multifunction=on${GPU_ROMFILE_OPT} \
  -device vfio-pci,host=03:00.1${GPU_ROMFILE_OPT}

64kramsystem commented 7 years ago

Thanks! I will investigate this over the next (solar) week.

64kramsystem commented 7 years ago

So!

I think the problem here is a semantic difference between the CPU-related terms.

The Linux API used by the patch (CPU_SET(3)) uses the term CPU, but I think it does not represent a physical CPU; rather, it is the abstraction of a computing unit made available to the O/S, which, with Hyperthreading, is a hardware thread.
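
To illustrate what "CPU" means at this level, here is a minimal sketch in Python rather than the C used by the patch. `os.sched_getaffinity` and `os.sched_setaffinity` are the standard library's interface to the Linux affinity calls, and the IDs they accept are logical CPUs as seen by the kernel (one per hardware thread when Hyperthreading is enabled). The helper name `pin_current_process` is made up for this example, and the code only runs on Linux.

```python
import os

def pin_current_process(cpu_id):
    """Pin the calling process to one logical CPU and return the resulting mask."""
    allowed = os.sched_getaffinity(0)  # set of logical CPU ids this process may use
    if cpu_id not in allowed:
        raise ValueError(f"logical CPU {cpu_id} is not in {sorted(allowed)}")
    os.sched_setaffinity(0, {cpu_id})
    return os.sched_getaffinity(0)

if __name__ == "__main__":
    original = os.sched_getaffinity(0)
    print("logical CPUs available:", sorted(original))
    print("now pinned to:", sorted(pin_current_process(min(original))))
    os.sched_setaffinity(0, original)  # restore the original mask
```

On a host with Hyperthreading, the set printed first would normally list every hardware thread, not every physical core.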

If this guess is correct, the options you're using:

cores=18,threads=2
vcpu vcpunum=0,affinity=0 ... vcpu vcpunum=17,affinity=17

are wrong, because they make 36 computing units available to the guest, while the pinning assigns only 18 of them (vCPUs 0 to 17).
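
The mismatch can be checked with a bit of arithmetic. The sketch below is only an illustration: `guest_cpu_count` and `pinned_vcpus` are hypothetical helpers, and the parsing is a simplification that only understands the `sockets=`, `cores=` and `threads=` keys quoted in this thread, not the full `-smp` syntax.

```python
import re

def guest_cpu_count(smp):
    """Multiply sockets * cores * threads from an -smp style string.

    Keys that are missing are treated as 1 (a simplifying assumption).
    """
    fields = dict(part.split("=", 1) for part in smp.split(",") if "=" in part)
    return (
        int(fields.get("sockets", 1))
        * int(fields.get("cores", 1))
        * int(fields.get("threads", 1))
    )

def pinned_vcpus(pin):
    """Return the vcpunum values found in a string of -vcpu options."""
    return [int(n) for n in re.findall(r"vcpunum=(\d+)", pin)]

smp = "sockets=1,cores=18,threads=2"
pin = " ".join(f"-vcpu vcpunum={i},affinity={i}" for i in range(18))
print(guest_cpu_count(smp), len(pinned_vcpus(pin)))  # prints: 36 18
```

With these options the guest is given twice as many computing units as there are pinning entries.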

Now, to verify this, first run QEMU with these options:

cores=18,threads=1
vcpu vcpunum=0,affinity=0 ... vcpu vcpunum=17,affinity=17

and verify that the pinning works as expected; once this is done, we can proceed to figure out how to make assignment work correctly down to the hardware thread level.
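
For the hardware-thread step, it helps to know which host logical CPUs share a physical core. The sketch below reads that from the Linux sysfs file `thread_siblings_list`; it is not part of the patch, the function names are invented for illustration, and the parser only handles plain lists and ranges such as `0,18` or `0-1`.

```python
from pathlib import Path

def parse_cpu_list(text):
    """Expand a kernel CPU list such as '0,18' or '0-3,8' into a sorted list of ints."""
    cpus = set()
    for chunk in text.strip().split(","):
        if not chunk:
            continue
        if "-" in chunk:
            low, high = chunk.split("-")
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(chunk))
    return sorted(cpus)

def host_sibling_groups(sysfs_root="/sys/devices/system/cpu"):
    """Group host logical CPUs by physical core, based on thread_siblings_list."""
    groups = set()
    pattern = "cpu[0-9]*/topology/thread_siblings_list"
    for path in Path(sysfs_root).glob(pattern):
        groups.add(tuple(parse_cpu_list(path.read_text())))
    return sorted(groups)

if __name__ == "__main__":
    for group in host_sibling_groups():
        print("logical CPUs sharing one core:", group)
```

Each printed group is one physical core; guest threads that belong to the same guest core would presumably be pinned to the members of one group.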

64kramsystem commented 7 years ago

OK, I think the patch as it stands is not sufficient to fully support hyperthreading. If it were, this combination would work:

-smp sockets=1,cores=1,threads=2
-vcpu vcpunum=0,affinity=0 -vcpu vcpunum=1,affinity=1

but the code complains that no more than 1 VCPU is allowed (which I don't think is correct, as with two threads, there should be 2 processors).

Testing the options suggested in the previous comment:

cores=18,threads=1
vcpu vcpunum=0,affinity=0 ... vcpu vcpunum=17,affinity=17

will still verify that the patch works correctly without threads.
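
The long run of -vcpu options for this test does not have to be typed by hand. A small sketch, assuming the `-vcpu vcpunum=N,affinity=M` syntax quoted in this thread (the README has the current syntax) and a one-to-one mapping by default; `vcpu_options` is a hypothetical helper, not something shipped with the fork.

```python
def vcpu_options(count, affinity_of=lambda vcpu: vcpu):
    """Build the -vcpu options for `count` guest CPUs.

    `affinity_of` maps a guest vCPU number to a host logical CPU number;
    the default pins vCPU N to host CPU N.
    """
    return " ".join(
        f"-vcpu vcpunum={vcpu},affinity={affinity_of(vcpu)}" for vcpu in range(count)
    )

print(vcpu_options(2))
# prints: -vcpu vcpunum=0,affinity=0 -vcpu vcpunum=1,affinity=1
```

Calling `vcpu_options(18)` produces the 18 entries used with cores=18,threads=1.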

64kramsystem commented 7 years ago

@Saren-Arterius I had a look at the patch today: I fixed the max vCPUs problem and added support for threads (and, hopefully, sockets). I've tested it using Cinebench. You can read the up-to-date instructions in the README.md.

64kramsystem commented 7 years ago

Closed by the 2017-10-31 patch.