Open eroussy opened 3 months ago
Here are my investigations so far:
rtVM vCPUs are running with rt priority on cores 4 and 5. By taking a look at what's also running on these cores, I find:
root@seapath:~# ps -eTo comm,tid,pid,cls,pri,%cpu,psr | grep "[4,5]$"
[..] (Linux core management threads)
kworker/3:3-eve 195349 195349 TS 19 0.0 5
qemu-system-x86 197242 197242 TS 19 87.3 4
call_rcu 197265 197242 TS 19 0.0 4
worker 197266 197242 TS 19 0.0 4
vhost-197242 197268 197242 TS 19 0.0 4
IO mon_iothread 197269 197242 TS 19 0.2 4
CPU 0/KVM 197270 197242 FF 41 87.1 4
CPU 1/KVM 197271 197242 FF 41 0.0 5
worker 197398 197242 FF 41 0.0 4
kvm-nx-lpage-re 197267 197267 TS 19 0.0 4
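A side note on the filter: `grep "[4,5]$"` keeps any line whose last character is 4 or 5 (or a comma), so on a machine with more than ten CPUs it would also keep threads on CPU 14, 15, and so on. A minimal sketch of a stricter filter that compares the whole last column (psr); the helper name `threads_on_cpus` is made up for illustration, and the third sample line is hypothetical:

```shell
# Hypothetical helper: keep only ps lines whose last column (psr) is one of
# the CPUs given as a comma-separated list.
threads_on_cpus() {
    awk -v cpus="$1" '
        BEGIN { n = split(cpus, list, ","); for (i = 1; i <= n; i++) want[list[i]] = 1 }
        ($NF in want)
    '
}

# Live usage on the hypervisor:
#   ps -eTo comm,tid,pid,cls,pri,%cpu,psr | threads_on_cpus 4,5

# Demonstration: two lines from the listing above, plus one hypothetical
# line on CPU 14 that the grep pattern would have kept.
printf '%s\n' \
    'CPU 0/KVM 197270 197242 FF 41 87.1 4' \
    'CPU 1/KVM 197271 197242 FF 41 0.0 5' \
    'example-thread 1000 1000 TS 19 0.0 14' |
    threads_on_cpus 4,5
```

Matching on the last field also avoids surprises from thread names that contain spaces, such as `CPU 0/KVM`.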
The qemu-system-x86 thread is taking too much CPU, so why does it not move to another core?
root@seapath:~# taskset -cp 197242 #qemu-system
pid 197242's current affinity list: 4-7
The affinity list allows it to move, so why doesn't the scheduler put it on another core? I don't know whether this is a libvirt bug or a SEAPATH configuration problem.
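`taskset -cp` on a PID only reports the affinity of that one task (here the main qemu-system-x86 thread); `taskset -a -cp 197242` would cover all of its threads. A minimal sketch that reads the same information per thread from `/proc`, assuming a Linux host; the helper name `thread_affinities` is made up for illustration:

```shell
# Hypothetical helper: print "tid <TAB> name <TAB> allowed CPUs" for every
# thread of the given PID, read from /proc/<pid>/task/<tid>/.
thread_affinities() {
    pid=$1
    for task in /proc/"$pid"/task/*; do
        # Skip threads that vanished or cannot be read
        [ -r "$task/status" ] || continue
        tid=${task##*/}
        comm=$(cat "$task/comm" 2>/dev/null)
        allowed=$(awk '/^Cpus_allowed_list:/ { print $2 }' "$task/status")
        printf '%s\t%s\t%s\n' "$tid" "$comm" "$allowed"
    done
}

# Runs against the current shell so the block works anywhere;
# on the hypervisor this would be: thread_affinities 197242
thread_affinities "$$"
```

This makes it easy to compare the allowed CPUs of the emulator threads with those of the `CPU n/KVM` threads.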
We can control where the VM's management (emulator) thread runs with emulatorpin in libvirt. This can be done either in the XML:
<emulatorpin cpuset='6,7'/>
Or directly on the target with the command virsh emulatorpin rtVM 6,7
Both of these approaches technically solve the problem, but it shouldn't be mandatory to specify this.
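For anyone applying the workaround, here is a minimal sketch that pins the emulator threads and then reads the pinning back. It assumes the domain is running (`--live` fails otherwise); the helper name `pin_emulator` is made up for illustration, and the domain name and CPU list are the ones from this issue:

```shell
# Hypothetical helper: pin the emulator threads of a running domain and
# show the resulting pinning.
pin_emulator() {
    domain=$1
    cpus=$2
    # --live changes the running domain, --config also updates the persistent XML
    virsh emulatorpin "$domain" "$cpus" --live --config || return 1
    # Without a CPU list, virsh emulatorpin prints the current pinning
    virsh emulatorpin "$domain"
}

# Example with the values from this issue:
#   pin_emulator rtVM 6,7
```

Running `virsh vcpupin rtVM` afterwards lists the vCPU pinning, which helps confirm that the emulator and vCPU threads no longer share a core.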
Describe the bug
When deploying an RT and isolated VM, if the core chosen to isolate the VM is the first of the machine-rt slice, the VM will never boot. The associated qemu-system-x86 thread will take 100% of one CPU forever.
To Reproduce
virsh console rtVM
Allowed CPUs in my Ansible inventory:
My RT VM inventory
Expected behavior
The VM must boot. The qemu-system-x86 thread will take 100% of one CPU, but only for a few seconds.
Additional context
On the hypervisor:
The qemu-system-x86 thread responsible for managing the VM always runs on the first allowed CPU (here CPU 4). The VM's vCPU is also pinned to this CPU. I think the two threads interrupt each other and prevent the VM from booting.
Also, the first lines of the top command on the hypervisor:
The qemu-system-x86 thread is taking 100% of the CPU.