Open Ivan-Velickovic opened 5 days ago
I have seen this issue consistently with certain schedules when trying to use the domain scheduler with microkit and libvmm. I built a microkit with these branches: https://github.com/JE-Archer/seL4/tree/microkit and https://github.com/JE-Archer/microkit/tree/domains.
Then I created a domain schedule for the libvmm simple example for qemu_virt_aarch64: https://github.com/Furao/libvmm/tree/domain
This consistently causes the issue posted here after the VM gets stuck waiting on timer IRQs for a couple of minutes with no output and then it starts printing the errors you posted. I think the main problem is that the VM's VCPU TCB was not getting set to any domains. I created this patch to set the VMs VCPU TCB to the same domain as the PD that contains it: https://github.com/JE-Archer/microkit/pull/1
This seems to fix the issue. I tested on QEMU and the ZCU102 with the same schedule.
In the past when the system has been under high load and the guest/VMM hasn't had a chance to run in a while errors like the following can show up:
I can't tell if this is a bug with the VMM or with the kernel since a similar case where the guest hasn't had a chance to run in a while is setting a budget less than the period.
However, this leads to a kernel assertion right now: https://github.com/seL4/seL4/issues/1104.
More investigation required, will post updates to this issue.