Open DemiMarie opened 2 years ago
I can confirm that this problem is clearly noticeable. My Fedora AppVM, with up to 8 GB of memory and all cores assigned to it, runs much slower than a native Fedora install on bare metal. Compiling code takes roughly twice as long. I first blamed the lack of hyperthreading, but passing smt=on sched-gran=core to Xen makes no difference in my benchmarks (i7-3632qm):
Benchmark | Slowdown compared to native Fedora
---|---
sysbench cpu run | 60%
sysbench memory run | 1680%
Building some random C++ projects | 58%
Watching 1080p@60fps YouTube videos on Qubes OS boils my laptop and doesn't feel like 60fps. Native Fedora handles this much better, despite using only CPU-based video decoders[1].
Startup latency of apps on Qubes OS is much worse. qvm-run personal echo "hello world" takes almost an entire second.
[1] I'm not 100% sure it's CPU-only. But according to htop it hits all my cores really hard, while still being smoother and much cooler than Qubes OS.
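For anyone wanting to reproduce the latency observation, one rough way to measure the qvm-run round trip from dom0 is simply to time it against an already-running qube (a sketch; "personal" is just the qube name used in this report, and the qube must already be started so VM boot time isn't included):

```shell
# Rough round-trip latency of a qrexec call from dom0 to a running qube.
# --pass-io waits for the command to complete and relay its output.
time qvm-run --pass-io personal true
```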
Startup latency of apps on Qubes OS is much worse. qvm-run personal echo "hello world" takes almost an entire second.
This is definitely a problem and we’re working on it. It won’t ever be as fast as on bare hardware, but our goal is to go from the qvm-run command to the VM spawning init in under 100ms. Runtime CPU performance should be within a few percent of bare silicon, so the fact that it is not is definitely a bug.
I can confirm that this problem is clearly noticeable. My Fedora AppVM, with up to 8 GB of memory and all cores assigned to it, runs much slower than a native Fedora install on bare metal. Compiling code takes roughly twice as long. I first blamed the lack of hyperthreading, but passing smt=on sched-gran=core to Xen makes no difference in my benchmarks (i7-3632qm):
I suggest reverting this, as it is not security supported upstream. The fact that it did not help your benchmarks indicates that it is not likely to be the culprit.
Benchmark | Slowdown compared to native Fedora
---|---
sysbench cpu run | 60%
sysbench memory run | 1680%
Building some random C++ projects | 58%
Yeah that’s not good. For clarification: if native Fedora takes time X to build C++ projects, does this mean Qubes OS takes (X / (1 - 0.58)) time? If you could post the raw benchmark data, that would be very helpful.
sysbench memory run
is particularly concerning. Does turning off memory balancing for the qube help?
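For concreteness, plugging the reported 58% figure into that formula gives roughly a 2.4x build time. A one-liner to check the arithmetic:

```shell
# A 58% slowdown means Qubes time = native time / (1 - 0.58).
awk 'BEGIN { printf "%.2f\n", 1 / (1 - 0.58) }'   # prints 2.38
```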
I can't exactly reproduce this. More information on the workload is needed, I think. In my test, I used Linpack Xtreme. On Qubes, I have SMT enabled (12 vCPUs in my case) and 12 GB of RAM assigned to the VM (although Linpack only uses about 2 GB). The VM is PVH with memory balancing enabled. Everything else is default for Xen and kernel-latest. The benchmark VM is the only VM running while testing. CPU frequency started at about 3.5GHz and went down to ~2.1GHz over the duration of the test. My result was 100 GFlops.
Then I started a Fedora 36 ISO. However, CPU frequency started at 3.9GHz and went down to about 2.4 or 2.5GHz. My result was 112 GFlops.
Perhaps your CPU is not boosting correctly the way mine does?
sysbench memory run
is particularly concerning. Does turning off memory balancing for the qube help?
I turned off memory balancing via Qubes Manager and assigned fixed 10 GB of memory to the Qube. No improvement :confused:
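For reference, the same thing can be done from the dom0 CLI instead of Qubes Manager (a sketch; "personal" is a placeholder qube name, and this assumes the R4.x convention that maxmem=0 opts a qube out of dynamic memory balancing):

```shell
# Give the qube a fixed 10 GB and disable memory balancing (dom0 shell).
qvm-prefs personal memory 10240
qvm-prefs personal maxmem 0   # maxmem=0 turns off dynamic balancing
```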
For clarification: if native Fedora takes time X to build C++ projects, does this mean Qubes OS takes (X / (1 - 0.58)) time?
Yes! I don't have the raw benchmark data anymore, but I'm pretty sure it's the same problem that causes the sysbench deviations.
The result of sysbench inside an AppVM heavily depends on how many other AppVMs are running. Running only a single AppVM brings the results of sysbench cpu run pretty damn close to the values I get on native Fedora. I assume it's some form of scheduling issue. Running only a single AppVM also improves sysbench memory run results, but they're still way off compared to native Fedora.
I tried running the memory benchmark in dom0, where it is significantly faster. Here are the results of sysbench memory run:
Environment | MiB/sec |
---|---|
Native Fedora | 5290 |
dom0 | 4525 |
domU (only 1 AppVM) | 401 |
domU (+7 other AppVMs) | 267 |
I'm not sure if there is anything special about my system. My entire Xen and kernel setup is pretty vanilla. The only deviation is qubes.enable_insecure_pv_passthrough, because I don't have an IOMMU. Enabling/disabling this flag makes no difference.
@AlxHnr Can you try sysbench memory run in a PV VM (not dom0, sys-net, or sys-usb)?
Is the domU a PVH (the default on Qubes, if no PCI devices)? What CPU is that?
Is the domU a PVH (the default on Qubes, if no PCI devices)?
Yes. All my domU's are PVH (default), except sys-usb and sys-net.
@AlxHnr Can you try sysbench memory run in a PV VM (not dom0, sys-net, or sys-usb)?
VM Type | MiB/sec |
---|---|
PVH | 243.81 |
HVM | 216.34 |
PV | 54.41 |
Giving the PV VM more cores and memory makes no difference. PV VMs are slow and laggy to the point of being unusable.
What CPU is that?
i7-3632qm. It supports VT-d, but my motherboard/BIOS/whatever does not.
I hope this problem is not specific to my setup. My goal here is to get to a point where others can reproduce these problems. I don't have much time and care less about temporary fixes for myself. I care more about achieving sane defaults that work for everybody.
I'm seeing about an 8x slowdown in sysbench memory run on a domU PVH vs. dom0 on my ancient quad Sandy Bridge.
B
Is the domU a PVH (the default on Qubes, if no PCI devices)?
Yes. All my domU's are PVH (default), except sys-usb and sys-net.
@AlxHnr Can you try sysbench memory run in a PV VM (not dom0, sys-net, or sys-usb)?
VM Type | MiB/sec
---|---
PVH | 243.81
HVM | 216.34
PV | 54.41

Giving the PV VM more cores and memory makes no difference. PV VMs are slow and laggy to the point of being unusable.
dom0 is itself a PV VM, so that is strange.
@andyhhp: Do you know what could cause such a huge difference between PV dom0 and PV domU? Are superpages only allowed to be used by dom0?
What CPU is that?
i7-3632qm. It supports VT-d, but my motherboard/BIOS/whatever does not.
I hope this problem is not specific to my setup. My goal here is to get to a point where others can reproduce these problems. I don't have much time and care less about temporary fixes for myself. I care more about achieving sane defaults that work for everybody.
Me too.
Are superpages only allowed to be used by dom0?
PV guests cannot use superpages at all. dom0 doesn't get them either.
Do you know what could cause such a huge difference between PV dom0 and PV domU?
Numbers this bad are usually PV-L1TF, and IvyBridge is affected, but Qubes has SHADOW compiled out, so it's not that. Do you have xl dmesg output from the system? I'm rather lost for ideas.
Are superpages only allowed to be used by dom0?
PV guests cannot use superpages at all. dom0 doesn't get them either.
Makes sense, I see that superpage support on PV got ripped out in 2017. Not surprising in retrospect, considering that at least two of the fatal flaws in PV were due to it.
Do you know what could cause such a huge difference between PV dom0 and PV domU?
Numbers this bad are usually PV-L1TF, and IvyBridge is affected, but Qubes has SHADOW compiled out, so it's not that. Do you have xl dmesg output from the system? I'm rather lost for ideas.
@AlxHnr Can you provide xl dmesg output? That should give the Xen log. Please be sure to redact any sensitive information before posting it.
Numbers this bad are usually PV-L1TF, and IvyBridge is affected, but Qubes has SHADOW compiled out, so it's not that. Do you have xl dmesg from the system? I'm rather lost for ideas.
Just as an aside, under R4.0 on Sandy Bridge xl dmesg says:
PV L1TF shadowing: Dom0 disabled, DomU enabled
Just checked R4.1 on i7-8850H and same result.
B
PV L1TF shadowing: Dom0 disabled, DomU enabled
That’s normal. The L1TF mitigation code enables shadow paging if the hypervisor was built with it, or calls domain_crash() otherwise.
@fepitre can you provide an xl dmesg from a machine that has performance problems under Xen?
Just checked R4.1 on i7-8850H and same result.
i7-8750H here, about the same result. xl dmesg
Just checked R4.1 on i7-8850H and same result.
i7-8750H here, about the same result. xl dmesg
Thanks! Would you mind posting sysbench results?
Hmm - sadly nothing helpful there. Not terribly surprising as it's a release hypervisor, but that's no guarantee that a debug Xen would be any more helpful.
As an unrelated observation, @marmarek you can work around:
(XEN) parameter "no-real-mode" unknown!
by backporting xen-project/xen@e44d986084760 and xen-project/xen@e5046fc6e99db which will silence the spurious warning.
I've tried Xen with spec-ctrl=no but nothing changed (7128.94 MiB/sec in dom0, 769.83 MiB/sec in domU).
Relevant Xen messages:
(XEN) Speculative mitigation facilities:
(XEN) Hardware hints: RSBA
(XEN) Hardware features: IBPB IBRS STIBP SSBD L1D_FLUSH MD_CLEAR SRBDS_CTRL
(XEN) Compiled-in support: INDIRECT_THUNK
(XEN) Xen settings: BTI-Thunk JMP, SPEC_CTRL: IBRS- STIBP- SSBD-, Other: SRB_LOCK-
(XEN) L1TF: believed vulnerable, maxphysaddr L1D 46, CPUID 39, Safe address 8000000000
(XEN) Support for HVM VMs: MD_CLEAR
(XEN) Support for PV VMs: MD_CLEAR
(XEN) XPTI (64-bit PV only): Dom0 disabled, DomU disabled (with PCID)
(XEN) PV L1TF shadowing: Dom0 disabled, DomU disabled
When calling sysbench memory run --memory-block-size=16K I get significantly closer numbers (20813.22 MiB/sec in dom0 vs 8135.43 MiB/sec in PVH domU). PV domU performs even worse (5140.72 MiB/sec). The difference between PV dom0 and PV domU surprises me.
When calling sysbench memory run --memory-block-size=16K I get significantly closer numbers (20813.22 MiB/sec in dom0 vs 8135.43 MiB/sec in PVH domU). PV domU performs even worse (5140.72 MiB/sec). The difference between PV dom0 and PV domU surprises me.
It surprises me too. @andyhhp do you have suggestions for debugging this? Is there a way to get stats on TLB misses? I wonder if CPU pinning would help.
[Summary: sysbench's event timing interacts poorly with the high-overhead xen clocksource in PV and some PVH VMs.]
I think we may be seeing a mirage, or rather, a side effect of other system calls being made in parallel with the memory ones.
I played around a bit with strace -f sysbench... and noticed that under domU PV, but not under dom0 PV, I saw an additional 75K lines of strace output with this pattern:
[pid 2717] clock_gettime(CLOCK_MONOTONIC, {tv_sec=10331, tv_nsec=199069018}) = 0
After some additional experimenting and googling, I found that I can get "terrible sysbench results" from PV dom0 by performing the following (as root):
echo "xen" > /sys/devices/system/clocksource/clocksource0/current_clocksource # change to what domU uses
And I can then "restore good sysbench results" from PV dom0 by performing the following (as root):
echo "tsc" > /sys/devices/system/clocksource/clocksource0/current_clocksource # the default for dom0
Here's where it gets even stranger (caveat: testing on two different pieces of hardware):
Under R4.0 (Xen 4.8), PVH domU uses "xen" as the clocksource by default, but it does not have as severe an impact, with performance closer to dom0. Under R4.1 (Xen 4.14), PVH domU uses "xen" as the clocksource by default and appears to be as severely impacted as PV domU, at least on this particular system.
Even more fun: under R4.0, PVH domU only has "xen" available as a clocksource, so I can't reverse the experiment on R4.0. Under R4.1, PVH domU defaults to "xen" but DOES have an available_clocksource of "tsc xen". If I run the echo "tsc" command above inside an R4.1 PVH domU, I suddenly "see good sysbench results".
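The experiments above can be wrapped into one sweep over whatever clocksources a guest offers (a sketch; run as root inside the VM being tested, assumes sysbench is installed, and restores the original clocksource afterwards):

```shell
#!/bin/sh
# Benchmark sysbench memory under each available clocksource.
cs=/sys/devices/system/clocksource/clocksource0
orig=$(cat "$cs/current_clocksource")
for src in $(cat "$cs/available_clocksource"); do
    echo "$src" > "$cs/current_clocksource"
    printf '== clocksource: %s ==\n' "$src"
    sysbench memory run | grep -i 'MiB/sec'
done
echo "$orig" > "$cs/current_clocksource"   # put things back
```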
To reiterate: I don't think this is a memory performance problem.
B
...and noticed that under domU PV, but not under dom0 PV, I saw an additional 75K lines of strace output with this pattern:
[pid 2717] clock_gettime(CLOCK_MONOTONIC, {tv_sec=10331, tv_nsec=199069018}) = 0
Yeah, that is definitely not going to be fast :laughing:. Can you provide concrete numbers? @fepitre would you be willing to see if this helps your problems?
R4.1, invoking sysbench memory run:

dom | VM Type | Clocksource | Measurement
---|---|---|---
dom0 | PV | tsc (default) | 7276.61 MiB/sec
dom0 | PV | xen | 4.79 MiB/sec
domU | PV | xen (default) | 4.56 MiB/sec
domU | PV | tsc | 7012.89 MiB/sec
domU | PVH | xen (default) | 5.19 MiB/sec
domU | PVH | tsc | 7158.07 MiB/sec
Again, working theory is that it's not an actual memory allocation speed issue, but an issue with how sysbench does timing paired with the relatively high-overhead "xen" clocksource.
EDIT: correction to the chart above: tsc is available in R4.1 domU PV (at least on this hardware).
Unsurprisingly, it looks like someone else ran into the same issue nine years ago using sysbench under the kvm-clock clocksource... https://blog.siphos.be/2013/04/comparing-performance-with-sysbench-part-3/
...nothing new under the sun.
After some additional googling, I also want to note that quite a few folks found that for certain workloads in AWS's XEN-based instances over the past 5-10 years (e.g. linux database performance tracking), particularly where timers are used heavily, switching from "xen" to "tsc", when available, had a material impact on performance.
Brendan
Ah. clocksources. An unmitigated set of disasters on native as well as under virt.
For Qubes, you're not migrating VMs, so it's safe to expose the Invariant TSC CPU feature to all guests, which is almost certainly what is triggering the different default between dom0 and domU. Set itsc=1 in the VM config file.
Thanks! That's Xen 4.14; it's nomigrate=1 there. Now it's just a matter of setting it via libvirt...
Urgh. I've been trying to kill nomigrate, and it has a habit of segfaulting libvirt for reasons we never got to the bottom of.
It might be easier to pass cpuid="host:invtsc=1"
May I ask a couple of questions about this?
1. Does the selection of clocksource (xen or tsc) affect only the accuracy of timing, or does it affect actual performance when running sysbench?
2. Will normal users need to manually configure the clocksource after the update, or will the update set the clocksource of all domUs to tsc by default?
@logoerthiner1
1. Accuracy is not affected, as far as I'm aware.
2. No manual configuration required, since Qubes dynamically creates the libvirt XML for every VM from a template XML file. See the above commit for more info.
Accuracy is fine. The performance difference is between gettimeofday() completing in the vDSO without a system call, vs needing a system call.
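One way to observe this from userspace (a sketch; needs strace and python3 inside the guest) is to count how many clock_gettime calls actually reach the kernel. When the vDSO fast path works (tsc), the strace summary shows few or none; when the clocksource forces real syscalls (xen), every read shows up:

```shell
# Count real clock_gettime system calls made by a tight timing loop.
# -c prints a syscall summary instead of each call.
strace -f -c -e trace=clock_gettime \
    python3 -c 'import time
for _ in range(100000):
    time.clock_gettime(time.CLOCK_MONOTONIC)'
```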
It might be easier to pass cpuid="host:invtsc=1"
I tried, looks to be ignored (https://github.com/xen-project/xen/blob/stable-4.14/xen/arch/x86/cpuid.c#L661-L667)
It might be easier to pass cpuid="host:invtsc=1"
I tried, looks to be ignored (https://github.com/xen-project/xen/blob/stable-4.14/xen/arch/x86/cpuid.c#L661-L667)
:disappointed: This is bringing back the scars of trying to fix the mess. Begrudgingly, yes, use nomigrate on 4.14. You will have to change it when you move to a newer Xen.
Ok, I've set nomigrate and confirmed that guest sees INVTSC bit set. But Linux still chooses "xen" clocksource by default :/
I can't find any part of Linux that would use INVTSC to affect clocksource choice. All I see is "rating" - "tsc" has 400, "xen" has 500. And the highest available wins.
I can't find any part of Linux that would use INVTSC to affect clocksource choice. All I see is "rating" - "tsc" has 400, "xen" has 500. And the highest available wins.
Ad-hoc kernel patch time? (yes, yuck)
Nope, clocksource=tsc on the kernel cmdline.
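For a qube booted with a dom0-provided kernel, that could be done per-VM via the kernelopts property (a sketch; "personal" is a placeholder qube name, and this assumes the qvm-prefs property syntax of R4.x rather than the exact change Qubes ultimately shipped):

```shell
# Append clocksource=tsc to an existing qube's kernel options (dom0 shell).
qvm-prefs personal kernelopts "$(qvm-prefs personal kernelopts) clocksource=tsc"
```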
I stumbled on this painful thread from ~15 years ago w/r/t the xen vs tsc timers in VMs, granted from an era where it appears the tsc clocksource was only beginning to be reliable across cores.
https://sourceforge.net/p/kvm/mailman/kvm-devel/thread/47267832.1060003%40zytor.com/?page=0
My empathy for @andyhhp (well, all involved in Xen and the Xen<->Linux border) just increased 10-fold.
B
@marmarek - I see you're applying the clocksource=tsc change to stubdoms as well. I'm curious about what you found in testing?
B
That's mostly in the hope of improving audio quality (pulseaudio reads the clock very often) when using an emulated sound card for an HVM (with Windows, for example).
That's mostly in the hope of improving audio quality (pulseaudio reads the clock very often) when using an emulated sound card for an HVM (with Windows, for example).
Ah, yeah, that'd be a nice unexpected win.
B
Automated announcement from builder-github
The component linux-kernel-latest (including package kernel-latest-5.17.4-2.fc25.qubes) has been pushed to the r4.0 testing repository for dom0.
To test this update, please install it with the following command:
sudo qubes-dom0-update --enablerepo=qubes-dom0-current-testing
Automated announcement from builder-github
The component linux-kernel-latest (including package kernel-latest-5.17.4-2.fc32.qubes) has been pushed to the r4.1 testing repository for dom0.
To test this update, please install it with the following command:
sudo qubes-dom0-update --enablerepo=qubes-dom0-current-testing
Automated announcement from builder-github
The component linux-kernel (including package kernel-5.10.112-1.fc32.qubes) has been pushed to the r4.1 testing repository for dom0.
To test this update, please install it with the following command:
sudo qubes-dom0-update --enablerepo=qubes-dom0-current-testing
That's mostly in the hope of improving audio quality (pulseaudio reads the clock very often) when using an emulated sound card for an HVM (with Windows, for example).
Ah, yeah, that'd be a nice unexpected win.
Hmm, Windows audio is still crackly after this morning's kernel updates, which seem to have moved Qubes template-based VMs to current_clocksource=tsc.
Ah wait... the stub domain change isn't yet pushed to dom0; I still see current_clocksource=xen in the stubdomain for Windows.
B
In my environment the sound became perfect :)
I was mainly hoping it would help the USB webcam, but alas, no real progress there on Win10.
Automated announcement from builder-github
The component vmm-xen-stubdom-linux (including package xen-hvm-stubdom-linux-1.2.4-1.fc32) has been pushed to the r4.1 testing repository for dom0.
To test this update, please install it with the following command:
sudo qubes-dom0-update --enablerepo=qubes-dom0-current-testing
Hmm. R4.1 Win 10 audio is still crackly after the just-released stubdomain update as well, and in the stubdomain I see current_clocksource=tsc.
It gets less crackly (but not perfect) if I busybox renice -n -30 $pulseaudio_pid in the stubdomain.
I haven't rebuilt QWT since the December 2021 tabit-pro repo content, so perhaps I need to do so for improved audio as well?
B
I've just installed kernel-5.10.112-1.fc32.qubes and xen-hvm-stubdom-linux-1.2.4-1.fc32. This significantly improves the results of synthetic benchmarks like sysbench memory run, but it doesn't fix the performance problems we are facing:
make defconfig for x86_64:

Environment | Minutes
---|---
Fedora 35 (native) | 6:42
Fedora 35 (Qubes AppVM) | 13:09
Please reopen this ticket.
Qubes OS release
R4.1
Brief summary
The Xen hypervisor has performance problems on certain compute-intensive workloads
Steps to reproduce
See @fepitre for details
Expected behavior
Same (or almost the same) performance as on bare hardware
Actual behavior
Worse performance than on bare hardware