Closed BeataZdunczyk closed 1 year ago
Initial work happens on https://github.com/TrenchBoot/xen/tree/smp_cleanup.
Currently assembly part already works in parallel, but it may still require some minor changes around hand-off to C. e.g. Xen CPU number may be passed as argument to start_secondary
instead of depending on static variable.
On C side, some of the code (cpu_up
in particular) is common for x86 and Arm, which makes it impossible to change the flow of one of these architectures without touching the other. As Arm isn't the target of AEM project, I think it will be best to stick to changes to code specific to x86.
cpu_up
in the current form is also used for hot-plugging CPUs on server platforms, a feature that must be preserved. However, the initial INIT-SIPI-SIPI sequence can be broadcast by BSP to all-but-self. For platforms with many cores/threads this can significantly reduce the number of inter-processor calls and delays. Remaining initialization steps (from CPU_STATE_CALLOUT
onward) would still be executed one by one to keep the logic implemented by cpu_up
- while it could be possible to make those parallel as well, this would add a lot of duplicated code which is hard to maintain.
With parallel execution some timeouts will have to be rethought, most notably APs waiting for CPU_STATE_CALLOUT
.
Opened PR. I also planned to add comparisons of boot times with and without those changes, but we're experiencing some infrastructure issues and I couldn't connect to most interesting platforms.
As said before, there is a bit of code common to x86 and Arm. Between breaking that code into smaller parts or calling notifiers for CPU_UP_PREPARE
twice (which in some cases would allocate memory twice, or overwrite structures that could potentially be in use) I decided to let most of the initialization not be parallel. APs start polling cpu_state
basically as soon as they enter C world.
This is list of functions/macros called from start_secondary()
with their dependencies on notifiers that I managed to find, up to smp_callin()
:
set_current(idle_vcpu[cpu]); // common/sched/core.c:cpu_schedule_up(), which in turn needs
// arch/x86/percpu.c:init_percpu_area() and
// common/timer.c:cpu_callback() - should be no-op after BSP calls it
this_cpu(curr_vcpu) = idle_vcpu[cpu]; // arch/x86/percpu.c:init_percpu_area(), idle_vcpu populated by above
rdmsrl(MSR_EFER, this_cpu(efer)); // percpu
load_system_tables(); // percpu + arch/x86/smpboot.c:cpu_smpboot_alloc()
percpu_traps_init(); // percpu
percpu
is already sufficiently protected, it won't do anything when called multiple times. smpboot
is slightly more complicated, it would reallocate memory for per_cpu(compat_gdt, cpu)
and per_cpu(root_pgt, cpu)
. cpu_schedule_up
also allocates resources without checking if they already exist. Given enough time, all of those probably could be resolved, which may allow for better parallelization. However, most of those callbacks are declared static, in order to use them outside of their respective compilation units they would have to be made globally visible, which maybe isn't great idea. They can't be executed at the time callbacks are registered, that would break the order in which they must get executed.
This also seems to be working according to logs linked in https://github.com/TrenchBoot/trenchboot-issues/issues/11#issuecomment-1734465464. Brought up 4 CPUs
printed by Xen and smpboot: Allowing 4 CPUs, 0 hotplug CPUs
printed by kernel prove that all CPU cores are up and running.
Testing on OptiPlex with older CPU (IvyBridge, 3rd generation Intel Core) also doesn't show any regression.
@krystian-hebel IIUC, do you test Dasharo (coreboot+SeaBIOS) ? It would be great to know what the test target is here. Most likely as a part of the marketing effort, I would like to try that on my own hardware.
Is your feature request related to a problem? Please describe.
Currently, the CPU cores are being woken up in parallel for DRTM launch in Qubes OS AEM, but later they are hacked to be waiting in a queue. If any interrupt would come at that time, it could be a serious danger. This needs to be fixed as soon as possible as required by Intel TXT specification.
Is your feature request related to a new idea or technology that would benefit the project? Please describe.
This task is required to ensure the proper functioning of parallel CPU cores bring-up for DRTM launch in Qubes OS AEM as per Intel TXT specification.
Describe the solution you'd like
Implement parallel CPU cores bring-up for DRTM launch in Qubes OS AEM in compliance with the Intel TXT specification.
Describe alternatives you've considered
N/A
Additional context
This feature request is part of Phase 2 in TrenchBoot as Anti Evil Maid project, as outlined in the documentation: https://docs.dasharo.com/projects/trenchboot-aem-v2/.
Relevant documentation you've consulted
N/A