Finish implementing TLB shootdown

dbittman commented 8 months ago

The previous TLB shootdown implementation was not fully hooked up and was using the generic IPI mechanism to signal other cores. This is, in general, not safe, nor is it particularly efficient. This PR does the following things:

Spin-wait

Implements a spin-wait primitive for the kernel that provides a general "wait until the condition callback returns Some, and call a separate pause callback between attempts" mechanism. The purpose is to allow spin-waiting (e.g. in a spinlock implementation) without blocking certain "critical" work, such as TLB invalidations. Essentially, the design of the interrupt system and TLB invalidation mechanism requires that we always try to handle invalidations, even when spin-waiting. There's not much getting around something like this, because in principle a shootdown could be triggered by any memory allocation, thus possibly causing a deadlock of the form:

CPU A holds the lock for the allocator
CPU B is waiting on the lock for the allocator
CPU A triggers a TLB shootdown

If the allocator lock is a spinlock, this is a deadlock unless CPU B polls for TLB invalidations while waiting for the lock. If the lock is a mutex, there's no problem! But, in general, without a deep audit to ensure that a TLB shootdown will never be triggered by a CPU that holds a spinlock, we have to have a mechanism like this to prevent deadlock. And, honestly, even if we did that audit, I'd still lean towards this as it's cleaner and safer.
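For illustration, here is a minimal user-space sketch of what such a primitive could look like. The names `spin_wait_until` and `ToySpinLock` are hypothetical and are not taken from the PR; the real kernel API may differ. The point is only that the waiter runs a second callback on every failed attempt, which is where pending TLB invalidations would be polled.

```rust
use std::hint;
use std::sync::atomic::{AtomicBool, Ordering};

/// Spin until `until` returns `Some(value)`. Between attempts, run `pause`,
/// which is where critical work (for example, draining this CPU's pending
/// TLB invalidations) would be done. Hypothetical signature.
pub fn spin_wait_until<R>(
    mut until: impl FnMut() -> Option<R>,
    mut pause: impl FnMut(),
) -> R {
    loop {
        if let Some(value) = until() {
            return value;
        }
        pause();
        hint::spin_loop();
    }
}

/// A toy spinlock built on the primitive above (illustrative only).
pub struct ToySpinLock {
    locked: AtomicBool,
}

impl ToySpinLock {
    pub const fn new() -> Self {
        Self { locked: AtomicBool::new(false) }
    }

    /// Acquire the lock, calling `poll_critical` after every failed attempt.
    pub fn lock(&self, poll_critical: impl FnMut()) {
        spin_wait_until(
            || {
                self.locked
                    .compare_exchange_weak(false, true, Ordering::Acquire, Ordering::Relaxed)
                    .ok()
                    .map(|_| ())
            },
            poll_critical,
        )
    }

    pub fn unlock(&self) {
        self.locked.store(false, Ordering::Release);
    }
}
```

With this shape, CPU B in the scenario above would pass its "handle pending invalidations" routine as `poll_critical`, so CPU A's shootdown can complete and the allocator lock is eventually released.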

TLB invalidation command distribution

The code now distributes invalidation commands to other CPUs via an arch-specific mechanism. Each CPU maintains a fixed, small array of pending invalidations. When a CPU wants to enqueue an invalidation on another CPU's queue, it tries to place it in the queue using three strategies:

Try to put it in an empty slot
If no empty slots, try to merge it with an existing command that has the same target root page tables
Otherwise, merge it into the first slot (why first? see the code!)

Merge? Yes -- there's a least upper bound merge between any two invalidation commands (see the code). Consider that in the worst case, we can always just promote the existing invalidation to a full, global invalidation.
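The three strategies and the merge can be sketched roughly as follows. This is a simplified, single-threaded model under stated assumptions: the `InvCmd` shape (an address range under one root, or a global flush), the `SLOTS` size, and the `InvQueue` type are all hypothetical, and the real queue has to be safe for concurrent access from other CPUs. The sketch also does not explain why the first slot is the fallback; it just follows the rule stated above.

```rust
pub const SLOTS: usize = 4; // hypothetical queue size

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum InvCmd {
    /// Invalidate [start, end) in the address space rooted at `root`.
    Range { root: u64, start: u64, end: u64 },
    /// Full, global invalidation: covers every other command.
    Global,
}

impl InvCmd {
    fn root(self) -> Option<u64> {
        match self {
            InvCmd::Range { root, .. } => Some(root),
            InvCmd::Global => None,
        }
    }

    /// Smallest command in this toy model that covers both inputs.
    pub fn merge(self, other: InvCmd) -> InvCmd {
        match (self, other) {
            (
                InvCmd::Range { root: r1, start: s1, end: e1 },
                InvCmd::Range { root: r2, start: s2, end: e2 },
            ) if r1 == r2 => InvCmd::Range { root: r1, start: s1.min(s2), end: e1.max(e2) },
            // Different roots, or either side already global: promote.
            _ => InvCmd::Global,
        }
    }
}

pub struct InvQueue {
    pub slots: [Option<InvCmd>; SLOTS],
}

impl InvQueue {
    pub const fn new() -> Self {
        Self { slots: [None; SLOTS] }
    }

    pub fn enqueue(&mut self, cmd: InvCmd) {
        // Strategy 1: use an empty slot if there is one.
        for slot in self.slots.iter_mut() {
            if slot.is_none() {
                *slot = Some(cmd);
                return;
            }
        }
        // Strategy 2: merge with a command that targets the same root.
        for slot in self.slots.iter_mut() {
            if let Some(existing) = slot {
                if existing.root() == cmd.root() {
                    *existing = (*existing).merge(cmd);
                    return;
                }
            }
        }
        // Strategy 3: merge into the first slot (every slot is full here).
        if let Some(first) = self.slots[0].as_mut() {
            *first = (*first).merge(cmd);
        }
    }
}
```

Because `merge` can always fall back to `Global`, `enqueue` never fails, at the cost of sometimes invalidating more than was strictly requested.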

TLB invalidation command processing

When a shootdown IPI is received, or we are polling for TLB invalidations, we iterate over the pending commands in our CPU's queue, executing them. For x86, this could mean executing a global, full invalidation, in which case any future invalidations don't matter, so we just flush them. When we are done with a command, the slot in the array is cleared to None, so the sender of the shootdown can know that the command has been handled.
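A rough sketch of that processing loop, assuming a hypothetical `Cmd` type and an `execute` callback standing in for the arch-specific flush. It is single-threaded for clarity; the real slots are shared with sender CPUs and would need atomic access.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Cmd {
    /// Invalidate [start, end). Hypothetical shape.
    Range { start: u64, end: u64 },
    /// Full, global invalidation.
    Global,
}

/// Drain this CPU's pending commands, calling `execute` for each one that
/// still needs to run. Once a global flush has run, later commands are
/// already covered, so they are cleared without being executed.
pub fn process_pending(slots: &mut [Option<Cmd>], mut execute: impl FnMut(Cmd)) {
    let mut flushed_all = false;
    for slot in slots.iter_mut() {
        if let Some(cmd) = *slot {
            if !flushed_all {
                execute(cmd);
                flushed_all = cmd == Cmd::Global;
            }
            // Clear only after the work is done, so a sender that sees
            // `None` can treat the command as handled.
            *slot = None;
        }
    }
}
```

The ordering inside the loop matters: the slot is reset after `execute` returns, never before, because the empty slot doubles as the acknowledgement.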

Waiting for ~Godot~ shootdown ack

We do have to wait for invalidations to complete, while holding the mappings lock. This is somewhat dubious as it could race with other IPIs, but we do poll for other shootdown commands while waiting. We also delay doing local invalidation command processing until after we have submitted the IPIs, as an optimization.
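To make the ordering concrete, here is a small user-space model of the sender side. It is a sketch under heavy assumptions: each CPU's queue is reduced to a single `pending` flag, the `Cpu` and `shootdown` names are hypothetical, IPIs and the mappings lock are not modeled, and concurrent senders targeting the same CPU are simply coalesced.

```rust
use std::hint;
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};

/// Toy stand-in for a CPU's invalidation queue: one pending flag plus a
/// counter of handled commands.
pub struct Cpu {
    pub pending: AtomicBool,
    pub handled: AtomicUsize,
}

impl Cpu {
    pub const fn new() -> Self {
        Self { pending: AtomicBool::new(false), handled: AtomicUsize::new(0) }
    }

    /// Handle a pending invalidation, if any, then clear the flag as the ack.
    pub fn poll(&self) {
        if self.pending.load(Ordering::Acquire) {
            // The actual TLB flush would happen here.
            self.handled.fetch_add(1, Ordering::Relaxed);
            self.pending.store(false, Ordering::Release);
        }
    }
}

/// Run a shootdown from CPU `me` across all `cpus`.
pub fn shootdown(me: usize, cpus: &[Cpu]) {
    // 1. Submit the command to every CPU, including ourselves.
    for cpu in cpus {
        cpu.pending.store(true, Ordering::Release);
    }
    // (The real code would send the shootdown IPIs at this point.)

    // 2. Only now do the local processing, so remote CPUs can start early.
    cpus[me].poll();

    // 3. Wait for every other CPU to ack, but keep polling our own queue so
    //    that a concurrent shootdown aimed at us cannot deadlock.
    for (i, cpu) in cpus.iter().enumerate() {
        if i == me {
            continue;
        }
        while cpu.pending.load(Ordering::Acquire) {
            cpus[me].poll();
            hint::spin_loop();
        }
    }
}
```

Step 3 is the same idea as the spin-wait primitive: the wait loop is allowed to spin, but never without servicing incoming invalidations.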

What about live-lock?

This is for a future PR.

dbittman commented 8 months ago

Oh -- @PandaZ3D let me know if (well, if this makes sense, first) you want me to take a stab at porting this to the arm code or if you'd like to take care of it.

dbittman commented 8 months ago

Oh, and I should mention, before I merge this I will complete an audit of the code for any existing spin-loops and switch them to use the new spin-wait mechanism, unless there's a good reason not to (which will be given a comment).

PandaZ3D commented 8 months ago

> Oh -- @PandaZ3D let me know if (well, if this makes sense, first) you want me to take a stab at porting this to the arm code or if you'd like to take care of it.

You can take a stab at it, however, the ARM version of the kernel is lacking IPI (SGI in ARM-speak) support, or SMP boot for that matter. I think we should implement IPI/SMP support in a later PR since this requires writing some driver code.

dbittman commented 8 months ago

> > Oh -- @PandaZ3D let me know if (well, if this makes sense, first) you want me to take a stab at porting this to the arm code or if you'd like to take care of it.
>
> You can take a stab at it, however, the ARM version of the kernel is lacking IPI (SGI in ARM-speak) support, or SMP boot for that matter. I think we should implement IPI/SMP support in a later PR since this requires writing some driver code.

Okay! Can you open an issue to track the tasks we'll need for SMP on ARM? I'll leave ARM support out of this PR for now.

twizzler-operating-system / twizzler