rust-osdev / x86_64

Library to program x86_64 hardware.
https://docs.rs/x86_64
Apache License 2.0
797 stars 132 forks source link

fix(interrupts): replace compiler fences with potentially-synchronizing assembly #440

Closed mkroening closed 1 year ago

mkroening commented 1 year ago

Follow up on https://github.com/rust-osdev/x86_64/pull/436.

I am sorry to open this discussion again, but the previous PR is not sufficient to prevent unexpected behavior.

Compiler fences only synchronize with atomic instructions. When creating a thread-local critical section, we need to prevent reordering of any reads and writes across interrupt toggles, not just atomic ones. To achieve this, we omit nomem from asm!. Since then, the assembly might potentially perform synchronizing operations such as acquiring or releasing a lock, the compiler won't move any reads and writes through these assembly blocks.

See https://github.com/mkroening/interrupt-ref-cell/issues/5#issuecomment-1753047784.

As an unrelated optimization, I also added preserves_flags here, since IF from EFLAGS must not be restored upon exiting the asm block when specifying perserves_flags: Rules for inline assembly—Inline assembly—The Rust Reference

RalfJung commented 1 year ago

I trust @RalfJung that this will do the right thing w.r.t. compiler memory ordering (although I do wish we just had a way to say "volatile" like in C).

Not sure what you mean? We have read_volatile/write_volatile to mirror C's volatile variables? But they don't seem involved here?

Generally I'm not an expert on the concrete implementation of inline assembly, I just know how to deal with it in the abstract, in theory. But I think https://github.com/rust-lang/reference/pull/1413 documents what you need?

josephlr commented 1 year ago

I trust @RalfJung that this will do the right thing w.r.t. compiler memory ordering (although I do wish we just had a way to say "volatile" like in C).

Not sure what you mean? We have read_volatile/write_volatile to mirror C's volatile variables? But they don't seem involved here?

Oh I was referring to GCC's asm volatile which I thought worked similarly to other volatile operations. But after reading the Rust and GCC documentation, it turns out that asm volatile just disables a few optimizations, the same disabled by omitting pure, readonly, nomem in Rust. So this looks good to me.

Thanks for the pointers.

RalfJung commented 1 year ago

Oh you meant asm-block volatile. Yeah I think here Rust chose the opposite default from C: C by default assumes the asm block is pure (or something like that), but Rust by default makes no assumptions and requires the programmer to opt-in to every promise they make. So, Rust picked a safer and more conservative default.

mkroening commented 1 year ago

I adjusted the formatting. 👍

Amanieu commented 1 year ago

The previous implementation using compiler fences was already correct (modulo a small bug: the fence should have been after the cli, not before). The fences will correctly block any problematic movements, as long as you properly use Atomic* types when synchronizing between interrupt and non-interrupt contexts.

mkroening commented 1 year ago

as long as you properly use Atomic* types when synchronizing between interrupt and non-interrupt contexts

That only synchronizes with atomic operations, though. This PR specifically aims to prevent any reordering through these assembly blocks.

Amanieu commented 1 year ago

The sti/cli inline assembly block is itself an "atomic operation" on the interrupt flag, which the compiler fence would apply to.

mkroening commented 1 year ago

But the compiler does not analyze what's inside the asm block, right? If so, we'd need to remove nomem regardless, since nomem implies non-atomicity of the assembly code, does it not?

And even if both compiler_fence and asm synchronize atomically, how would that prevent non-atomic writes to be moved through both (see https://github.com/mkroening/interrupt-ref-cell/issues/5#issuecomment-1752884625)?

RalfJung commented 1 year ago

Yeah I don't see how a nomem block can possibly do any synchronization, that's exactly what https://github.com/rust-lang/reference/pull/1413 is about. The fence needs an atomic read-write pair to do anything, a nomem block cannot contains reads nor writes.