rust-lang / rust

Empowering everyone to build reliable and efficient software.
https://www.rust-lang.org
Other
96.81k stars 12.5k forks source link

Tracking issue for #[cfg(target_has_atomic = ...)] #32976

Closed alexcrichton closed 2 years ago

alexcrichton commented 8 years ago

Tracking https://github.com/rust-lang/rfcs/pull/1543

sfackler commented 8 years ago

This is still blocked on some reasonable way to talk about platform-specificity in a more granular way than what we currently support.

Manishearth commented 7 years ago

Is there any reason why AtomicU32 would be hard to stabilize? We need a way to store multi-word bitflags in an atomic thingy. AtomicUsize won't work here since it's too large on 64 bit.

sfackler commented 7 years ago

What do you mean by "too large"?

Manishearth commented 7 years ago

Let's say I have a 64-bit bucket of flags. I need two atomic u32s to represent it. I can't use two usizes because that's too large on 64 bit.

Alternatively, let's say I have a 32-bit bucket of flags. This is on a struct where size matters. I need a u32, not something larger on 64-bit.

I guess I could cfg() it and abstract over it. That solves the first problem but not the second.

nagisa commented 7 years ago

@Manishearth 16-bit-usize targets would give you 16 bit atomic width. 100% savings over 32bit systems! (and potential loss of data)

There isn’t really much problem stabilising any of these IMO, but you’ll still have to CFG on target_has_atomic="32" in your code.

Manishearth commented 7 years ago

16-bit-usize targets would give you 16 bit atomic width.

ugh, these exist, right.

What do I have to do to get these stabilized? I'm okay with the cfgs.

I'm also okay with just u8 being stabilized, as long as I can pack it (not sure if this is possible with atomics)

Manishearth commented 7 years ago

What do I have to do to get these stabilized?

Any answer on this? cc @alexcrichton

The rfc looks like it's been completely implemented. IMO stabilizing it with target_has_atomic is fine, and we can add further granularity if we need in new rfcs.

alexcrichton commented 7 years ago

@Manishearth state hasn't changed from before. The libs team is essentially waiting for a conclusion on the scenarios story before stabilizing.

Manishearth commented 7 years ago

That discussion has stalled. Last I checked it seemed to mostly have come to consensus on the general idea? Who are we waiting on for a conclusion here? What would the ETA on this be? How can I help?

It also seems like that's mostly orthogonal to this. We can stabilize this using cfgs, and add scenarios later. They seem to be pretty compatible.

Manishearth commented 7 years ago

(After discussing in IRC, it seems like there have been a lot of recent discussions about this, and the libs team is currently moving on pushing this forward.)

archshift commented 7 years ago

Has there been any progress on this since?

bholley commented 7 years ago

Just FYI: Per [1], Gecko's Quantum CSS project is no longer blocked on this issue.

[1] https://bugzilla.mozilla.org/show_bug.cgi?id=1336646

nagisa commented 7 years ago

We can stabilise AtomicU8 and AtomicI8 the same way we have stabilised AtomicBool – by using Atomic{I/U}Size operations everywhere.

main-- commented 7 years ago

FWIW, my integer-atomics crate is a stopgap solution for everyone who is stuck on stable for now. However, it emulates using AtomicUsize cmpxchg-loops, so performance is bad compared to genuine atomic operations.

Amanieu commented 6 years ago

Is there still a good reason preventing this from being stabilized?

sfackler commented 6 years ago

Portability is still the concern I believe.

gnzlbg commented 6 years ago

What C++ does here is provide the atomics on a best-effort basis, and those that aren't supported by the target are emulated using a mutex or similar.

The alternative is to just provide the atomic types in the architectures in which they are actually available. I prefer this approach since it allows third-party crates to provide a portable solution in stable, but doesn't require us to do so in core.

It also seems that progress on scenarios is dead?

scottmcm commented 6 years ago

I think scenarios became https://github.com/rust-lang/rfcs/pull/1868, so these would be #[cfg]'d but they'd be part of the mainstream configs so most people could use them without getting warnings or doing anything special?

gnzlbg commented 6 years ago

Is anyone working on finishing the implementation of this RFC? If not, I would like to give it a try and would be looking for a mentor.

Amanieu commented 6 years ago

The atomic types are currently already guarded by cfg(target_has_atomic) so there shouldn't be any trouble integrating this with a portability lint in the future.

gnzlbg commented 6 years ago

I was mostly interested on the parts of the original RFC that are not implemented yet, like the 128bit atomic types. I wanted to have an atomic pair of usizes, and can't really do so portably without those types.

Amanieu commented 6 years ago

See #39590 and #38959

tbu- commented 6 years ago

I think the ATOMIC_*_INIT constants of the not-yet stabilized atomics should be removed before stabilization, they're a crutch from a time without stable const functions.

ghost commented 5 years ago

Can we get an update, what are the current blockers for stabilization?

spacejam commented 5 years ago

This is something I'm quite eager for, and if there's any work that could be shared, I would be happy to take some on to get this out the door.

alexcrichton commented 5 years ago

AFAIK the main blocker here for stabilization is that these are not platform-agnostic types. Almost everything in the standard library is available on all platforms in one way or another, and these would be the first additions to libstd in a non-platform-specific location that aren't available on some "upper tier" platforms.

We in the standard library do not have a story of how to provide access these types while maintaining the platform portability guarantees of the standard library. The "portability lint" was supposed to unblock this in theory. Effort has stalled out on that and this, however.

I believe there aren't any technical issues blocking this, only issues around how we expose these APIs in libstd and where we expose them.

gnzlbg commented 5 years ago

AFAIK the main blocker here for stabilization is that these are not platform-agnostic types. Almost everything in the standard library is available on all platforms in one way or another, and these would be the first additions to libstd in a non-platform-specific location that aren't available on some "upper tier" platforms.

Could you elaborate about what exactly is the problem here? We already have the unsigned atomic integer types, and AFACT other types of the RFC can be implemented for all platforms as a "thin" wrapper over those, e.g., the AtomicIX can be implemented on top of the AtomicUX types, the AtomicBool type on top of AtomicU8, and the pointer types on top of AtomicUsize.

sfackler commented 5 years ago

The AtomicUX types are not stable.

alexcrichton commented 5 years ago

@gnzlbg currently the existence of AtomicU8 is a promise that the platform actually has instructions which operate on just one byte as opposed to implementing some form of emulation and/or fallback in libstd. In that sense not all architectures have all the types (notably AtomicU64 is lacking on a few).

gnzlbg commented 5 years ago

maybe we could move some of these to std::arch ?

ollie27 commented 5 years ago

AtomicBool is already stable so we already promise we can support one byte atomic operations so there's no reason we can't stabilise AtomicU8 and AtomicI8 now. 16 bit atomics could be stabilised as well because AtomicUsize is stable and usize is at least 16 bits. I would even argue that we can stabilise 32 bit atomics because it's only 16 bit platforms that might not support them and they probably have bigger portability concerns than whether or not AtomicU32 and AtomicI32 are available.

alexcrichton commented 5 years ago

@gnzlbg it's true! That didn't exist when this issue was created, but it seems like a reasonable-ish place to me to put these types if literally the only thing blocking them is the portability part (which I think is the state of play right now)

@ollie27 the part about AtomicBool could be considered an accidental regression from #33579 because target_has_atomic = "ptr" isn't necessarily guaranteed to be the same as target_has_atomic = "8", although it may be for most of our platforms we have today. AtomicBool specifically has been around since 1.0, so our hands are tied there but we can possibly make more proactive decisions about future types.

ollie27 commented 5 years ago

because target_has_atomic = "ptr" isn't necessarily guaranteed to be the same as target_has_atomic = "8"

Why wouldn't it be possible to implement AtomicBool or AtomicU8 in terms of AtomicUsize?

Amanieu commented 5 years ago

because target_has_atomic = "ptr" isn't necessarily guaranteed to be the same as target_has_atomic = "8",

A smaller atomic operation can always be synthesized from a larger one using a compare-exchange loop: modify a subset of the larger word, and use compare-exchange to attempt to commit the results, loop until it succeeds.

gnzlbg commented 5 years ago

@Amanieu I wonder how that works for an [AtomicU8; N]. In a target that only has 32-bit atomic operations, can the AtomicU8 be 8 bits wide or do they have to be larger for that to work? If they are 8 bits wide, how does this work ? I mean, if you are applying 32-bit wide atomic operaitons on the u8s, you would be doing operation on neighborging bytes, might end up reading out-of-bounds, etc. or how is this avoided?

ghost commented 5 years ago

@Amanieu Sounds like https://docs.rs/integer-atomics?

The trick seems wildly unsound at first sight (because we're touching memory falling outside the small atomic value), but... is actually not?

Amanieu commented 5 years ago

@stjepang Yes, that's exactly what I am talking about.

@gnzlbg This is "safe" in practice since a) the 32-bit atomic operation is 4-byte aligned and therefore can't cross a page boundary, and b) the compare-exchange loop guarantees that the neighboring bytes are never modified by the operation: if they were then the compare-exchange would fail and the loop will retry with the new value for the neighboring bytes.

RalfJung commented 5 years ago

@Amanieu That analysis ignores the compiler, and LLVM alias analysis heavily relies on assumptions like getelementptr inbounds and "all accesses are inbounds". Feel free to do this in inline assembly, but if it goes through the optimizer I'd say that's extremely risky.

gnzlbg commented 5 years ago

I'd say that's extremely risky.

It's not risky, it is UB, this code should fail to run on miri (see https://github.com/rust-rfcs/unsafe-code-guidelines/issues/2).

Amanieu commented 5 years ago

@RalfJung This transformation is actually done inside LLVM inside in the backend passes where it lowers atomic intrinsics to arch-specific opcodes.

gnzlbg commented 5 years ago

@Amanieu I think what @RalfJung means is that the optimization passes that run before the backend are allowed to assume that reads/writes out-of-bounds never happen in the input IR and can therefore optimize under this assumption.

Just because generating machine code that reads/writes out-of-bounds is ok for some targets does not imply that writing Rust code that does (or passing LLVM IR that does) is also ok.

Reads out-of-bounds are undefined behavior in Rust, C, C++, and AFAIK LLVM IR as well, so this is very unlikely to ever be ok (but see the corresponding issue: https://github.com/rust-rfcs/unsafe-code-guidelines/issues/2). Whether it is ok to do so via inline assembly, I don't know, but as long as you clobber everything that the inline assembly modifies (including the memory out-of-bounds), that would probably be ok.

Amanieu commented 5 years ago

But anyways, my point is that any platform that supports one size for atomic operations will automatically support all smaller sizes (even if we need to write custom implementations in inline asm). The only limit is that maximum size that a platform can atomically CAS.

RalfJung commented 5 years ago

This transformation is actually done inside LLVM inside in the backend passes where it lowers atomic intrinsics to arch-specific opcodes.

That changes nothing. Backend passes have different rules for UB, and they do not do the aggressive alias analysis that is performed during the main optimization phase.

Another way to say this is that backend passes operate on a different language, even if it shares the syntax with "main" LLVM IR. Just because something is allowed in assembly or "low-level LLVM IR" doesn't mean it is allowed in "normal LLVM IR", and the rules of the latter are relevant for Rust. Not doing this properly will lead to miscompilations.

alexcrichton commented 5 years ago

@ollie27 @Amanieu yes it's definitely possible (as the discussion has found) to implement atomics this way. The libs team has historically, however, decided to define the existence of AtomicU8 as "there are native instructions for this". Emulation layers are left for crates.io (like integer-atomics).

As to whether such a strategy is even sound, I'll leave that to others!

Amanieu commented 5 years ago

@alexcrichton I remember the libs team decision being that standard library atomic types would be guaranteed to be lock-free, rather than simply "there are native instructions for this". The lock-free property is important since it guarantees that atomic operations can be safely used to communicate with a signal/interrupt handler.

A non-lock-free implementation of atomics would use a spinlock or mutex to emulate atomicity. However the algorithm that I described above (CAS loop which modifies a subset of a word) is lock-free since it does not block while waiting for other threads. Effectively it just uses a slightly longer instruction sequence than you might expect.

It is a bit late to back down on this decision since we already guaranteed the availability of atomics on older ARM architectures (pre-ARMv6) using this exact algorithm: https://github.com/rust-lang-nursery/compiler-builtins/blob/master/src/arm_linux.rs

alexcrichton commented 5 years ago

Hm I may be misremembering the libs team decision! I know for sure we want to guarantee lock-free, but I would personally like to also guarantee hardware/platform support. The pre-ARMv6 strategy you pointed out there was accepted because it's linux-specific, and we may have been overly zealous to accept non-u32 aligned ones there. AFAIK it's not stable other than AtomicBool, so I think we can still end up dropping most of those (and maybe on that one platform switch AtomicBool to word-sized.

I'd prefer to ideally not try to corner ourselves into a decision with historical debt, but rather figure out where we want to go and then work backwards from there. I would personally ideally only like to expose AtomicUXX types for a platform if they're actually supported natively one way or another (either via hardware instructions or the OS level support like pre-ARMv6). We can always add more after that and I think it covers the vast majority of use cases. In the meantime integer-atomics can fill any necessary gaps.

Amanieu commented 5 years ago

I guess this is a difference of perspective: I consider a CAS loop to be "using native hardware instructions" since it uses the native atomic CAS instruction, just with a larger word size. In fact many atomic operations are lowered this way: for example on ARM, a fetch_add is lowered into a ldrex / strex loop which retries if the destination cache line has been modified by another thread.

Anyways, I feel that we are going off topic here. The main point of this issue is that we would like to have integer atomic types available on stable rust. There is a vague concern about the portability, but no concrete proposals to address this.

I personally feel that the current system of having exposing #[cfg(target_has_atomic)] to users is sufficient to address this portability concern. If a crate doesn't compile because AtomicU64 is missing on some platform, the cause should be obvious. We could even (later) add a better error message when attempting to import a nonexistent AtomicU64.

Placing integer atomic types in std::arch is strictly worse than what we have now. These types belong in std::sync::atomic alongside AtomicUsize and AtomicBool.

sfackler commented 5 years ago

IIRC our primary concern was not wanting to end up with AtomicU64 implemented via Mutex<u64> on platforms without 64 bit atomics.

sfackler commented 5 years ago

I personally feel that the current system of having exposing #[cfg(target_has_atomic)] to users is sufficient to address this portability concern.

The objective of the portability lint is to help people be aware upfront of the portability of the code they're writing. With only #[cfg(target_has_atomic)], a crate author has to proactively cfg off their APIs and probably test against targets without those atomics to avoid inevitably breaking the less common case.

alexcrichton commented 5 years ago

@Amanieu the historical portability precedent set by libstd is that all modules are available everywhere except those explicitly whitelisted as "may require some #[cfg] trickery". For example std::os is explicitly defined as "this may require some cfgs", and so is std::arch. In that sense the current instantiation of atomic integer types doesn't fit into this story becaues they're not in a module clearly marked "may require cfgs".

The concrete proposal was to move all these types to the arch modules. I'm realizing now though that doesn't work on arm because target features are required to enable atomic types, not necessarily an architecture-specific type. (Same with WebAssembly and x86, atomic types (or some at least) are only available with certain target features enabled).

This is the main blocker today, these types violate our existing portability story. The solution to this was the venerable portability lint, but that's still quite aways off. If we want to short-circuit the portability lint then we need to develop a stabilization strategy that works in tandem with std's portability policy.