rust-lang / rust

Empowering everyone to build reliable and efficient software.
https://www.rust-lang.org
Other
96.86k stars 12.51k forks source link

Tracking issue for `thread_local` stabilization #29594

Open aturon opened 8 years ago

aturon commented 8 years ago

The #[thread_local] attribute is currently feature-gated. This issue tracks its stabilization.

Known problems:

aturon commented 8 years ago

@alexcrichton Can you elaborate on the blockers?

alexcrichton commented 8 years ago

Certainly! Known issues to me:

That's at least what I can think of at this time!

alexcrichton commented 8 years ago

A note, we've since implemented cfg(target_thread_local) which is in turn itself feature gated, but this may ease the "this isn't implemented on all platforms" worry.

alexandermerritt commented 8 years ago

Hi! Is there any update on the status? Nightly still requires statics to be Sync. I tried with:

rustc 1.13.0-nightly (acd3f796d 2016-08-28)
binary: rustc
commit-hash: acd3f796d26e9295db1eba1ef16e0d4cc3b96dd5
commit-date: 2016-08-28
host: x86_64-unknown-linux-gnu
release: 1.13.0-nightly
mneumann commented 7 years ago

@alexcrichton Any news on #[thread_local] becoming stabilized? AFAIK, at the moment it is impossible on DragonFly to access errno variable from stable code, other than directly from libstd. This blocks crates like nix on DragonFly, which want to access errno as well, but libstd is not exposing it, and stable code is not allowed to use feature(thread_local).

alexcrichton commented 7 years ago

@mneumann no, no progress. I'd recommend a C shim for now.

mneumann commented 7 years ago

@alexcrichton thanks. I am doing a shim now https://github.com/mneumann/errno-dragonfly-rs.

real-felix commented 7 years ago

The optimizations are too aggressive ;)

See this code:

#![feature(thread_local)]

#[thread_local]
pub static FOO: [&str; 1] = [ "Hello" ];

fn change_foo(s: &'static str) {
    FOO[0] = s;
}

fn main() {
    println!("{}", FOO[0]);
    change_foo("Test");
    println!("{}", FOO[0]);
}

The compiler does not detect the side effect in change_foo and removes the call in release. The output is:

Hello
Hello
alexcrichton commented 7 years ago

cc @eddyb, @Boiethios your example shouldn't actually compile because it should require static mut, not just static

real-felix commented 7 years ago

It compiles with the last nightly rustc.

eddyb commented 7 years ago

Oh, drat, this is from my shortening of the lifetime, i.e https://github.com/rust-lang/rust/blob/dead08cb331343b84564628b139b657f93548320/src/librustc/middle/mem_categorization.rs#L657-L662 @nikomatsakis what should we do here? I want a static lvalue, with a non-'static lifetime.

cramertj commented 6 years ago

18001, #17954 and https://github.com/rust-lang/rust/issues/18712 were fixed by https://github.com/rust-lang/rust/pull/43746.

@alexcrichton @eddyb Do you know any other blockers, or is this ready for stabilization?

eddyb commented 6 years ago

There's some emulation clang does IIRC, that we might want to do ourselves, to support #[thread_local] everywhere.

And there's #47053 which results from my initial attempt to limit references to thread-local statics to the function they were created in.

alexcrichton commented 6 years ago

@cramertj I've personally been under the impression that we're holding out on stabilizing this for as long as possible. We've stabilized very few (AFAIK) platform-specific attributes like this and I at least personally haven't ever looked to hard into stabilizing this.

One blocker (in my mind at least) is what @eddyb mentioned where this is a "portable" attribute yet LLVM has a bunch of emulation on targets that don't actually support it (I think MinGW is an example). I don't believe we should allow the attribute on such targets, but we'd have to do a lot of investigation to figure out those targets.

Is there motivation for stabilizing this though? That'd certainly provide some good motivation for digging out any remaining issues and looking at it very closely. I'd just personally been under the impression that there's little motivation to stabilize this other than it'd be a "nice to have" in situations here and there.

Amanieu commented 6 years ago

I am using #[thread_local] in my code in a no_std context: I allocate space for the TLS segment and set up the architecture TLS register to point to it. Therefore I think that it is important to expose the #[thread_local] attribute so that it can at least be used by low-level code.

The only thing that I'm not too happy about is that Sync is no longer required for #[thread_local]: this makes it more difficult to write signal-safe code, since all communication between signal handlers and the main thread should be done through atomic types.

eddyb commented 6 years ago

@Amanieu Signal/interrupt-safe code has other requirements which are not satisfied by current Rust.

Amanieu commented 6 years ago

@eddyb Not really, all you need to do is treat the signal handler as a separate thread of execution. The only requirement you need to enforce safety is that objects which are accessed by both the main thread and the signal handler must be Sync.

Of course you can still cause deadlocks if the main thread and signal handler both try to grab the same lock, but that's not a safety issue.

dtolnay commented 5 years ago

@cramertj regarding remaining blockers, I would consider the behavior that @Boiethios reported above needs to be fixed before stabilization. I filed #54901 to follow up.

josephlr commented 5 years ago

Copying @gnzlbg's comment from https://github.com/rust-lang/libc/pull/1432

It appears that 1) thread_local! solves the problem for most people, and 2) the main uncertainty is that #[thread_local] isn't portable.

Maybe we could reduce the scope of an initial version of #[thread_local] to extern static declarations. If the target does not support #[thread_local], well then the extern static declaration is incorrect since there cannot be a definition anywhere, and using it would already be UB. We could add on top of this a "best effort" compilation error, e.g., if the compiler knows that the target doesn't support it.

I agree with the above, thread_local! and #[thread_local] for extern static unblocks virtually all use-cases, and would allow interfaces to errno using stable Rust on all platforms (see the linked libc issue for more context).

Right now there is no way (with stable Rust) to use C thread local variables via FFI without writing additional C glue code, see the errno-dragonfly crate for an example of such a (necessary) hack.

gnzlbg commented 5 years ago

cc @joshtriplett: allowing #[thread_local] on extern statics might be something for the agenda of the WG-FFI, since interfacing with errno is kind of an important part of the C FFI puzzle.

joshtriplett commented 5 years ago

@gnzlbg Thanks! Agreed, we definitely need thread-local-variable support.

cbiffle commented 4 years ago

Independent of the FFI need for extern static...

Like @Amanieu from last year, I'm using #[thread_local] in a no_std context (an RTOS, in my case), with the OS runtime handling management of the TLS pointer and memory. (They and I differ on one point, which is that I'm delighted that #[thread_local] lifts the Sync requirement for static. It seems good and right.)

#[thread_local] is currently the only unstable feature I have to rely on for program correctness. (I'm using a couple others for convenience, but I could lower them by hand if required. I cannot easily replicate the TLS link behavior by hand.)

I haven't dug into the compiler side of this, so this question may be naive, but is the has_elf_tls LLVM target feature not sufficient to gate this? We have a few other language features (if not attributes per se) that are gated by target support -- for example, my platform doesn't have AtomicU64. So it doesn't seem entirely without precedent.

tuxillo commented 4 years ago

@alexcrichton Any news on #[thread_local] becoming stabilized? AFAIK, at the moment it is impossible on DragonFly to access errno variable from stable code, other than directly from libstd. This blocks crates like nix on DragonFly, which want to access errno as well, but libstd is not exposing it, and stable code is not allowed to use feature(thread_local).

We now provide __errno_location() since this commit:

https://gitweb.dragonflybsd.org/dragonfly.git/commitdiff/60d311380ff2bf02a87700a0f3e6eb53e6034920

bstrie commented 4 years ago

The original issue suggests that the thread_local attribute in LLVM isn't supported across all platforms, but has platform support improved in the time since this issue was opened? Or do we expect that it will always remain nonportable? (Is there an LLVM tracking issue?)

nicbn commented 4 years ago

Hello, #[thread_local] can currently be applied to fields of a struct without any errors or warnings (it won't work oc):

#![feature(thread_local)]

use std::sync::{Arc, Mutex};

#[derive(Debug)]
struct Foo {
    #[thread_local]
    bar: &'static str,
}

fn main() {
    let foo = Arc::new(Mutex::new(Foo {
        bar: "bar",  
    }));

    dbg!(foo.lock().unwrap().bar);

    let foo2 = foo.clone();
    std::thread::spawn(move || {
        foo2.lock().unwrap().bar = "baz";
    }).join().unwrap();

    dbg!(foo.lock().unwrap().bar);
}

Outputs:

[src/main.rs:16] foo.lock().unwrap().bar = "bar"
[src/main.rs:23] foo.lock().unwrap().bar = "baz"
ghost commented 3 years ago

By looking at the issue description, #![feature(thread_local)] seems unsound, but it's not caught by the incomplete_features lint: https://play.rust-lang.org/?version=nightly&mode=debug&edition=2018&gist=0d95cf5246b5b41ddc6776527fdc89e5

RalfJung commented 3 years ago

Hm, that's fair... OTOH if we made incomplete_features fire we'd have to enable that lint in library/std, so we'd not easily notice when another unsound feature is enabled.

IMO it would be better to just make all accesses to these statics unsafe.

ghost commented 3 years ago

@RalfJung Well, I just read https://github.com/rust-lang/rust/pull/71435#issuecomment-618018581 after commenting and am wondering whether I should edit and hide my comment or not.

ghost commented 3 years ago

@RalfJung It seems that #49682 is the only soundness problem, so this may already have implementation strategy: banning the case in #49682. Ignore me if so. Sorry for bothering.

RalfJung commented 3 years ago

Well, I just read #71435 (comment) after commenting and am wondering whether I should edit and hide my comment or not.

Hm, I do not see the relation to be honest (also I was wrong when I posted that -- there is an implementation strategy).

https://github.com/rust-lang/rust/issues/54366 sounds like it might be a soundness issue if it also happens for #[thread_local] static (without mut).

ghost commented 3 years ago

@RalfJung I think I was confused by confusing comments (not only yours) and probably misread some of them. Sorry. Ignore what I said before.

ghost commented 3 years ago

Hm, that's fair... OTOH if we made incomplete_features fire we'd have to enable that lint in library/std, so we'd not easily notice when another unsound feature is enabled.

IMO it would be better to just make all accesses to these statics unsafe.

core, alloc and stdarch have already used #![allow(incomplete_features)]. Is std special or #![allow(incomplete_features)]s in them are planned to be removed?

Also, I think adding an attribute that suppresses incomplete_features only for specified features could solve the noticing issue generally (also in core, alloc and stdarch).

RalfJung commented 3 years ago

core, alloc and stdarch have already used #![allow(incomplete_features)]

I'd hope there's work happening to get rid of that. :/ I thought since the move to min_specialization, things were better. I guess this is mostly due to const-generics. At the very least there should be comments explaining why unsound features are enabled in the very foundation of Rust... but it seems not everyone agrees on sch a policy, seeing that this code was reviewed and landed. Oh well.

Also, I think adding an attribute that suppresses incomplete_features only for specified features could solve the noticing issue generally (also in core, alloc and stdarch).

That would help, yes.

Soveu commented 3 years ago

With relocation-model=static compiler uses fs "segment" to access thread local variable like expected, but with pic it uses __tls_get_addr() Shouldn't it use fs in both situation?

// relocation-model=static
example::f:
        mov     byte ptr fs:[example::FOO@TPOFF+3], 1
        mov     qword ptr fs:[example::BAR@TPOFF], 2
        ret
// relocation-model=pic
example::f:
        sub     rsp, 8
        lea     rdi, [rip + example::FOO@TLSLD]
        call    __tls_get_addr@PLT
        mov     byte ptr [rax + example::FOO@DTPOFF+3], 1
        mov     qword ptr [rax + example::BAR@DTPOFF], 2
        pop     rax
        ret

Link to godbolt

Chronos-Sk commented 3 years ago

As far as use cases go. All the bare-metal Arm targets seem to support #[thread_local] definitions. Not using #[thread_local] definitions leads to a lot of trouble because it appears there's no easy way, outside of compiler code, to figure out a type's memory layout in time to tell the linker what to do. It's possible to fully emulate TLS, of course, but that could be a lot of overhead for a bare-metal target.

I don't have a problem, personally, sticking to Nightly for this, though.

Edit: I would worry more about these soundness issues in my project, but accessing static muts is already unsafe, so there's a built-in caveat emptor. (Project docs: "Safety: Don't let this reference escape the thread via...")

josephlr commented 3 years ago

@Soveu the POSIX/ELF AMD64 ABI has 4 models for accessing Thread-Local Storage. See this doc on Thread-Local Storage models for more information.

Depending on how you end up linking the binary (and if you use LTO) Rust/LLVM might use one of the "Exec" models even if your relocation model is pic. Using the static relocation model just tells LLVM that the code will definitely be statically linked.

This example playground should be able to show all 4 models (if you switch pic to static)

Chronos-Sk commented 3 years ago

There's also the -Ztls-model Nightly rustc flag, if you want to force a model to be used: https://doc.rust-lang.org/unstable-book/compiler-flags/tls-model.html.

Soveu commented 3 years ago

Interesting, I thought TLS is basically a thread-local copy of .tdata and fs holds an offset between .tdata and the copy

Soveu commented 3 years ago

btw, is there a way to tell rustc to use gs for thread locals instead of fs?

josephlr commented 3 years ago

btw, is there a way to tell rustc to use gs for thread locals instead of fs?

I would also really like this (for Kernels and the like), but I don't think LLVM supports this (as it's not one of the 4 TLS models described above). You would have to get a 5th TLS model (maybe called kernel) added to LLVM to support this.

Soveu commented 3 years ago

I heard some rumors that Redox does that, but I can't find how

josephlr commented 3 years ago

I heard some rumors that Redox does that, but I can't find how

Looks like they modified their binutils: https://gitlab.redox-os.org/redox-os/binutils-gdb/-/merge_requests/5

Andy-Python-Programmer commented 3 years ago

I heard some rumors that Redox does that, but I can't find how

Looks like they modified their binutils: https://gitlab.redox-os.org/redox-os/binutils-gdb/-/merge_requests/5

Would have been better if they made this pull request to actual binutils then we all could use it.

Chronos-Sk commented 3 years ago

Okay, I think I found a way to make things work using just #[thread_local] declarations, instead of full definitions. Only requires some form of asm support. playground

The trick is figuring out how to link the extern declaration to a full definition without LLVM noticing and ignoring the original declaration. The .equiv asm directive seems to accomplish this with minimal voodoo magic.

marmeladema commented 3 years ago

Would it be possible to partially stabilize this for let's say x86_64 architecture in user space code and by also not supporting problematic types like generators etc?

joshtriplett commented 2 years ago

Marking as impl-incomplete since it may still have soundness issues, and may misbehave on platforms without TLS support.

eddyb commented 2 years ago

EDIT: original proposal here was unsound wrt scoped threads, see https://github.com/rust-lang/rust/issues/29594#issuecomment-1125466642

(click to open the original proposal) I'm not sure where this should go, and I don't have the time for a complete RFC, but "a lifetime shorter than `'static`" came up recently in the context of some `rustc` internal data structures (which are owned thread-locally, not coincidentally), so:
*(with apologies to any previous suggestions similar to this, that may have been dismissed in the past - I couldn't find anything in this very issue, at least)* If we added a `'thread` lifetime, like `'static` but *strictly* shorter: * borrowing a `#[thread_local]` (and even `thread_local!`) could produce a `&'thread T` reference * it should even be returnable from functions (as long as it doesn't leave the thread, see below) * to preserve lifetime parametericity, `&'thread T: Send + Sync` *has to* be true if `T: Sync` (just like with `&'a T` for any other `'a`), and soundness of `'thread` usage shouldn't require it * in fact, it should be entirely possible to pass such a `&'thread T` "down" to a scoped thread, just like how `f(&THREAD_LOCAL)` (with `#[thread_local]`) or `THREAD_LOCAL.with(|data| f(data))` (with `thread_local!`) work today * everything that requires `T: 'static` today would still not allow anything containing these new `'thread` lifetimes, and that includes: * potentially-detachable (i.e. non-scoped) threads * any kind of communication between non-scoped threads is also indirectly affected by that bound, i.e. `mpsc::Receiver: 'static` requires `T: 'static` * this also extends to thread pools running `async` tasks: a `Future` from `async fn`, that is required to be `'static`, cannot keep a `&'thread T` across a suspension point * `thread_local!`'s data type * essentially stating "no `&'thread T` can get from the normal thread execution, into `thread_local!` destructor execution" * a `thread_local!` destructor *could itself* obtain a `&'thread T` reference, but it would be limited in scope to that destructor's execution, since there would be no place available to stash it * for this to be sound, I believe `#[thread_local]` also needs the `'static` bound, otherwise it could hold a reference to a `thread_local!`, that a different `thread_local!`'s destructor could read back * there are two styles for "getting short-lived references" APIs today: * `fn with(&self, f: impl FnOnce(&T))` * only option sound today for thread-local `T`s, `thread_local!` uses it * `fn get(&self) -> &T` (`Deref`-like) * with owned `self`, this is unsound because `Box::leak(Box::new(self)).get()` returns a `&'static T`, even if `self` itself is `!Send`/`!Sync` * this is (incorrectly) used in a few places in `rustc`, today, and that's where this whole idea came from * but with `'thread`, we could have the best of both worlds, and return `&'thread T`, to be more flexible than today's sound option (`with`) while (hopefully?) remaining sound * only downside is `Deref` can't be implemented directly (except on a type that has `&'thread T` as a field itself) * a `struct` with `PhantomData<&'thread ()>` would make that type never pass a `'static` bound check, so even `Box::leak(Box::new(self))` would only produce a `&'thread Self`, not a `&'static Self`, meaning a `Deref`-like API would actually remain sound * though this sort of thing could complicate the implementation, unless it's only allowed by making the `struct` lifetime-generic and passing `'thread` in that way Based on the little I remember about how `'static` is implemented, there's a good chance of this being easy to fully implement (copy how `'static` is handled, but make it strictly "shorter" in the lattice), but I wouldn't bet on it just yet. **EDIT**: * while looking around for precedents, I found a full-fledged RFC (https://github.com/rust-lang/rfcs/pull/1705) from 2016, that I had completely forgot about - looking in it now to compare it... * oh, it made the classic mistake of proposing `&'thread T: !Send` (which is impossible because of lifetime parametericity, see above) instead of relying *only* on `'thread < 'static`: > Any type depending on `'thread` (i.e., a type product of the type construction from `'thread`) is `!Send`, and thus bounded to the current thread. * @nikomatsakis caught onto that issue in this comment: https://github.com/rust-lang/rfcs/pull/1705#issuecomment-241499891 * overall I feel like that RFC was not arguing for itself very well - the `fn`-scoped `#[thread_local]` limitation we ended up with instead is enough for soundness, and there is no talk about letting the `LocalKey` API of `thread_local!` use the `'thread` lifetime etc.
newpavlov commented 2 years ago

@eddyb I really hope for this feature to be stabilized as soon as possible, but I must ask the following question:

it should be entirely possible to pass such a &'thread T "down" to a scoped thread

Is it guaranteed that the same thread-local pointer stays valid when passed to a different thread? I can imagine a system which maps the same numeric pointer to different physical addresses for different threads via MMU magic.

Amanieu commented 2 years ago

Is it guaranteed that the same thread-local pointer stays valid when passed to a different thread? I can imagine a system which has the same numeric pointer to point to different physical addresses for different threads via MMU magic.

That is not allowed. Thread-local variables for different threads must have distinct addresses.

eddyb commented 2 years ago

Is it guaranteed that the same thread-local pointer stays valid when passed to a different thread? I can imagine a system which has the same numeric pointer to point to different physical addresses for different threads via MMU magic.

That is not allowed. Thread-local variables for different threads must have distinct addresses.

@newpavlov To expand a bit further: such a pointer could not, in Rust, use the type &T, and instead would have to be some kind of wrapper that only produces a &T if it can compute a "global" pointer (i.e. one valid across all threads).

If T: Sync, then &T: Send holds and the pointer can make its way into a different thread, as long as it's not accessed outside of its original scope. Even with a !Sync pointee, I'd still be wary of any further derived reference existing (&Foo doesn't give Foo as much control over the reference as a separate FooRef type would).

This applies to thread_local!'s .with(|short_lived_ref| ...) method today, which doesn't stop you in any way from e.g. spawning some scoped threads and capturing short_lived_ref in them, inside the closure. (And #[thread_local] statics have their equivalent, where do_anything_with(&THREAD_LOCAL_STATIC) passes the reference "down the stack" to a function that doesn't really see it as any other reference).