rust-lang / libs-team

The home of the library team

Non-blocking check for `LazyLock`s #402

Closed workingjubilee closed 2 weeks ago

workingjubilee commented 3 months ago

Proposal

Problem statement

You currently cannot check if a LazyLock or LazyCell is initialized (and then read it if it is already initialized) in a non-blocking manner. You can deref, or call force, but those initialize it. Sometimes you just want to check without doing so, for somewhat esoteric reasons (e.g. async-signal-safe code).

Motivating examples or use cases

Because of this, I use OnceLock for a static variable that gets checked from rustc's signal handler, even though it really ought to just be a LazyLock. It's not that bad an amount of code grunge, but it's annoying. The static:

https://github.com/rust-lang/rust/blob/5a3e2a4e921097c8f2bf6ea7565f8abe878cdbd4/compiler/rustc_interface/src/util.rs#L51-L82

The place it gets checked in the signal handler:

https://github.com/rust-lang/rust/blob/d8d5732456d375f7c4bdc2f6ad771989a5e0ae02/compiler/rustc_driver_impl/src/signal_handler.rs#L103-L107
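As a rough illustration of the workaround described above (not the actual rustc code), the pattern looks something like the following. The static `CONFIG` and both helper functions are hypothetical names; only `OnceLock::get` and `OnceLock::get_or_init` are real std APIs. `get` never runs the initializer and never blocks, which is the property the signal handler relies on.

```rust
use std::sync::OnceLock;

// Hypothetical stand-in for the real static; the name and value are made up.
static CONFIG: OnceLock<String> = OnceLock::new();

/// Normal code path: initializes on first use, and may block while
/// another thread is running the initializer.
fn config() -> &'static str {
    CONFIG.get_or_init(|| String::from("expensive value"))
}

/// Signal-handler-style path: only reads, never initializes.
fn config_if_ready() -> Option<&'static str> {
    CONFIG.get().map(String::as_str)
}

fn main() {
    // Nothing has forced initialization yet.
    assert_eq!(config_if_ready(), None);
    assert_eq!(config(), "expensive value");
    // After the first `config()` call the value is visible without blocking.
    assert_eq!(config_if_ready(), Some("expensive value"));
}
```

The grunge is that the initializer has to live in `config()` instead of next to the static, which is what a `LazyLock` would otherwise provide.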

Solution sketch

Add the following function to LazyLock and LazyCell:

impl<T, F> LazyLock<T, F> {
    /// `Some` if LazyLock completed initialization.
    pub fn as_completed(&self) -> Option<&T> {
        todo!()
    }
}
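To make the intended semantics concrete, here is a minimal sketch of a lazy type built on `OnceLock` that offers the proposed method. The `Lazy` type is hypothetical and is not how std implements `LazyLock`; in particular it assumes an `Fn` initializer instead of `FnOnce` to keep the sketch short.

```rust
use std::sync::OnceLock;

/// Hypothetical stand-in for `LazyLock`, built on `OnceLock`.
/// Not the std implementation.
struct Lazy<T, F = fn() -> T> {
    cell: OnceLock<T>,
    init: F,
}

impl<T, F: Fn() -> T> Lazy<T, F> {
    fn new(init: F) -> Self {
        Lazy { cell: OnceLock::new(), init }
    }

    /// Runs the initializer if needed; may block on a concurrent initializer.
    fn force(this: &Self) -> &T {
        this.cell.get_or_init(|| (this.init)())
    }

    /// `Some` only if initialization has already completed; never initializes.
    fn as_completed(&self) -> Option<&T> {
        self.cell.get()
    }
}

fn main() {
    let lazy = Lazy::new(|| 6 * 7);
    assert_eq!(lazy.as_completed(), None);
    assert_eq!(*Lazy::force(&lazy), 42);
    assert_eq!(lazy.as_completed(), Some(&42));
}
```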

Alternatives

Links and related work

## What happens now?

This issue contains an API change proposal (or ACP) and is part of the libs-api team [feature lifecycle]. Once this issue is filed, the libs-api team will review open proposals as capability becomes available. Current response times do not have a clear estimate, but may be up to several months.

[feature lifecycle]: https://std-dev-guide.rust-lang.org/development/feature-lifecycle.html

## Possible responses

The libs team may respond in various different ways. First, the team will consider the *problem* (this doesn't require any concrete solution or alternatives to have been proposed):

- We think this problem seems worth solving, and the standard library might be the right place to solve it.
- We think that this probably doesn't belong in the standard library.

Second, if there's a concrete solution:

- We think this specific solution looks roughly right, approved, you or someone else should implement this. (Further review will still happen on the subsequent implementation PR.)
- We're not sure this is the right solution, and the alternatives or other materials don't give us enough information to be sure about that. Here are some questions we have that aren't answered, or rough ideas about alternatives we'd want to see discussed.

pitaj commented 3 months ago

Should probably take this: &Self instead, like force, in order to entirely avoid deref conflicts. That's how Lock::get works.

Some other name options:

the8472 commented 3 months ago

some additional prior art: thread::JoinHandle::is_finished

kennytm commented 3 months ago

If we model after JoinHandle::is_finished there would be only an is_initialized() -> bool function and then

> once this returns true, ~~join~~ force can be expected to return quickly, without blocking for any significant amount of time.

but IMO doing this in 2 steps does not feel very rusty.
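For reference, the two-step shape being compared against looks roughly like this with `JoinHandle::is_finished`. The function name `poll_then_join` and the 1 ms polling interval are arbitrary choices for the sketch.

```rust
use std::thread;
use std::time::Duration;

/// Hypothetical helper showing the check-then-act pattern.
fn poll_then_join() -> i32 {
    let handle = thread::spawn(|| 42);
    // Step 1: a non-blocking check that only yields a bool.
    while !handle.is_finished() {
        thread::sleep(Duration::from_millis(1));
    }
    // Step 2: a separate call to actually obtain the value.
    handle.join().unwrap()
}

fn main() {
    assert_eq!(poll_then_join(), 42);
}
```

A method returning `Option<&T>` folds both steps into one call, so the caller cannot forget the check or read the value through a different path.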

pitaj commented 3 months ago

It would be kinda weird if the OnceCell API is called get but the LazyCell API is named something different.

I don't really see a problem with

impl<T, F> LazyLock<T, F> {
    /// Associated function taking `this: &Self`, like `force`.
    pub fn get(this: &Self) -> Option<&T> {
        todo!()
    }
}

workingjubilee commented 3 months ago

OnceLock is opaque with respect to its contents. It is different.

Further, all of those APIs named get that don't take an additional key or index argument of some kind are a mistake.

workingjubilee commented 3 months ago

The thing saving OnceLock is that in general the use of get is incredibly niche (which makes it having such a terse name feel like an incredible misprioritization) so the real function is get_or_init.

for all others:

get what, motherfucker?

get what?

by not having a second argument and having such a pointlessly vague name, it's totally context free. if you see some x.get() that tells you jack shit about what's coming next. "oh people shouldn't write non-descriptive variable names", yeah, well standard libraries shouldn't have non-descriptive function names.

If we had a trait named Get it wouldn't implement the unary argument form, it would be the two-argument form, because everyone understands get is a kind of field accessor which means it should essentially always be qualified with the notional field, and no, it's not obvious when something has only one internal field because it's internal. Even implementing Deref or DerefMut doesn't reveal that, as plenty of things that implement those traits have additional data they could be accessing. The context of the type of the second argument reveals what kind of operation is going on. This may not make x.get(y) particularly more clear per se but it pulls in two threads of the reader's context regarding the program state in order to explain itself[^0]. This makes it much more possible to infer the notional operation[^1] and its resulting output type[^2].

The only reason I don't argue that on every single issue is because I have a limited amount of time and energy and I do not believe anyone is interested in fucking hearing my opinions about how you are supposed to do technical or pedagogic writing.

[^0]: this is important because when reading, one might have started in a slightly unusual place and not have absorbed the full context yet. it makes it easier not only to use both mapping type and key to understand the value, but also to work backward from the key-value pair to understand what the mapping type is doing and what it's for.

[^1]: a string means "probably a hash table or trie access, maybe representing a named field". an integer means the backing index is almost certainly into a slice or other sequence, and suggests there might not be an inherent meaning to the key, but rather that the keys are likely to have their key-value association altered if other mutable operations happen, like how remove updates all the "keys" after a certain point.

[^2]: in particular, having an idea of what the mapping type is, and the key, it becomes obvious what the value's meaning is and why we have to get it.

kennytm commented 3 months ago

> if you see some x.get() that tells you jack shit about what's coming next.

Except that you can't write x.get() in this case, it has to be LazyLock::get(&x) and that gives you a lot of context.
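A small sketch of why the `this: &Self` form matters for a type that implements `Deref`. `Wrapper` is a hypothetical stand-in for `LazyLock`, and `Cell` is used only because it happens to have its own `get` method.

```rust
use std::cell::Cell;
use std::ops::Deref;

/// Hypothetical smart pointer, standing in for `LazyLock`.
struct Wrapper<T>(T);

impl<T> Deref for Wrapper<T> {
    type Target = T;
    fn deref(&self) -> &T {
        &self.0
    }
}

impl<T> Wrapper<T> {
    /// Associated function, not a method: callable only as `Wrapper::get(&w)`.
    fn get(this: &Self) -> Option<&T> {
        Some(&this.0)
    }
}

fn main() {
    let w = Wrapper(Cell::new(7));
    // Method syntax ignores `Wrapper::get` and auto-derefs to `Cell::get`.
    let from_cell: i32 = w.get();
    // Fully qualified syntax reaches the wrapper's own function.
    let from_wrapper: Option<&Cell<i32>> = Wrapper::get(&w);
    assert_eq!(from_cell, 7);
    assert_eq!(from_wrapper.map(Cell::get), Some(7));
}
```

Because the wrapper's function has no `self` receiver, it never shadows a `get` on the target type, and the call site always names the wrapper type explicitly.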

workingjubilee commented 3 months ago

That is true. I am nonetheless irked by NonZeroU32 and the like.

Anyways.

kennytm commented 3 months ago

Forfeiting method chaining is caused by LazyLock implementing Deref<Target = T>; it is irrelevant to what the function should be named.

Anyway, I think it should not be called get. I expect the unwrap-like get() to be an infallible method returning T or its reference/pointer, i.e. one that works like force. The method proposed here sounds more like Mutex::try_lock, so IMO the name should also be like try_xxxx (unfortunately try_force is an oxymoron).

workingjubilee commented 3 months ago

try_me, clearly /j

workingjubilee commented 3 months ago

My brief rant about the overloading of get aside, I note that one of the few functions with an exactly matching signature to this is Option::as_deref, which seems to recommend as_deref or try_deref.

kennytm commented 3 months ago

That's a false comparison, Option<T>::as_deref() can be called as_xxxx because the input is also an Option, in the same situation as Result and Pin. You are trying to get an Option out of a LazyLock here.
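For readers unfamiliar with it, `Option::as_deref` goes from `&Option<T>` to `Option<&T::Target>`, so an `Option` is on both sides of the conversion. A minimal sketch, where the helper name `borrow_name` is made up for illustration:

```rust
/// Hypothetical helper: borrows the contents of an optional owned string.
fn borrow_name(name: &Option<String>) -> Option<&str> {
    // `as_deref` keeps the `Option` shape and derefs the payload.
    name.as_deref()
}

fn main() {
    let some = Some(String::from("lazy"));
    let none: Option<String> = None;
    assert_eq!(borrow_name(&some), Some("lazy"));
    assert_eq!(borrow_name(&none), None);
}
```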

workingjubilee commented 3 months ago

huh, so a slight difference in type signature can make the use of another function's name inappropriate, even if it has some similarities? interesting how that works!

programmerjake commented 3 months ago

I would say try_deref is a good name, except that implies that you'd return None/Err if deref would fail/panic instead of the actual operation of returning None if the initialization function hasn't run yet. maybe deref_if_initialized?

workingjubilee commented 2 weeks ago

Whatever.