Closed KodrAus closed 1 year ago
If you have a &mut OnceCell<T> during initialization why would you need OnceCell at all? If you only have &OnceCell<T>, any method that returns &mut T from that would be unsound. Only &T ever being accessible through &OnceCell<T> is a basic principle of OnceCell.
My use case is where initialization can be expensive so it should only happen once but updates can happen often and are usually pretty cheap.
If there’s shared access and there can be any update after initialization, OnceCell does not work. That’s why it’s called "once". You need a RefCell or a Mutex or a RwLock instead.
Allowing mutation after initialization in OnceCell would make it unsound. For example:
let mut cell = OnceCell::new();
let a: &mut Option<String> = cell.get_mut_with(|| Some(String::from("foo")));
let b: &Option<String> = cell.get().unwrap(); // `&T` aliases `&mut T`
let c: &str = b.as_deref().unwrap();
*a = None; // String is deallocated
println!("{c}"); // Use-after-free
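For the "expensive initialization, cheap updates" use case, the RefCell suggestion above can be combined with a OnceCell. The following is a minimal single-threaded sketch; the Cache type and its methods are hypothetical names chosen for illustration, and a Mutex or RwLock inside a OnceLock would play the same role across threads.

```rust
use std::cell::{OnceCell, RefCell};

/// Hypothetical example type: the table is built once, then updated often.
struct Cache {
    table: OnceCell<RefCell<Vec<u64>>>,
}

impl Cache {
    fn new() -> Self {
        Cache { table: OnceCell::new() }
    }

    /// Runs the (pretend) expensive initializer at most once.
    fn table(&self) -> &RefCell<Vec<u64>> {
        self.table
            .get_or_init(|| RefCell::new((0..4u64).map(|n| n * n).collect()))
    }

    /// Cheap update through shared access; RefCell checks borrows at runtime.
    fn record(&self, value: u64) {
        self.table().borrow_mut().push(value);
    }

    /// Copies the current contents out so no borrow escapes.
    fn snapshot(&self) -> Vec<u64> {
        self.table().borrow().clone()
    }
}
```

The OnceCell only ever hands out &RefCell<..>, so its "only &T through &OnceCell<T>" rule is kept; all mutation goes through the RefCell's runtime borrow checks.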
It is possible to implement it completely outside of std:
fn get_mut_with<T, F: FnOnce() -> T>(cell: &mut OnceCell<T>, f: F) -> &mut T {
    cell.get_or_init(f);
    cell.get_mut().unwrap()
}
It's safe, because we're taking an exclusive reference here.
But I still don't know if it's that useful.
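To make the trade-off concrete, here is a small sketch of how such a helper might be used with std::cell::OnceCell. The function init_once_update_often and the counter are made up for illustration; get_mut_with is the free function from the comment above, not a std method.

```rust
use std::cell::OnceCell;

/// Hypothetical helper (not part of std): initialize on first use,
/// then hand out `&mut T` through exclusive access to the cell.
fn get_mut_with<T, F: FnOnce() -> T>(cell: &mut OnceCell<T>, f: F) -> &mut T {
    cell.get_or_init(f);
    cell.get_mut().expect("cell was initialized on the previous line")
}

/// Updates one cell three times and returns how often the initializer
/// ran, together with the final contents.
fn init_once_update_often() -> (u32, Vec<u32>) {
    let mut cell: OnceCell<Vec<u32>> = OnceCell::new();
    let mut init_calls = 0u32;
    for i in 0..3 {
        let values = get_mut_with(&mut cell, || {
            init_calls += 1; // only reached while the cell is still empty
            vec![100]
        });
        values.push(i); // cheap update after the one-time initialization
    }
    (init_calls, cell.into_inner().unwrap())
}
```

Because every call needs &mut OnceCell<T>, the borrow checker rules out holding a &T from get() at the same time, which is what keeps this sound.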
If there’s shared access and there can be any update after initialization, OnceCell does not work. That’s why it’s called "once". You need a RefCell or a Mutex or a RwLock instead.
Allowing mutation after initialization in OnceCell would make it unsound. For example:
let mut cell = OnceCell::new();
let a: &mut Option<String> = cell.get_mut_with(|| Some(String::from("foo")));
let b: &Option<String> = cell.get().unwrap(); // `&T` aliases `&mut T`
let c: &str = b.as_deref().unwrap();
*a = None; // String is deallocated
println!("{c}"); // Use-after-free
The call to cell.get() to assign to b wouldn't work because cell is mutably borrowed.
So get_mut_with would take &mut self? But if you can afford exclusive access for initialization, why use a cell at all?
// regular initialization
let mut value;
value = 42;
let n: &mut i32 = &mut value;
// fallible initialization
let mut value;
if all_is_well {
    value = 42;
} else {
    return Err("oh no!")
}
let n: &mut i32 = &mut value;
So get_mut_with would take &mut self? But if you can afford exclusive access for initialization, why use a cell at all?
If initialization is expensive or not idempotent, I want to ensure that it only happens once.
@Person-93 perhaps you are looking for https://doc.rust-lang.org/stable/std/option/enum.Option.html#method.get_or_insert_with? Otherwise, I'd say https://github.com/rust-lang/rust/issues/74465#issuecomment-1060033519 is the way to go.
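For reference, a minimal sketch of the Option::get_or_insert_with approach linked above. The function name cached_answer and the call counter are hypothetical; the point is only that the initializer runs once while later calls get a plain &mut to the stored value.

```rust
/// Returns the cached value, computing it on first use only.
/// `compute_calls` just counts how often the initializer ran.
fn cached_answer<'a>(slot: &'a mut Option<u64>, compute_calls: &mut u32) -> &'a mut u64 {
    slot.get_or_insert_with(|| {
        *compute_calls += 1;
        (1..=10u64).product() // stand-in for an expensive computation
    })
}
```

This needs exclusive access (&mut Option<T>) on every call, so it only fits when no shared references to the slot are required.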
As a policy question, I would probably suggest to not discuss additional convenience methods in this issue, but rather direct such requests at the once_cell crate.
Here, I feel we can benefit most from focusing on the minimal useful subset we can stabilize first. After we have a stable base, it would be much more efficient to open follow-up issues/PRs expanding the API surface.
I ... have a new naming proposal:
OnceCell for !Sync version
OnceLock for Sync version
I think I really like OnceLock:
RwLock
That sounds great. I'm wondering if this means they shouldn't live in the {core, std}::lazy module, but instead in core::cell and std::sync.
Agreed, I like it. (And I am getting to this, just slowly! Someone with more free time is welcome to beat me to it, of course. :) )
I will say that std::sync feels like sort of a miscellaneous grab bag, but std::sync::OnceLock doesn't feel terrible.
It might be confusing to have std::sync::{Once, OnceLock} next to each other, because the difference is not really obvious from their names alone. At the very least, docs should anticipate this confusion and try to clarify their roles -- OnceLock is roughly just (Once, T).
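The difference between the two can be sketched side by side. This assumes a toolchain where std::sync::OnceLock is available under that name; the statics and function names are made up for illustration.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Once, OnceLock};

// `Once` only guards a side effect; it stores no data of its own.
static SETUP: Once = Once::new();
static SETUP_RUNS: AtomicUsize = AtomicUsize::new(0);

// `OnceLock` guards the initialization *and* owns the resulting value.
static GREETING: OnceLock<String> = OnceLock::new();

/// Runs the setup closure at most once, no matter how often it is called.
fn ensure_setup() {
    SETUP.call_once(|| {
        SETUP_RUNS.fetch_add(1, Ordering::SeqCst);
    });
}

/// Initializes the string on first use and returns a shared reference to it.
fn greeting() -> &'static str {
    GREETING.get_or_init(|| String::from("hello"))
}
```

Note that neither API returns a guard: after initialization the value is read without any locking.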
A small point against the OnceLock name is that you don't have guards like Mutex or RwLock. In that sense, you never actually "lock" the data inside.
Alternate name idea: InitLock -- as in, only initialization is actually locked, and then it has unguarded access.
(I kind of like InitCell for the !Sync version too... These are data wrappers, but the only thing that's "once" about that data is its initialization, so it makes some sense to make that more direct in the name.)
@cuviper Another way to look at it is that Once is OnceLock<()>, which is my personal preference because it means we can deprecate Once altogether.
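As a rough illustration of that equivalence, a OnceLock<()> can stand in for Once when only a side effect needs to run a single time. The names below are hypothetical.

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::sync::OnceLock;

static INIT: OnceLock<()> = OnceLock::new();
static RUNS: AtomicU32 = AtomicU32::new(0);

/// Behaves like `Once::call_once`: the closure runs on the first call only,
/// and the stored value is just the unit type.
fn init_once() {
    INIT.get_or_init(|| {
        RUNS.fetch_add(1, Ordering::SeqCst);
    });
}
```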
If you mean soft-deprecation in docs, why not. But making Once emit a deprecation warning to push its existing users to migrate to OnceLock would create churn. For what benefit? Since Once is #[stable], we can’t actually remove it.
If you mean soft-deprecation in docs, why not. But making Once emit a deprecation warning to push its existing users to migrate to OnceLock would create churn. For what benefit? Since Once is #[stable], we can’t actually remove it.
I know we can't really remove it from the compiled library, but I still do like the idea of eventually removing the ability to call deprecated APIs in newer editions. The benefit in my mind being keeping the language surface area as small as possible. I think there is a significant cost in having multiple ways to do the same thing. If people can use Once or OnceLock<()> to do the exact same thing, it benefits them for us to push everyone into using the newer, more powerful API, so you don't end up with people learning old dialects and having to unlearn the old API in favor of the new one once they eventually discover it and become aware of the similarities.
I wouldn't want to introduce this churn carelessly, we'd need to have mechanical ways to update code when they move to the new edition so the only churn that happens is in the diff rather than requiring repetitive manual updates across the entire ecosystem. But between a small amount of diff churn and having to endlessly maintain two ways to do the same thing, I'd choose the diff churn and the smaller library API every time.
How about making std::sync::Once Once<T = ()>?
But between a small amount of diff churn and having to endlessly maintain two ways to do the same thing, I'd choose the diff churn and the smaller library API every time.
We still do have to maintain the old stuff, even if it's inaccessible from newer editions.
How about making std::sync::Once Once<T = ()>?
That is unfortunately a breaking change. Generic param defaults only work in some cases but not others. Don't remember exactly when they do and when they don't work though.
If deprecation is tied to an edition and there’s a rustfix migration, that sounds fine. It’s new deprecation warnings in existing code just by upgrading the toolchain that I feel are probably not worth the unexpected churn if it’s only to simplify std learnability.
That is unfortunately a breaking change. Generic param defaults only work in some cases but not others. Don't remember exactly when they do and when they don't work though.
It wouldn't break anything if we just change the struct, but we can't change new from impl Once to impl<T> Once<T> because default type parameters won't affect type param fallback.
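The fallback problem can be reproduced with a stand-in type. MyOnce below is hypothetical and only mirrors the shape of a generic Once<T = ()>; the commented-out line is the kind of existing code that would be expected to stop compiling.

```rust
use std::marker::PhantomData;

/// Hypothetical stand-in for a generic `Once<T = ()>`; not the real std type.
struct MyOnce<T = ()> {
    _value: PhantomData<T>,
}

impl<T> MyOnce<T> {
    const fn new() -> Self {
        MyOnce { _value: PhantomData }
    }
}

/// In type position the default applies, so `MyOnce` here means `MyOnce<()>`,
/// and the return type pins down `T` for the call inside.
fn build() -> MyOnce {
    MyOnce::new()
}

// In expression position the default is not used for inference, so the
// following is expected to be rejected with "type annotations needed":
//
//     let once = MyOnce::new();
```

This is the same reason std keeps constructors such as HashMap::new on an impl block with the defaulted parameter already filled in.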
What's the status on this? I wanted to use it for something that generates strings for internationalization, but it's still unstable. Is this going to become stable soon?
This is blocked on implementation work. I feel this can become stable fairly soon, if this issue gets someone who'd champion the remaining implementation and stabilization work. So far this didn't happen.
(But you can use the once_cell crate in the mean time?)
(But you can use the once_cell crate in the mean time?)
True. The use case involves Egui, the menu system often used for games. On every frame, it goes through the items it needs to display. Those are &str parameters. If I add internationalization, there's a lookup for every menu item on every frame. Hence the need for some kind of memoization.
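A minimal sketch of that kind of memoization, assuming a toolchain where std::sync::OnceLock is available (the once_cell crate offers the same shape on older compilers). The keys, translations, and function names are invented for illustration and are unrelated to Egui's actual API.

```rust
use std::collections::HashMap;
use std::sync::OnceLock;

/// Builds the translation table on first use; later calls only read it.
fn translations() -> &'static HashMap<&'static str, &'static str> {
    static TABLE: OnceLock<HashMap<&'static str, &'static str>> = OnceLock::new();
    TABLE.get_or_init(|| {
        // Stand-in for loading and parsing a real translation file.
        HashMap::from([("menu.file", "Datei"), ("menu.quit", "Beenden")])
    })
}

/// Looks up a menu label, falling back to the key itself if untranslated.
fn tr(key: &'static str) -> &'static str {
    translations().get(key).copied().unwrap_or(key)
}
```

Each frame then pays for one hash lookup per label rather than rebuilding the strings.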
This is blocked on implementation work.
Can someone outline what implementation work is needed? This issue has quite a few comments and it's somewhat hard to get what needs to be done :')
I'd start with the following plan, adjusting according to reality:
mod cell {
    OnceCell,
    CellLazy,
}
mod sync {
    OnceLock,
    LockLazy,
}
Re-read the discussion here and on the RFC to make sure nothing is missed.
Take a critical look at the naming, signatures and semantics of OnceX methods, update the summary comment in the tracking issue.
Take the minimal orthogonal set of methods of OnceX, start FCP to stabilize those under the base_once_cell feature flag.
After OnceX types are stabilized, move forward with stabilizing Lazy types (I'd expect that to need a bit more bikeshed due to less clear naming and the objectionable default parameter trick).
Throughout: keep an eye on what the current blocker is, poke relevant people and otherwise ensure that the work doesn't get stuck.
Why CellLazy instead of LazyCell (same for lock)?
It is harder to pronounce (triple consecutive L) and also all other cells are suffixes instead of prefixes.
Why CellLazy instead of LazyCell (same for lock)?
LazyCell is also more coherent with RefCell....
From a usability/ergonomics perspective, is there any chance of adding a set(value: T) method to Lazy<T> that would be used if-and-only-if the lazy value hasn't already been initialized with the constructor-provided lazy init method (panicking/failing otherwise)?
The idea is that you could specify a default value and override it "in-place" rather than needing to use a OnceCell<T> and a separate mut T instance (or a third, default const T instance) to hold the default value with the option for a user/environment/runtime-provided alternative taking its place, while still only initializing once.
e.g. in lieu of the following:
static DEST_PORT: OnceCell<u16> = OnceCell::new();
pub fn main() {
    // This is the default value, used if no alternative is provided
    let mut dest_port: u16 = 1024;
    ...
    if let Some(arg) = std::env::args().skip(1).next() {
        dest_port = arg.parse().expect("Expected a valid u16 port number as the argument");
    }
    DEST_PORT.set(dest_port).unwrap();
}
you would be able to do something like this:
// This is the default value, used if no alternative is provided
static DEST_PORT: Lazy<u16> = Lazy::new(|| 1024u16);
pub fn main() {
    if let Some(arg) = std::env::args().skip(1).next() {
        let dest_port = arg.parse().expect("Expected a valid u16 port number as the argument");
        DEST_PORT.set(dest_port).unwrap();
    }
    // At some later point:
    let foo: u16 = *DEST_PORT;
}
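The requested behavior can be approximated today with a small wrapper around OnceLock. This is only a sketch under that assumption: Overridable is a hypothetical type, not the proposed Lazy::set and not a std API, and unlike Lazy it needs an explicit get() call instead of a deref.

```rust
use std::sync::OnceLock;

/// Hypothetical wrapper: a default initializer plus a one-time override.
struct Overridable<T> {
    cell: OnceLock<T>,
    default: fn() -> T,
}

impl<T> Overridable<T> {
    const fn new(default: fn() -> T) -> Self {
        Self { cell: OnceLock::new(), default }
    }

    /// Succeeds only if the value has not been initialized yet;
    /// otherwise the rejected value is handed back in `Err`.
    fn set(&self, value: T) -> Result<(), T> {
        self.cell.set(value)
    }

    /// Returns the override if one was set, else initializes from the default.
    fn get(&self) -> &T {
        self.cell.get_or_init(self.default)
    }
}

// This is the default value, used if no alternative is provided.
static DEST_PORT: Overridable<u16> = Overridable::new(|| 1024);

fn dest_port() -> u16 {
    *DEST_PORT.get()
}
```

Either way the value is written exactly once: by set() if it arrives first, or by the default the first time get() is called.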
@KodrAus
Can you update the initial post with the updated API please?
Naming. I'm ok to just roll with the Sync prefix like SyncLazy for now, but have a personal preference for Atomic like AtomicLazy.
I would vote in favor of a sync module. core::lazy has the unsync implementation and core::lazy::sync has the sync ones. This is also consistent with the naming used for Arc, Mutex, etc -- those are also in a module named sync.
@mqudsi you could do something like this:
static DEST_PORT: Lazy<u16> = Lazy::new(||
    match std::env::args().skip(1).next() {
        Some(arg) => arg.parse().expect("Expected a valid u16 port number as the argument"),
        _ => 1024,
    }
);
pub fn main() {
    let port = *DEST_PORT;
}
Edit: had a better idea
@Iron-E yes, it's possible directly with OnceCell but you're back to using a helper function to wrap the get_or_init() logic, which is what Lazy<T> is supposed to help avoid.
See my edit, I had a better idea
static DEST_PORT: Lazy<u16> = Lazy::new(||
    match std::env::args().skip(1).next() {
        Some(arg) => arg.parse().expect("Expected a valid u16 port number as the argument"),
        _ => 1024,
    }
);
Thanks, but that's frankly not a better idea at all. The example was contrived, in practice you'll have a full command line arg handler processing perhaps dozens of switches, with proper error handling and early abort w/ usage information. The solution isn't to move everything into the Lazy closure; you normally want as little in there as you can get away with.
in practice you'll have a full command line arg handler processing perhaps dozens of switches, with proper error handling and early abort w/ usage information.
True… but if you have so many arguments it's probably better to use something like clap::Parser anyway. Lazy::new(|| Args::parse()) could probably do whatever is needed and more.
I don't see any issue with Lazy::set fwiw (not that my opinion is that important!), I just wanted to show how one could already accomplish similar effects by moving some code around.
The solution isn't to move everything into the Lazy closure; you normally want as little in there as you can get away with.
Agree to disagree I suppose. The two main reasons I can see that one would want something to be Lazy is that you either:
static that depends on runtime computation (e.g. a String).
Putting as much work in the Lazy as possible is better for the first case, but is not strictly better for the latter.
Hello :) I am sorry if this is not the right place to post this, but could you help me figure out what is the current API? Importing Lazy as such seems to not work anymore. For reference, I am using rustc 1.64.0-nightly. Thank you very much beforehand
#![feature(once_cell)]
use std::lazy::Lazy;
Read the comments above or look at the documentation (it has a search feature).
std::lazy::Lazy got renamed and moved to std::cell::LazyCell.
When you search the docs it brings up the former :)
The docs you linked say
Version 1.62.0 (a8314ef7d 2022-06-27)
If you use a nightly compiler (for nightly features) you need nightly docs.
Thank you very much, I couldn't find it. I'm so sorry for the disturbance, I should've looked at the nightly docs
Attempting to use core::lazy::OnceCell in a kernel project is throwing an “unresolved import” error despite the documentation stating that it’s in the core library, forcing me to fall back on the conquer_once implementation at the moment. Any idea when this will get fixed?
@kennystrawnmusic
If you look a few comments up, you will see that it was renamed to core::cell::LazyCell.
Since you are running nightly, you can either look at your local docs or the nightly docs on doc.rust-lang.org to find that it is there.
@kennystrawnmusic
If you look a few comments up, you will see that it was renamed to core::cell::LazyCell.
Since you are running nightly, you can either look at your local docs or the nightly docs on doc.rust-lang.org to find that it is there.
That's what core::lazy::Lazy got renamed to but I didn't know that OnceCell also got moved to core::cell. For some weird reason though the OnceCell implementation in the core library doesn't like being used in constants since it still uses UnsafeCell behind the scenes, thus still forcing me to use conquer_once::spin::OnceCell for my use case (namely, initializing my very own printk crate) for the time being. If there's a way to make core::cell::OnceCell as thread-safe as conquer_once::spin::OnceCell I'd like to know how.
FYI: SyncLazy can be found in std::sync::LazyLock now
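For anyone landing here with the old paths, a short sketch of the renamed types in use. It assumes a toolchain recent enough that std::cell::LazyCell and std::sync::LazyLock are available without a feature gate; on the nightlies discussed in this thread they still needed #![feature(once_cell)]. The static and function names are illustrative.

```rust
use std::cell::LazyCell;
use std::sync::LazyLock;

// Thread-safe lazy static (formerly SyncLazy).
static GREETING: LazyLock<String> = LazyLock::new(|| "hello".to_uppercase());

fn shout() -> &'static str {
    GREETING.as_str()
}

// Single-threaded lazy value (formerly std::lazy::Lazy).
fn lazy_cell_sum() -> u32 {
    let total = LazyCell::new(|| (1..=4u32).sum::<u32>());
    // The closure runs on the first deref; the second deref reuses the result.
    *total + *total
}
```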
@kennystrawnmusic The problem is that in no_std there's no way to block (other than spinlooping), so anything that blocks, like OnceLock::get_or_init, can't exist in core.
A non-blocking but Sync version of a oncecell/oncelock is one that doesn't block, but lets multiple threads race for initialization, like this: https://docs.rs/once_cell/latest/once_cell/race/index.html
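The racing idea can be sketched with nothing but core atomics. RaceOnce below is a hypothetical type written for this illustration, loosely modeled on the approach the linked once_cell::race module describes; it is not that crate's code.

```rust
use core::num::NonZeroUsize;
use core::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical non-blocking once cell for a non-zero integer.
/// The value 0 is reserved to mean "not initialized yet".
struct RaceOnce {
    inner: AtomicUsize,
}

impl RaceOnce {
    const fn new() -> Self {
        Self { inner: AtomicUsize::new(0) }
    }

    fn get(&self) -> Option<NonZeroUsize> {
        NonZeroUsize::new(self.inner.load(Ordering::Acquire))
    }

    /// Never blocks: several threads may run `f` concurrently, but only the
    /// first successful store wins and every caller observes that value.
    fn get_or_init(&self, f: impl FnOnce() -> NonZeroUsize) -> NonZeroUsize {
        if let Some(value) = self.get() {
            return value;
        }
        let candidate = f();
        match self.inner.compare_exchange(
            0,
            candidate.get(),
            Ordering::AcqRel,
            Ordering::Acquire,
        ) {
            Ok(_) => candidate,
            // Another thread won the race; use its value and drop ours.
            Err(winner) => NonZeroUsize::new(winner).expect("winner stored a non-zero value"),
        }
    }
}
```

The cost of not blocking is that the initializer may run more than once under contention, so it should be cheap and free of side effects that must happen exactly once.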
ISTM that since OnceCell has a take method which resets it back to the uninitialised state, it can be set multiple times and therefore 'once' in the name is misleading. Am I missing something about the once-ness?
I think this is just a special case of the more general phenomenon of "nothing is immutable, b/c you can always overwrite the variable itself". This is true of the Once we already have:
fn main() {
    let mut x = std::sync::Once::new();
    x.call_once(|| println!("hello"));
    x = std::sync::Once::new();
    x.call_once(|| println!("hello"));
}
This is also similar to the docs for Mutex, which currently say
The data can only be accessed through the RAII guards returned from lock and try_lock, which guarantees that the data is only ever accessed when the mutex is locked.
This is the right intuition, but also a lie, because you can do whatever if you have &mut Mutex.
Admittedly, this is a common question which comes up from time to time https://github.com/matklad/once_cell/issues/153.
My suggestion would be to stick with Once terminology, as it builds the right intuition, and put the details to the &mut taking methods.
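A short sketch of where the "once" guarantee holds and where it stops, using std::cell::OnceCell. The function name is made up; set and take are the real methods.

```rust
use std::cell::OnceCell;

/// Returns the result of a second `set`, the value removed by `take`,
/// and the result of a `set` after the reset.
fn reset_and_reuse() -> (Result<(), u32>, Option<u32>, Result<(), u32>) {
    let mut cell = OnceCell::new();
    cell.set(1).unwrap();

    // Through shared access (&self) a second write is always rejected.
    let second_set = cell.set(2);

    // `take` needs `&mut self`: no outstanding `&T` can exist here,
    // so resetting the cell cannot invalidate anyone's reference.
    let taken = cell.take();

    // After the reset the cell is empty and can be written once more.
    let third_set = cell.set(3);

    (second_set, taken, third_set)
}
```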
Should the set method return Result<&T, T> rather than Result<(), T>?
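For comparison, the Result<&T, T> shape can be layered on top of the existing signature. set_and_get is a hypothetical helper for illustration, not a std method.

```rust
use std::cell::OnceCell;

/// Hypothetical helper: like `set`, but hands back a reference to the
/// stored value on success instead of `()`.
fn set_and_get<T>(cell: &OnceCell<T>, value: T) -> Result<&T, T> {
    cell.set(value)?; // on failure the rejected value is returned as `Err`
    Ok(cell.get().expect("set just succeeded, so the cell is initialized"))
}
```

The extra get() is a second lookup that a built-in version could avoid, which is the main argument for changing the return type in the API itself.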
This is a tracking issue for the RFC "standard lazy types" (rust-lang/rfcs#2788). The feature gate for the issue is #![feature(once_cell)].
Unstable API
Steps
Unresolved Questions
Inlined from #72414:
Naming. I'm ok to just roll with the Sync prefix like SyncLazy for now, but have a personal preference for Atomic like AtomicLazy. Resolved in: https://github.com/rust-lang/rust/issues/74465#issuecomment-1098359963. Surprisingly, after more than a year of deliberation we actually found a better name.
std::sync types that we might want to just avoid upfront for std::lazy, especially if that would align with a future std::mutex that doesn't poison. Personally, if we're adding these types to std::lazy instead of std::sync, I'd be on-board with not worrying about poisoning in std::lazy, and potentially deprecating std::sync::Once and lazy_static in favour of std::lazy down the track if it's possible, rather than attempting to replicate their behavior. cc @Amanieu @sfackler.
SyncOnceCell::get blocking. There doesn't seem to be consensus in the linked PR on whether or not that's strictly better than the non-blocking variant. (resolved in https://github.com/rust-lang/rust/issues/74465#issuecomment-663414310).
Release/Acquire, but it could also use the elusive Consume ordering. Should we spec that we guarantee Release/Acquire? (resolved as yes: consume ordering is not defined enough to merit inclusion into std)
SyncOnceCell in no_std. I think there's consensus that we don't want to include "blocking" parts of API, but it's unclear if non-blocking subset (get+set) would be useful. (resolved in https://github.com/rust-lang/rust/issues/74465#issuecomment-725360596).
get_or[_try]_init the best name? (resolved as yes in https://github.com/rust-lang/rust/pull/107184)
Implementation history
68198 (closed in favor of #72414)
72414 initial implementation
74814 fixed UnwindSafe bounds