Closed briansmith closed 4 years ago
I’ll clarify this in the docs, but OnceCell does establish happens before relationship: it wouldn’t be sound otherwise
On Thursday, 2 January 2020, Brian Smith notifications@github.com wrote:
The documentation says that the "Implementation is based on lazy_static and lazy_cell crates and std::sync::Once. In some sense, once_cell just streamlines and unifies those APIs To implement a sync flavor of OnceCell, this crates uses either a custom re-implementation of std::sync::Once or parking_lot::Mutex."
This creates some confusion about how equivalent once_cell::OnceCell is to std::sync::Once. In particular, std::sync::Once guarantees that "When this function returns, it is guaranteed that some initialization has run and completed (it may not be the closure specified). It is also guaranteed that any memory writes performed by the executed closure can be reliably observed by other threads at this point (there is a happens-before relation between the closure and code executing after the return)."
Is this guarantee provided by OnceCell? I don't think you intend it to be provided, but the statements above in the documentation might lead one to think that OnceCell is a reimplementation of std::sync::Once with a different (better) API, when really it implements a similar but more limited (and potentially more efficient) variant.
— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/matklad/once_cell/issues/83?email_source=notifications&email_token=AANB3M7FOVJ7UJLUVJGQU3DQ3ZTGDA5CNFSM4KCHL352YY3PNVWWK3TUL52HS4DFUVEXG43VMWVGG33NNVSW45C7NFSM4IDYTJXQ, or unsubscribe https://github.com/notifications/unsubscribe-auth/AANB3MZUKDX7FTPZGEMESM3Q3ZTGDANCNFSM4KCHL35Q .
Thanks for the report! It was a major oversight in the documentation, as these grantees are really crucial. And of course I am completely wrong claiming that " it wouldn’t be sound otherwise", and you are right that there's a potentially more efficient implementation via consume
is possible. This definitely needs to be added to the RFC.
However, for this crate I want to stick to a conservative Acquire
choice, for two reasons:
consume
is not exposed in the nice way by the current Rust API, and I feel uneasy about faking it via Relaxed
https://github.com/matklad/once_cell/pull/71That said, I think that for std we maybe should spec that the ordering is consume: I would feel better about faking consume via relaxed in std, b/c it is bound to the specific compiler and can rely on implementaiton defined behavior.
I also think that we could add a separate ConsumeOnceCell
, so that the user can opt-out of the additional guarantee, if they really need the best possible performance.
I’ll clarify this in the docs, but OnceCell does establish happens before relationship: it wouldn’t be sound otherwise
Just to clarify: The issue isn't about the ordering guarantees for values stored within the OnceCell
, but rather the ordering guarantees for otherwise-uncontrolled writes that the closure performs (via unsafe code) to memory outside the OnceCell
.
I think OnceCell
should only guarantee that the contents of the cell are synchronized, and we should leave side effect synchronization, when needed, to std::sync::Once
and any #![no_std]
-compatible replacements for it. In most cases where OnceCell
is used, one would want the maximum performance and not have any need for side-effects.
After you ship a release with #85 it will be pretty much impossible to undo that decision without renaming OnceCell
and/or the crate.
The problem is that we can't actually take advantage of this weaker guarantee in current Rust, without involving implementation-defined behavior.
Rust does not provide Ordering::Consume
(which we need to make the code run faster), and, while one can simulate it in sketchy way with Relaxed
and a compiler barrier, this is not guaranteed to work (though it does work in practice). Crossbeam docs sum up the situation nicely:
The exact definition of "depend on" is a bit vague, but it works as you would expect in practice since a lot of software, especially the Linux kernel, rely on this behavior.
I don't want to ship a vaguely correct synchronization primitive as a default in a foundation-level library.
We can of course document weaker guarantees but implement stronger ones, in hope that we'll improve implementation once it becomes possible to do in a language-lawer correct way. However, Consume
is not on the horizon, it's even put on hold in C++, as it was specked wrong.
A plan outlined in https://github.com/matklad/once_cell/pull/71#issuecomment-570551203 seems better, in practice. We ship a vaguely correct, but more efficient implementation as an opt-in module once_cell::consume::OnceCell
. That way, client who care about performance more than about pedantic correctness, could opt-in into extra speed now (and also give valuable feedback about how much it is faster in practice, and if it is correct enough in practice). The suggest default however remains more conservative, so I can sleep well. When/if Consume
becomes an officially supported thing, we issue the 2.0
version of the library, which relaxed guarantee.
For standard library, we can't add a separate feature-flaged module or issue 2.0, so we should stick to a more conservative default. At the same time, standard library can rely on implementation-defined behavior (b/c it is paired with the compiler version), so we would actually be able to ship a more performant implementation in std before sorting out consume formally. That is, the risk I see with using crossbeam's hack right now is that a compiler upgrade may, in theory, break the code. And, in theory, people might be compiling once_cell 1.0.1
with rustc 1.666
some time in the future. With std, this is not a problem, because, if compiler gets smarter we can fix std.
We can of course document weaker guarantees but implement stronger ones, in hope that we'll improve implementation once it becomes possible to do in a language-lawer correct way. However, Consume is not on the horizon, it's even put on hold in C++, as it was specked wrong.
That's my preference. However, I think the choice you've chosen also makes sense.
I agree it is risky, too risky, to implement this with a hack.
The documentation says that the "Implementation is based on lazy_static and lazy_cell crates and std::sync::Once. In some sense, once_cell just streamlines and unifies those APIs To implement a sync flavor of OnceCell, this crates uses either a custom re-implementation of std::sync::Once or parking_lot::Mutex."
This creates some confusion about how equivalent
once_cell::OnceCell
is tostd::sync::Once
. In particular,std::sync::Once
guarantees that "When this function returns, it is guaranteed that some initialization has run and completed (it may not be the closure specified). It is also guaranteed that any memory writes performed by the executed closure can be reliably observed by other threads at this point (there is a happens-before relation between the closure and code executing after the return)."Is this guarantee provided by
OnceCell
? I don't think you intend it to be provided, but the statements above in the documentation might lead one to think thatOnceCell
is a reimplementation ofstd::sync::Once
with a different (better) API, when really it implements a similar but more limited (and potentially more efficient) variant.