Open m-ou-se opened 2 years ago
The arguments in favor of having such a type are contained in the (very long) thread at https://github.com/rust-lang/rust/issues/53639 .
TL;DR: static mut
makes it way too easy to get a mutable reference to a static, which is wildly unsound in the face of concurrency, and having users write their own versions of this type is very error prone (see https://github.com/rust-lang/rust/issues/53639#issuecomment-790091647 for one example). This is an unsafe pattern that should be encapsulated by the stdlib, and might someday lead to the ability to deprecate static mut
.
This would be a great addition to the standard library. Most OS-level crates I've worked on eventually have to use static mut
or implement an abstraction like this if they need to have static mutable memory (such as a bootstrap stack or GDT). One of the downsides of static mut
is that doing pointer math with addresses to static mut
variables has to be marked unsafe
. Even addr_of!(STATIC_VAR)
needs an unsafe
annotation.
This abstractions makes it so raw pointer math would be safe, while use of the pointee to would be unsafe.
A concrete example is from @phil-opp's blog "Writing an OS in Rust" where static mut
is used for the TSS Double Fault stack.
Unresolved Question: Should it be
Sync
only ifT
isSync
, or should it unconditionally beSync
?
I think the former might be the best, if it's unconditionally Sync
then it would be possible to wrap !Send
types (e.g. Rc
or thread local storage values) in it and make it safe to share these between threads. At the very least it should be Sync
if T
is Send
.
- Should it be
Sync
only ifT
isSync
, or should it unconditionally beSync
?
Conceptually, I think T
should be required to be Sync
. And while we're at it, shouldn't it also require T: Send
at the very least, as Pointerbender mentioned?
In practice, I commonly deal with types containing raw pointers (coming from bindgen), which are (very un-ergonomically) !Send
and !Sync
. So unfortunately, I'll have to continue using my own implementation which is unconditionally Sync
, because of raw pointers being unconditionally !Send
and !Sync
.
There has been previous discussion that if a new UnsafeCell
type were to be added, that it should be both Sync
and Copy
. Does that have any bearing on this issue?
In particular, it seems as though stabilizing this without a Copy
impl may complicate or preclude adding one later for the reasons that have made #55207 and #25053 challenging [^1]. If we were to want a version of UnsafeCell
that implements Copy
, is it better to have both SyncUnsafeCell
and CopyUnsafeCell
or better to have one RawUnsafeCell
(a.k.a. (farcically) VeryUnsafeCell
, SuperUnsafeCell
, etc.) that implements both and from which all others can be derived?
As background:
@SimonSapin proposed this primitive here: https://github.com/rust-lang/rust/pull/55207#issuecomment-435467890 https://github.com/rust-lang/rust/issues/53639#issuecomment-435468228
@aturon discussed adding a separate Copy
version of UnsafeCell
here: https://github.com/rust-lang/rust/issues/25053#issuecomment-274262179
@RalfJung analyzed the situation in some detail: https://github.com/rust-lang/rust/issues/25053#issuecomment-317099890 https://github.com/rust-lang/rust/pull/55207#issuecomment-435570632
cc #55207 #25053 @jamesmunns
[^1]: That is, UnsafeCell
could have been Copy
originally, but adding the impl now is hard, just as UnsafeCell
could have been Sync
originally, but at this point we're instead adding a new struct.
In general, in my uses of UnsafeCell-made-sync, which are typically data structures that I am managing thread safety manually, e.g. for thread safe queues or allocators or such, I very much would not want it to be Copy
. I can hunt down some concrete examples if it would be helpful, but I have a lot of UnsafeCell<MaybeUninit<T>>
s in my code for reasons.
I'm not certain if the proposal was something like:
impl<T: Copy> Copy for NewUnsafeCell<T> {}
Which might be more palatable, but it would probably make me nervous that I might ACCIDENTALLY make my types copy, when I really don't want them to be, for example if I happened to use some T: Copy
data in my structures.
I think a lot of mine (and maybe others?) desire for a Copy
version of UnsafeCell
has been abated by the fact that UnsafeCell::new()
is now const, and Mara's "Const Single" pattern works, e.g.
const SINGLE: UnsafeCell<NonCopyType> = UnsafeCell::new(NonCopyType::const_new());
let arr: [UnsafeCell<NonCopyType>; 64] = [SINGLE; 64];
There might be others who have that desire! But you pinged me, and I can't say I've ever wanted it, outside of initializers, where the "Const Single" pattern works reasonably well, even if it isn't completely widely known.
Edit:
As a very quick example, bbqueue
uses an UnsafeCell<MaybeUninit<[u8; N]>>
for a backing storage.
I feel like it would be a footgun to make this structure Copy
, even if MaybeUninit<[u8; N]>
could be, and I think it would, given that [u8; N]
is Copy
.
It looks like I did write a few years ago about a possible SomethingUnsafeCell
that would both Sync
and Copy
, but now I agree with @jamesmunns that it’s probably not a good idea.
SyncUnsafeCell
exists mostly to use in static
to replace static mut
where unintentional accesses can cause UB, whereas always manipulating raw pointers makes it easier to see where they are dereferenced. Making it also Copy
would reintroduce that risk of unintentional (and not necessarily properly synchronized) access to a shared static
.
Oh. I just now realized I opened https://github.com/rust-lang/rust/pull/55207 in 2018, which proposed to make UnsafeCell Copy/Clone (oh how time flies!)
As I mentioned downthread: https://github.com/rust-lang/rust/pull/55207#issuecomment-462147250, I definitely agree that my original goal was mainly around const/array initialization, which is solved by (better IMO) by https://github.com/rust-lang/rust/issues/49147.
Which might be more palatable, but it would probably make me nervous that I might ACCIDENTALLY make my types copy, when I really don't want them to be, for example if I happened to use some T: Copy data in my structures.
That would not happen. You still have to add derive(Copy)
to your type, which of course you won't.
I really don't understand the problem here.
In the example I gave, MaybeUninit<[u8; 64]>
is Copy
because [u8; 64]
is Copy, and MaybeUninit<T>
is Copy
when T
is.
Therefore without adding derive(Copy)
, SuperUnsafeCell<MaybeUninit<[u8; 64]>>
is now Copy
, which was not previously the case for UnsafeCell<MaybeUninit<[u8; 64]>>
, which I might never want to be copy, because it serves as an inner-mutability data store, and any unintentional copies would be a logic error.
I assumed you were talking about a type you defined yourself.
Interior mutability data stores can be perfectly fine with Copy
(making Cell: Copy
would be sound), so it seems like a bad idea to disallow this for everyone and make it outright impossible to have an interior mutable type that is Copy
. That's just a fundamental expressiveness gap that Rust currently has, and not for any good reason IMO.
That's fair. However some of my worst bugs when working with unsafe are when I accidentally create a copy of some data that happens to be Copy (e.g. accidentally copying to the stack when trying to reborrow from a pointer), then returning a reference to a local rather than the non-local pointer. Since UnsafeCell is likely to be used in pointer form (e.g. through the .get()
interface, using it with any T: Copy
data makes me wary.
This is probably a niche issue (and almost certainly only a personal experience report), but that's my $0.02. If there are other compelling use cases, I'll just have to be more careful in the future, or stick with the old UnsafeCell
with a manual Sync
impl.
Interior mutability data stores can be perfectly fine with
Copy
(makingCell: Copy
would be sound), so it seems like a bad idea to disallow this for everyone and make it outright impossible to have an interior mutable type that isCopy
. That's just a fundamental expressiveness gap that Rust currently has, and not for any good reason IMO.
I think this is a really important point. I think this is case where Rust accidentally mixes 'unsafe' with 'just a bad idea'. The best example of that is how mem::forget
is usually a "bad idea", but not unsafe
. From that perspective, UnsafeCell
does it wrong: it is not Sync, and requires you to write unsafe impl Sync
on your type containing it if you want to share it, even though a Sync
UnsafeCell
is perfectly sound. Same for Copy
for Cell
. Same for Send
and Sync
for raw pointers.
At this point we can't just make these types Sync
(or Copy
, etc.) without breaking existing code, but it's interesting to think about how to represent a concept like "this type is technically Sync (or Copy), but it's usually not what you want" in the language.
It reminds me also of some arguments for whether to implement Copy
for Range
or not. It's not unsound, but might lead to mistakes. Whether that means it should or shouldn't be done is debatable.
I support this, although I think it should be explicitly marked as a "static mut
killer" in the documentation.
It's not really a static mut killer as is, since it doesn't relax the Send requirement. When I've wanted to use it this has been an issue (since we have a few types which are !Send & !Sync just as lints).
I do see why this limitation exists, though.
Would it be better to have a generic AlwaysSync<T>
wrapper? Constructing it in general would be unsafe but it could have a safe From<UnsafeCell<T>>
implementation.
In practice, I commonly deal with types containing raw pointers (coming from bindgen), which are (very un-ergonomically) !Send and !Sync. So unfortunately, I'll have to continue using my own implementation which is unconditionally Sync, because of raw pointers being unconditionally !Send and !Sync.
Would it be better to have a generic AlwaysSync
wrapper? Constructing it in general would be unsafe but it could have a safe From<UnsafeCell > implementation.
I would appreciate having something like this. Creating static structs for things like plugins is really quite painful, especially when they contain raw pointers, because you have to wrap everything in a sync-forcing type (e.g. from one of my proc macros, my UnsafeSyncCell
is always Sync
). This type is just an UnsafeCell
transparant wrapper where new
is also unsafe.
I don't think an AlwaysSyncCell
needs to be the same as SyncUnsafeCell
since they have somewhat different goals (SyncUnsafeCell
being more useful for existing Rust types, AlwaysSyncCell
being useful more for statics/synchronization with the C interface). But it would be good to keep in mind for naming/design decisions that both concepts do need to exist, be it in std
itself or just elsewhere in the ecosystem.
Question just since I haven't seen it anywhere - is unsafe impl<T: ?Sized + Sync> Sync for UnsafeCell <T> {}
not possible, or would it somehow be breaking? Or is not having UnsafeCell
ever be sync just a form of lint against unintended use?
Some existing unsafe code might rely on the auto-trait behavior of UnsafeCell<T>: !Sync
(even for T: Sync
) to make their own wrappers !Sync
.
For example, RefCell
doesn’t and has an explicit impl<T: ?Sized> !Sync for RefCell<T> {}
but only to provide better error messages. It would still be sound without it in current Rust.
Another option which I don't see mentioned here is making SyncUnsafeCell
always Sync
but introducing a warning-level lint for using SyncUnsafeCell<T>
with T: ?Sync
That's fair. However some of my worst bugs when working with unsafe are when I accidentally create a copy of some data that happens to be Copy (e.g. accidentally copying to the stack when trying to reborrow from a pointer), then returning a reference to a local rather than the non-local pointer.
While I understand the concerns, this is mostly a matter of discipline and letting the compiler figuring out the necessary reborrowing. Since an *UnsafeCell
is not Deref
, you wouldn't want to *my_cell
except to explicitly use the copy impl if it would exist. If you want to get to the data, you would go through my_cell.get()
or get_raw()
.
To show a use of having a Copy
able cell, I'm currently writing a datastructure Foo<T>
that for ergonomical reasons I want to be Copy
if T: Copy
since the user should pretend as though it's just a plain old T
, but actually keeps track of some "operation"s that are done to that T
behind the scenes. Problematically, I wouldn't be able to observe the copy of these Foo<T>
, so to fix that, I would include a counter in it that increases with every "operation" and when an implicit copy is made, I can see the old counter value in the backend to rewind to where - logically - a copy should have been performed.
#[derive(Clone, Copy)]
struct Foo<T> { ... }
impl<T> Foo<T> {
fn operate(&self); // note *NOT* &mut self. All operations are sequenced in the backend.
}
So I would want to increase the counter in Foo::operate
but that should not require a &mut self
but a &self
. Having that counter in a Cell<usize>
would be nice, but alas Cell
is not Copy
for basically historical reasons. I would really want to have an internally-mutable-yet-copy type although I concede that I do not care as much about Sync
issues, since you can fix those by a manual unsafe impl
.
EDIT: Tl;dr Thinking and reading more about this, I believe that not having a Copy
impl on the *UnsafeCell
makes a lot of sense, but we need to have a way to have some consuming type opt out of the structural check of Copy
and let it implement Copy
despite containing an *UnsafeCell
. That would open up the path towards impl<T: Copy> Copy for Cell<T>
(which is safe and okay as far as I can gather) and solve my issue anway. Sorry, since it seems entirely unrelated to this issue now :)
Feature gate:
#![feature(sync_unsafe_cell)]
This is a tracking issue for SyncUnsafeCell.
Public API
Steps / History
Unresolved Questions
Sync
only ifT
isSync
, or should it unconditionally beSync
?