Tracking Issue for `UniqueRc`/`UniqueArc`

rust-lang / rust

Empowering everyone to build reliable and efficient software.

https://www.rust-lang.org

Other

96.98k stars 12.54k forks source link

Tracking Issue for `UniqueRc`/`UniqueArc` #112566

Open eholk opened 1 year ago

eholk commented 1 year ago

Feature gate: #![feature(unique_rc_arc)]

This is a tracking issue for UniqueRc and UniqueArc, as discussed in https://github.com/rust-lang/libs-team/issues/90.

This feature supports creating cyclic data structures by creating an RC that is guaranteed to be uniquely owned. The uniquely owned RC can then be converted to a regular RC. While uniquely owned, we can freely modify the contents of the UniqueRc/UniqueArc (i.e. they implement DerefMut). Weak pointers to the object can be made, but upgrading these will fail until it has been converted into a regular RC.

Public API

let rc = UniqueRc::new(42);
let weak = UniqueRc::downgrade(&rc);
assert!(weak.upgrade().is_none());

let _rc = UniqueRc::into_rc(rc);
assert_eq!(*weak.upgrade().unwrap(), 42);

Steps / History

[x] Implementation:
- [x] UniqueRc: #111849
- [ ] UniqueArc
[ ] Final comment period (FCP)^1
[ ] Stabilization PR

Unresolved Questions

Do we also want EmptyRc/EmptyArc?
Are we concerned about confusion around the existing is_unique methods? Could we add this functionality to Rc/Arc/ instead? (comment)
Do we want consistency with Linux Kernel UniqueRc? (comment)

programmerjake commented 1 year ago

typo: "these will fail it has been converted" should be "these will fail until the Unique* has been converted"

eholk commented 1 year ago

typo: "these will fail it has been converted" should be "these will fail until the Unique* has been converted"

Thanks, I added the word "until" to make it read a little better.

ebkalderon commented 1 year ago

I wonder whether it would be desirable to also add a fallible method which converts the other way, from an Rc<T> back into UniqueRc<T> provided there is only one strong reference, as part of this ticket? Has this been considered, and if so, are there any drawbacks?

eholk commented 1 year ago

I wonder whether it would be desirable to also add a fallible method which converts the other way, from an Rc<T> back into UniqueRc<T> provided there is only one strong reference, as part of this ticket? Has this been considered, and if so, are there any drawbacks?

Good question. I think it'd help to see a concrete use case for that feature. I'm curious how often in practice you can have enough confidence that an Rc is uniquely owned so that it would make sense to convert it into a UniqueRc. Maybe for some kind of manual deinitialization process? Or maybe there are programs that have a structure that is shared for a phase of the program (e.g. spawn a bunch of threads to process it in parallel) and then goes back to being a single owner (e.g. the threads all shut down after their work is finished).

camsteffen commented 1 year ago

Name idea: RootRc

As I understand it, this struct is the "root" owner of a ref-counted thing, off of which you can "branch off" Weak Rc's.

tgross35 commented 1 year ago

At some point you may want to add support for the allocator API (after #89132 lands in a few hours)

MatrixDev commented 1 year ago

I got here from https://github.com/rust-lang/libs-team/issues/90 so maybe I'm missing something.

I see one major downside here (at least from my usecases). Most of the time when I need cyclic dependencies - it's when they should be em... cyclic.

I assume that idea for using UniqueRc for circular dependencies is following:

mod child {
    pub struct Child {
        parent: Weak<super::Parent>,
    }

    impl Child {
        pub fn new(parent: Weak<super::Parent>) -> Option<UniqueRc<Self>> {
            Some(Self {
                parent,
            })
        }
    }
}

struct Parent {
    child: Rc<child::Child>,
}

impl Parent {
    fn new() -> Option<Rc<Self>> {
        let child = child::Child::new(Weak::new())?;
        Rc::new_cyclic(|weak| {
            child.parent = weak.clone(); // ERROR: parent cannot be updated, field is private
            Self {
                child: UniqueRc::into_rc(child),
            }
        })
    }
}

As you can see it doesn't help us in such case. And exposing parent field is not always desirable. Another problem here is that it is almost impossible to update parent field if it is propagted to the children of the child and so on. Also by this time they are most probably were already converted from UniqueRc to Rc.

Maybe idea is to use it other way around:

fn new() -> Option<Rc<Self>> {
    let rc = UnqueRc::new(Self { /* no way to initialize without client instance */ });

    rc.child = child::Child::new(UniqueRc::downgrade(&rc))?;

    Some(UniqueRc::into_rc(rc))
}

But as you can see we can't even create Parent this way... We could wrap child in Option, but it will have huge impact on the usability of this field. So basically I don't see how it should help with circular dependencies in non-trivial cases.

I like the idea of UniqueRc and it has its place but IMHO not as a solution for circular dependencies.

I'd much rather suggest adding something like MaybeRc<T> (naming similar to MaybeUninit) that can be later materialized into Rc<T>. This would solve all problems with current cyclic dependencies that at very least I see in my usecases.

fn new() -> Result<Rc<Self>, Error> {
    let rc = MaybeRc::<Self>::new();
    let weak = rc.downgrade();

    let child = Child::new(weak)?;

    Ok(rc.materialize(Self {
        child,
    }))
}

SkiFire13 commented 1 year ago

I'd much rather suggest adding something like MaybeRc<T> (naming similar to MaybeUninit)

FYI MaybeUninit is named like this because it could be uninitialized, but not necessarily. Your proposed MaybeRc<T> instead is always uninitialized because the moment you initialize it you have to consume it, getting back a Rc. So it has the Uninit part of MaybeUninit, but not the Maybe. UninitRc/EmptyRc seem a more fitting name to me.

MatrixDev commented 1 year ago

UninitRc/EmptyRc seem a more fitting name to me.

I agree that MaybeRc is not the best name but don't like EmptyRc either. Maybe LateRc or LazyRc?

finnbear commented 1 year ago

I wonder whether it would be desirable to also add a fallible method which converts the other way, from an Rc<T> back into UniqueRc<T> provided there is only one strong reference, as part of this ticket? Has this been considered, and if so, are there any drawbacks?

Good question. I think it'd help to see a concrete use case for that feature. I'm curious how often in practice you can have enough confidence that an Rc is uniquely owned so that it would make sense to convert it into a UniqueRc. Maybe for some kind of manual deinitialization process? Or maybe there are programs that have a structure that is shared for a phase of the program (e.g. spawn a bunch of threads to process it in parallel) and then goes back to being a single owner (e.g. the threads all shut down after their work is finished).

I have a Yew web application that shares state via an Rc to avoid cloning large structs. I currently use a hacky workaround which was later pointed out to be unsound. Converting an Rc to a UniqueRc is exactly what I need, as the structure of my program guarantees that the strong ref count will be 1 when I need to do mutations.

RalfJung commented 1 year ago

(Mirroring my comment from the closed ACP)

Comparing what is implemented here with the original proposal, there seems to be one downside: with UniqueRc, one has to first set up the Gadget in an invalid state, and then close the loop. Doesn't that share the downsides of what was described in the "Multi-phase Construction" proposal in the ACP?

LHolten commented 10 months ago

I would like to ask: what is the advantage of UniqueRc over UninitRc (or EmptyRc)?

As mentioned, UniqueRc is not as useful as it could be, because it needs a placeholder value during initialisation of recursive datastructures. This means for example that UniqueRc can not always replace usage of Rc::new_cyclic.

UninitRc takes its value only when constructing the Rc and thus does not need a placeholder value. Therefore UninitRc can be used to implement Rc::new_cyclic and can replace its usage when needed.

Proof that UninitRc is at least as general as UniqueRc by using it to implement UniqueRc:

struct UniqueRc<T> {
    value: T,
    empty: UninitRc<T>
}

impl<T> UniqueRc<T> {
    pub fn new(value: T) -> UniqueRc<T> {
        Self { value, empty: UninitRc::new() }
    }

    pub fn downgrade(this: &UniqueRc<T>) -> Weak<T> {
        UninitRc::downgrade(&this.empty)
    }

    pub fn into_rc(this: UniqueRc<T>) -> Rc<T> {
        UninitRc::init(this.empty, this.value)
    }
}

Proof that UninitRc can be used to implement Rc::new_cyclic.

impl<T> Rc<T> {
    pub fn new_cyclic<F>(data_fn: F) -> Rc<T>
        where
            F: FnOnce(&Weak<T>) -> T {

        let uninit = UninitRc::new();
        let value = (data_fn)(&UninitRc::downgrade(&uninit));
        UninitRc::init(uninit, value)
    }
}

frank-king commented 8 months ago

For UninitRc<T>, how about using UniqueRc<MaybeUninit<T>> instead?

We can add a similar method write like Box::<MaybeUninit<T>>::write which fits the function of Uninit::<T>::init:

impl<T> UniqueRc<MaybeUninit<T>> {
    pub fn write(rc: UniqueRc<MaybeUninit<T>>, value: T) -> UniqueRc<T> { /* ... */ }
}

LHolten commented 8 months ago

For UninitRc<T>, how about using UniqueRc<MaybeUninit<T>> instead?

We can add a similar method write like Box::<MaybeUninit<T>>::write which fits the function of Uninit::<T>::init:
impl<T> UniqueRc<MaybeUninit<T>> {
    pub fn write(rc: UniqueRc<MaybeUninit<T>>, value: T) -> UniqueRc<T> { /* ... */ }
}

While this is a good start it does not remove the need for UninitRc<T>. This is because UninitRc<T>::init is only half the API of UninitRc<T>, where the other half is UninitRc<T>::downgrade. UninitRc<T>::downgrade has type fn(&UninitRc<T>) -> Weak<T> and is only sound because the UninitRc<T> has to be initialized before its weak reference can be upgraded. This is why implementing fn(&UniqueRc<MaybeUninit<T>>) -> Weak<T> would be unsound.

Here is what I think is a sound implementation of UninitRc<T> on top of UniqueRc<MaybeUninit<T>>. It uses unsafe twice and inner is private to make the public API sound. UniqueRc<MaybeUninit<T>>::write would remove one of the two unsafe blocks.


pub struct UninitRc<T> {
    inner: UniqueRc<MaybeUninit<T>>,
}

impl<T> UninitRc<T> {
    pub fn new() -> Self {
        UninitRc {
            inner: UniqueRc::new(MaybeUninit::uninit()),
        }
    }

    pub fn downgrade(&self) -> Weak<T> {
        let weak = UniqueRc::downgrade(&self.inner);
        // SAFETY: the weak reference can only be upgraded after initialization
        unsafe { transmute(weak) }
    }

    pub fn init(mut self, val: T) -> Rc<T> {
        self.inner.write(val);
        let rc = UniqueRc::into_rc(self.inner);
        // SAFETY: the rc has been initialized
        unsafe { transmute(rc) }
    }
}

SveOls commented 5 months ago

I have used the above implementation of UninitRc in my private projects, and it’s very useful.

However, as someone on Reddit pointed out to me, transmuting Rc and Weak is unsound. There’s another method for Rc<MaybeUninit> that does the same thing soundly in the feature new_uninit. There’s no equivalent for Weak, so that has to be implemented.

https://doc.rust-lang.org/std/rc/struct.Rc.html#method.assume_init

#![feature(new_uninit)]
pub fn init(mut self, val: T) -> Rc<T> {
    self.inner.write(val);
    let rc = UniqueRc::into_rc(self.inner);
    // SAFETY: the rc has been initialized
    unsafe { rc.assume_init() }
}

LHolten commented 5 months ago

@SveOls could you please explain why transmuting Rc and Weak is unsound in this case or link me to some resources?

zachs18 commented 5 months ago

(This is somewhat off-topic, but) @LHolten Not sure if this is what SveOls meant, but generally unless explicit guarantees about layout/representation are listed in the documentation, it is not considered sound to transmute between different type-generic instantiations of a type. Box has such a guarantee (i.e. it is guaranteed to be represented as *mut T (at least for T: Sized)), but Rc/Weak/etc do not (currently) an explicit guarantee about their representation.

One way to soundly "transmute" Rc<T> to Rc<U> is using Rc::into_raw and Rc::from_raw.

Rc::from_raw doc excerpt

> The raw pointer must have been previously returned by a call to [Rc::into_raw](https://doc.rust-lang.org/std/rc/struct.Rc.html#method.into_raw) where U must have the same size and alignment as T. This is trivially true if U is T. Note that if U is not T but has the same size and alignment, this is basically like transmuting references of different types. See [mem::transmute](https://doc.rust-lang.org/std/mem/fn.transmute.html) for more information on what restrictions apply in this case. (The `Weak::from_raw` docs currently do not say this, but I think that's a "docs bug").

RalfJung commented 5 months ago

@LHolten in other words -- by default, all transmutes are unsound; it is up to you as user of transmute to find documentation which says it is sound.

SveOls commented 4 months ago

Yep, you're right, @zachs18, that's what I meant. I'll make sure to be more explicit in the future.

On another note, my interpretation of the drop implementation of Weak says that it never drops T. When the strong count reaches 0, T is dropped, but the memory remains, and when the weak count reaches 0, the memory is freed. Meaning that casting Weak<MaybeUninit> to Weak should be okay, as it won't call drop on uninitialized memory. That, combined with a more sound way to cast Rc<MaybeUninit> and Weak<MaybeUninit> to Rc and Weak, should cover the issues with soundness.

Rua commented 2 months ago

I noticed that this has an implicit Sized bound. Does it need to?

zachs18 commented 2 months ago

I noticed that this has an implicit Sized bound. Does it need to?

Nope, looks like it was recently relaxed in https://github.com/rust-lang/rust/pull/126285

tommie commented 1 month ago

Newbie to Rust. How does this compare to a Box?

IIUC, the point is that the wrapper metadata is larger with Rc, and thus a simple Box::into_rc wouldn't be possible. A Box::new_for_rc could do that, but then it might as well be a new data structure called UniqueRc... (Alternatively, Box version 2 could have a parameterized header, allowing implementing UniqueRc on top of it.)

I.e. the only use for this is as a precursor for an Rc, and in all other cases of unique ownership, a Box is the right choice.

Am I understanding the lay of the land correctly?

The docs currently state

A common use case is to have an object be mutable during its initialization phase but then have it become immutable and converted to a normal Rc.

and there is no mention of Box. I suggest making it more decisive about the intended/only use-case to avoid confusion for those who just happen to stumble upon it while wondering which data structure to wrap their objects in to make the code compile. :)

SkiFire13 commented 1 month ago

A Box::new_for_rc could do that

Box::new_for_rc would not create a Box<T> usable for a Rc<T>, at best it would create a Box<WithRcHeader<T>> usable for that, but:

it also requires a new type, so we might as well make a UniqueRc type;
it exposes some implementation details of Rc, which ideally should be avoided if possible.

Alternatively, Box version 2 could have a parameterized header

If you want a Box with header you can just use Box<WithHeader<T>>, I don't see why you would need a new version of Box. The header type will still need to be a parameter of the Box, so might as well put it were it should be, with the content of the Box. Also, I don't think Box "version 2" will ever be a thing.

Ultimately the usecase for this is "I want something that can be converted to a Rc with a noop but initially I want unique ownership of it". Box isn't mentioned because the usecase has nothing to do with Box.

tommie commented 1 month ago

Ultimately the usecase for this is "I want something that can be converted to a Rc with a noop but initially I want unique ownership of it". Box isn't mentioned because the usecase has nothing to do with Box.

The problem to me is that the docs assume you already know this, and it doesn't rule out UniqueRc being a contender for "something that has uniquely known ownership." In C++, this is called std::unique_ptr, so coming from there, it's not obvious that UniqueRc is the wrong tool in general.

I think it would be nice to guide readers who need simple ownership to Box, the same way Cow links to Rc::make_mut.

And replace the non-excluding use-case explanation in the docs:

A common use case is to have an object be mutable during its initialization phase but then have it become immutable and converted to a normal Rc.

with the very concrete, specific use-case, statement you just gave here:

I want something that can be converted to a Rc with a noop but initially I want unique ownership of it

dtolnay commented 1 month ago

UniqueRc<T> could be supplanted most likely not by Box<WithRcHeader<T>>, but instead Box<T, RcAllocator<A>>. I have talked about this previously in https://github.com/rust-lang/rust/issues/126799#issuecomment-2183766297.