rust-lang / libs-team

The home of the library team
Apache License 2.0
115 stars 18 forks source link

Add `shallow_clone()` method to Cow which always reborrows data #283

Closed getreu closed 10 months ago

getreu commented 10 months ago

Proposal

Problem statement

When working with nom and tpnote it is crucial to avoid allocation as much as possible. I find myself frequently with a &Cow<str> I tend to clone() because I need a Cow. The clone() method copies by default it's owned contents, although it is not always necessary. This proposal suggests a shallow clone, that never allocates memory.

Motivating examples or use cases

use crate::tpnote_lib::clone_ext::CloneExt;
use std::borrow::Cow;
fn do_something_or_nothing(v: Cow<str>) -> Cow<str> {
     if v.len() > 3 {
         let s = "Hello ".to_string() + &*v;
         Cow::Owned(s)
     } else {
        v
     }
 }
 // Sometimes, we only have a `&Cow`, but we need a `Cow`!
 let a: &Cow<str> = &Cow::Owned("world!".to_string());
 let b: Cow<str>  = a.shallow_clone();
 assert_eq!(do_something_or_nothing(b), "Hello world!");

 let a: &Cow<str> = &Cow::Owned("ld!".to_string());
 let b: Cow<str>  = a.shallow_clone();
 assert_eq!(do_something_or_nothing(b), "ld!");

In the examples above you can replace .shallow_clone() with .clone() and get the same result with the difference, that .clone() might allocate memory whereas .shallow_clone() never does.

Solution sketch

use std::borrow::Cow;

pub(crate) trait CloneExt<'b> {

impl<'b> CloneExt<'b> for Cow<'b, str> {
    fn shallow_clone(&'b self) -> Cow<'b, str> {
        match *self {
            Self::Borrowed(b) => Self::Borrowed(b),
            Self::Owned(ref o) => Self::Borrowed(o.as_ref()),
        }
    }
}

Todo: generics

Alternatives

Modify the clone() method, that the compiler decides, if a copy or reborrow is necessary. This depends on the lifetime situation of the original and the clone. The downside of this alternative is, that the user has no means to avoid explicitly memory allocation. The proposed solution is sometimes restrictive because the original must outlive the clone. In my use cases, the restriction can be bypassed easily.

Links and related work

A more general solution has been proposed here but was discarded because of this argument.

thomcc commented 10 months ago

Why would it be a extension trait and not just a new method on Cow?

cuviper commented 10 months ago

Also discussed here: https://internals.rust-lang.org/t/add-a-shallow-clone-method-to-cow-reopen-33777/19728

getreu commented 10 months ago

@thomcc I suggest a new method on Cow. In doubt, a developer can then just try out shallow_clone() first and opt for clone() in case the lifetime rules do not allow the former.

cuviper commented 10 months ago

[bike-shedding]: I think reborrow() or something else with "borrow" or "ref" in the name would be more appropriate, since this does not create an independent value.

Or as pointed out in the forum, you can already write that as Cow::Borrowed(&**self).

getreu commented 10 months ago

since this does not create an independent value

This is also true for clone() in case of borrowed content.

The reason why I suggest ...clone() in the name, is because of the use case: I have a &Cow, but I need a Cow, what do I do? I look out for something like clone().

cuviper commented 10 months ago

since this does not create an independent value

This is also true for clone() in case of borrowed content.

No - fn clone(&self) -> Self has no borrowed relationship. That is, even though Self = Cow<'b, str> has some lifetime involved, it is not connected to the &self that you are cloning from.

scottmcm commented 10 months ago

I have a &Cow, but I need a Cow, what do I do? I look out for something like clone().

That parallel I'd draw here is to "I have a &Option but I need an Option", where I don't necessarily want clone since I might actually be looking for as_ref.

.borrow().into() will often work for this today, I think?

getreu commented 10 months ago

@scottmcm

I have a &Option but I need an Option

No, this is not what as_ref() does, so you can not compare the semantics:

let a: Option<String> = Some("world!".to_string());
let b: Option<&String> = a.as_ref();

whereas

 let a: &Cow<str> = &Cow::Owned("world!".to_string());
 let b: Cow<str>  = a.shallow_clone();

..._clone() really expresses: I have a &Cow and I get a Cow

getreu commented 10 months ago

BTW: in all examples above you can replace .shallow_clone() with .clone() and get the same result with the difference, that .clone() might allocate memory whereas .shallow_clone() never does.

ryanavella commented 10 months ago

.borrow().into() will often work for this today, I think?

For concrete types like Cow<str> a type annotation is required on the intermediate value, but .deref().into() should often work for this.

jmillikin commented 10 months ago

Cow<T>::into() is only implemented for certain types of T -- this code won't compile:

// error[E0277]: the trait bound `Cow<'_, T>: From<&T>` is not satisfied
fn reborrow<'a, T: 'a + ToOwned + ?Sized>(v: &'a Cow<'a, T>) -> Cow<'a, T> {
    v.as_ref().into()
}

What does work is an explicit wrapping with Cow::Borrowed:

fn reborrow<'a, T: 'a + ToOwned + ?Sized>(v: &'a Cow<'a, T>) -> Cow<'a, T> {
    Cow::Borrowed(v.as_ref())
}

which might be expressed as a method to get the same properties as the original ACP, without the risk of naming confusion with an actual clone:

impl<'a, T> Cow<'a, T> {
    fn as_borrowed(&'a self) -> Cow<'a, T> {
        Self::Borrowed(self.as_ref())
    }
}
 let a: &Cow<str> = &Cow::Owned("world!".to_string());
 let b: Cow<str>  = a.as_borrowed();
 assert_eq!(do_something_or_nothing(b), "Hello world!");
the8472 commented 10 months ago

We discussed this during last week's libs-api meeting. There wasn't a strong consensus but the general impression was that this is somewhat niche, can be achieved through existing methods and that the name isn't great either. An inherent as_ref would be better as a name but that's likely to cause conflicts with existing AsRef uses.

Our proposal is to update the documentation to point users to Cow::Borrowed(cow.deref()) or similar patterns when they need shallow clones.

dtolnay commented 10 months ago

Today Cow has inherent methods to get B::Owned and &mut B::Owned, but no inherent method to get &B. There is .as_ref() (you deal with inference ambiguities to infer the AsRef type parameter) and .deref() (you need an import). An inherent method to easily get &B would be nice and if a caller wants to wrap that in Cow::Borrowed(...), it’s easy.

Sadly .as_ref() is probably not viable as an inherent method on Cow due to breakage of existing downstream impl AsRef<MyType> for Cow<T> impls, but I am open to giving this a try in crater if someone can put up a PR.

Is there an alternative method name that might work? With this signature:

impl<'a, B> Cow<'a, B>
where
    B: 'a + ToOwned + ?Sized,
{
    pub fn ???(&self) -> &B;
}

Then shallow clone is expressible as Cow::Borrowed(cow.???()) which is adequate, I think.

Per https://github.com/rust-lang/libs-team/issues/283#issuecomment-1787089357 the standard library team is not interested in adding shallow_clone so I'll close. Thank you!