rust-lang / libs-team


ACP: Add {Box, Rc, Arc}::map and {Box, Rc, Arc}::try_map #364

Open orlp opened 3 months ago

orlp commented 3 months ago

Proposal

Problem statement

Suppose you have a value contained in a Box, Rc, or Arc. It is quite natural to want to map this value to a different value while keeping the result in a Box, Rc or Arc. Currently the easiest and most natural way of doing this always involves an allocation, whether it is necessary or not:

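use std::sync::Arc;

let old = Arc::new(42);
// unwrap_or_clone is an associated function, so it cannot be called as a method.
let new = Arc::new(Arc::unwrap_or_clone(old) * 2);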

Motivating examples or use cases

Whenever both your input and output are a Box, Rc or Arc, you are likely to run into this problem. A personal example of mine comes from a domain-specific language represented using Arcs. A simplified example is a recursive expression replacement of -x with 0 - x:

use std::sync::Arc;

#[derive(Clone)]
pub enum Expr {
    Add(Arc<Expr>, Arc<Expr>),
    Sub(Arc<Expr>, Arc<Expr>),
    Neg(Arc<Expr>),
    Number(i64),
}

fn repl_neg_with_sub(expr: Arc<Expr>) -> Arc<Expr> {
    use Expr::*;
    let ret = match Arc::unwrap_or_clone(expr) {
        Add(l, r) => Add(repl_neg_with_sub(l), repl_neg_with_sub(r)),
        Sub(l, r) => Sub(repl_neg_with_sub(l), repl_neg_with_sub(r)),
        Neg(x) => Sub(Arc::new(Number(0)), repl_neg_with_sub(x)),
        Number(x) => Number(x)
    };
    Arc::new(ret) // An unnecessary allocation most of the time.
}

Here we do an allocation for each node in the expression tree, when in a lot of cases the allocation could have been re-used. If I had used Boxes instead of Arcs every single allocation could have been re-used, only adding an allocation for the constant Number(0) added to the expression tree whenever Neg is substituted.

Solution sketch

I propose we add the following functions to Box, Rc and Arc:

type ChangeOutputType<T, V> = <<T as Try>::Residual as Residual<V>>::TryType;

impl<T, A> Arc<T, A>
where
    T: Clone,
    A: Allocator + Clone,
{
    fn map<U, F>(
        mut this: Self,
        f: F,
    ) -> Arc<U, A>
    where
        F: FnOnce(T) -> U,
    {
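        // Inefficient example implementation showing expected behavior.
        // Keep the allocator so the result is an Arc<U, A> rather than an Arc<U, Global>.
        let alloc = Arc::allocator(&this).clone();
        let old = Arc::unwrap_or_clone(this);
        Arc::new_in(f(old), alloc)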
    }

    fn try_map<F, R>(
        mut this: Self,
        f: F,
    ) -> ChangeOutputType<R, Arc<R::Output, A>>
    where
        F: FnOnce(T) -> R,
        R: Try,
        R::Residual: Residual<Arc<R::Output, A>>,
    {
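        // Inefficient example implementation showing expected behavior.
        // Keep the allocator so the result is an Arc<R::Output, A>.
        let alloc = Arc::allocator(&this).clone();
        let old = Arc::unwrap_or_clone(this);
        let new = Arc::new_in(f(old)?, alloc);
        ChangeOutputType::<R, Arc<R::Output, A>>::from_output(new)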
    }
}

The definitions for Rc are identical; the definitions for Box would be the same except without the T: Clone bound.
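For the common case R = Result<U, E>, the return type ChangeOutputType<R, Arc<U>> works out to Result<Arc<U>, E>. Since the Try and Residual traits are unstable, here is a minimal stable-Rust sketch of just that special case. It is written as a free function whose name, try_map_result, is hypothetical and not part of the proposal, and it allocates every time, just like the example implementation:

```rust
use std::sync::Arc;

/// Hypothetical free-function stand-in for `Arc::try_map`, specialized to `Result`.
/// Like the example implementation, it always allocates a new `Arc`.
pub fn try_map_result<T, U, E, F>(this: Arc<T>, f: F) -> Result<Arc<U>, E>
where
    T: Clone,
    F: FnOnce(T) -> Result<U, E>,
{
    // Takes the value out if `this` is the only strong reference, clones it otherwise.
    let old = Arc::unwrap_or_clone(this);
    // An `Err` from `f` is returned as-is; no new `Arc` is created in that case.
    Ok(Arc::new(f(old)?))
}
```

Calling it with a closure such as |s: String| s.parse::<i32>() would yield Ok(Arc<i32>) on success and the parse error otherwise.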

While the example implementations shown above work, they still cause needless allocations. In the standard library, however, we can use unsafe code internally and re-use the allocation (after {Rc, Arc}::make_mut has been called) with a transmute whenever T and U (or T and R::Output for try_map) have the same size and alignment.
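To make the allocation re-use concrete, here is a rough sketch for the Box case only, written as a free function outside the standard library. The name box_map is hypothetical, and the sketch uses raw pointer reads and writes rather than a literal transmute. It is an illustration under those assumptions, not the proposed implementation. If f panics, this version leaks the allocation instead of freeing it; a real implementation would probably add a drop guard for that case.

```rust
use std::alloc::Layout;
use std::ptr;

/// Hypothetical free-function stand-in for `Box::map`.
pub fn box_map<T, U, F>(b: Box<T>, f: F) -> Box<U>
where
    F: FnOnce(T) -> U,
{
    if Layout::new::<T>() != Layout::new::<U>() {
        // Different size or alignment: fall back to a fresh allocation.
        return Box::new(f(*b));
    }
    let raw: *mut T = Box::into_raw(b);
    // SAFETY: `raw` came from `Box::into_raw`, so it is valid for reads and
    // writes of `T`. After `ptr::read` the slot is treated as uninitialized
    // and is never read or dropped as a `T` again.
    let old = unsafe { ptr::read(raw) };
    // If `f` panics here, the allocation is leaked but nothing is dropped twice.
    let new = f(old);
    let raw_u = raw.cast::<U>();
    // SAFETY: `T` and `U` have the same layout, so the slot is valid for a
    // `U`, and `Box<U>` will free it with the layout it was allocated with.
    unsafe {
        ptr::write(raw_u, new);
        Box::from_raw(raw_u)
    }
}
```

Mapping a Box<u32> to a Box<f32> takes the re-use path because both types share a layout, while mapping a Box<u8> to a Box<u64> falls back to Box::new.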

With Arc::map the above example simplifies to:

fn repl_neg_with_sub(expr: Arc<Expr>) -> Arc<Expr> {
    use Expr::*;
    // map takes `this: Self` rather than `self`, so it is called as an associated function.
    Arc::map(expr, |inner| match inner {
        Add(l, r) => Add(repl_neg_with_sub(l), repl_neg_with_sub(r)),
        Sub(l, r) => Sub(repl_neg_with_sub(l), repl_neg_with_sub(r)),
        Neg(x) => Sub(Arc::new(Number(0)), repl_neg_with_sub(x)),
        Number(x) => Number(x)
    })
}

Alternatives

  1. You can simply keep using extra allocations through Arc::new and such.
  2. You can write the unsafe code yourself (don't mess up the case when f panics!).
  3. You can use the replace_with crate, if your T implements Default and you don't mind a tiny bit of overhead from a Default value being temporarily placed inside the container.
  4. You can use the replace_with crate, if you don't mind an otherwise recoverable panic being turned into an abort.
  5. You can use the map_box crate if you only care about Box and don't need the Try version.

While the last option (map_box) could be extended to cover the other cases, I feel such basic, core functionality should be in the standard library instead of a third-party crate.
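For comparison, option 3 can be approximated in safe code without the replace_with crate, but only when the input and output types are the same. The sketch below is one way to do that with Arc::make_mut and std::mem::take; the helper name arc_map_same is hypothetical, and the T: Default bound is exactly the restriction that option 3 mentions.

```rust
use std::mem;
use std::sync::Arc;

/// Hypothetical helper: maps `T -> T` inside an `Arc`, keeping the
/// allocation when `this` is the only reference to it.
pub fn arc_map_same<T, F>(mut this: Arc<T>, f: F) -> Arc<T>
where
    T: Clone + Default,
    F: FnOnce(T) -> T,
{
    // Clones into a fresh allocation only if `this` is shared.
    let slot = Arc::make_mut(&mut this);
    // Leave a `Default` placeholder behind so a panic in `f` stays recoverable.
    let old = mem::take(slot);
    *slot = f(old);
    this
}
```

This cannot change the contained type, which is why it covers only part of what the proposed map would offer.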

Links and related work

https://docs.rs/replace_with/latest/replace_with
https://docs.rs/map_box/
https://rust-lang.zulipchat.com/#narrow/stream/219381-t-libs/topic/.7BBox.2C.20Rc.2C.20Arc.7D.3A.3Amap.20and.20.7BBox.2C.20Rc.2C.20Arc.7D.3A.3Atry_map

What happens now?

This issue contains an API change proposal (or ACP) and is part of the libs-api team feature lifecycle. Once this issue is filed, the libs-api team will review open proposals as capability becomes available. Current response times do not have a clear estimate, but may be up to several months.

Possible responses

The libs team may respond in various different ways. First, the team will consider the problem (this doesn't require any concrete solution or alternatives to have been proposed):

Second, if there's a concrete solution:

kennytm commented 3 months ago

I think Box::map is good, but I'm not sure about Arc::map, which requires a Clone bound on the inner type.

An alternative is making Arc::map take a FnOnce(&T) -> U rather than FnOnce(T) -> U.
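A minimal sketch of what that by-reference variant could look like, written as a free function on stable Rust. The name arc_map_ref is hypothetical, and this simple version always allocates a new Arc:

```rust
use std::sync::Arc;

/// Hypothetical free-function version of the by-reference variant.
pub fn arc_map_ref<T, U, F>(this: Arc<T>, f: F) -> Arc<U>
where
    F: FnOnce(&T) -> U,
{
    // No `T: Clone` bound is needed because `f` only borrows the value.
    let new = f(&*this);
    // This simple version always allocates; `this` is dropped on return.
    Arc::new(new)
}
```

The trade-off is that f can no longer take ownership of the inner value, so it would have to clone any parts of it that need to be moved into the result.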