rust-lang / libs-team

The home of the library team
Apache License 2.0
127 stars 19 forks source link

ACP: Method for joining multiple paths in one go #209

Closed GoldsteinE closed 1 year ago

GoldsteinE commented 1 year ago

Proposal

Problem statement

There’s a popular pattern for working with Rust paths: doing path.join(first).join(second) to create a PathBuf from multiple components. Unfortunately, that could result in extra unneeded allocations (which may or may not be optimized out).

More optimal version looks like this:

let mut path_buf = path.to_path_buf();
path_buf.push(first);
path_buf.push(second);

which is not pretty.

Motivation, use-cases

Extremely common. Spotted in rust-lang/rust: https://github.com/rust-lang/rust/blob/edcbb295c9827ce38cbef4093e2c3d184923f362/src/bootstrap/doc.rs#L205 https://github.com/rust-lang/rust/blob/edcbb295c9827ce38cbef4093e2c3d184923f362/src/bootstrap/dist.rs#L2251

Deno: https://github.com/denoland/deno/blob/cb2ca234bb39d8e02b08d2866860e8d3a00b5887/core/extensions.rs#L647

Starship: https://github.com/starship/starship/blob/ce7f984932a97b4ad3cd6e6ece8e1c3b6022ba99/src/context.rs#L418

It’s easy to find more examples: https://github.com/search?q=language%3Arust%20%2F%5C.join%5C(.*%3F%5C)%5C.join%2F&type=code

I’m not sure that ever happens on a hot path though.

Notably, there is a one-line way to do it, but it’s hard to discover, so one possible alternative would be to document it better.

[path, first, second].into_iter().collect()

It also doesn’t handle the case where base path and other parts have different types well, which seems to be quite common.

Solution sketches

impl Path {
    // name bikesheddable
    pub fn join_multi<P: AsRef<Path>>(&self, paths: impl IntoIterator<Item = P>) -> PathBuf;
}

impl PathBuf {
    // name extremely bikesheddable
    pub fn with_multi<P: AsRef<Path>>(self, paths: impl IntoIterator<Item = P>) -> Self;    
}

Used like

path.join_multi([first, second])

Another alternative design could be a builder-like interface to PathBuf:

impl PathBuf {
    fn with(self, path: impl AsRef<Path>) -> Self;
}

used like

path.to_path_buf().with(first).with(second)

or even

path.join(first).with(second)

It could be a bit more wordy, but has a benefit of not requiring first and second to be the same type.

Links and related work

Zig has a join() function for concatenating a list of paths: https://ziglang.org/documentation/0.6.0/std/#std;fs.path.join

I don’t think C++ has such a method.

This issue is much less relevant to GCed languages, so I don’t think it’s very useful to look at their APIs, but e.g. Python has os.path.join() / PurePath.joinpath() and Go has path.Join().

What happens now?

This issue is part of the libs-api team API change proposal process. Once this issue is filed the libs-api team will review open proposals in its weekly meeting. You should receive feedback within a week or two.

GoldsteinE commented 1 year ago

On a further reflection, I like a second design (builder-like) more than the first, because it has more obvious naming and actually could be shorter in the common case.

shepmaster commented 1 year ago

When I'm caring about this bit of performance, I usually use a macro:

macro_rules! build_from_paths {
    ($base:expr, $($segment:expr),+) => {{
        let mut base: ::std::path::PathBuf = $base.into();
        $(
            base.push($segment);
        )*
        base
    }}
}
build_from_paths!("a", "b", "c");
build_from_paths!(PathBuf::from("z"), OsStr::new("x"), Path::new("y"));
ChrisDenton commented 1 year ago
impl PathBuf {
    // name extremely bikesheddable
    pub fn with_multi<P: AsRef<Path>>(self, paths: impl IntoIterator<Item = P>) -> Self;    
}

Is this not extend? E.g.:

let mut p = PathBuf::from("/base");
p.extend(["a", "b", "c"]);
GoldsteinE commented 1 year ago

Still requires creating a separate mutable variable, but yeah, .extend() works quite well, actually.

ChrisDenton commented 1 year ago

Well, from_iter isn't bad for that case.

let p = PathBuf::from_iter(["/base", "a", "b", "c"]);

I do think we at least need better trait documentation here.

shepmaster commented 1 year ago

The annoyance with the slice / iterator route is that the types need to be homogenous. In my cases, I usually have a &Path or PathBuf as my starting point and then string literals mixed with variables. For example:

let config_dir: PathBuf = todo!("provided by command line option");
let database_name: String = todo!("provided by HTTP request");
let metadata: &str = "metadata.json";

let final_path = mash_up_all_three_somehow();

In that specific case, you could get a &str from the String, but that doesn't generalize to cases where I have a PathBuf / &Path / OsString / &OsStr from something else (iterating the directories on disk, perhaps).

ChrisDenton commented 1 year ago

Yeah, without a macro the best you could do is probably from_iter but with an explicit item type.

fn build_path<'a, I: IntoIterator<Item = &'a OsStr>>(iter: I) -> PathBuf {
    PathBuf::from_iter(iter)
}
let p = build_path([config_dir.as_ref(), database_name.as_ref(), metadata.as_ref()]);
ChrisDenton commented 1 year ago

So, to bring this back to the proposal, I can see the benefit of:

impl PathBuf {
    fn with(self, path: impl AsRef<Path>) -> Self;
}
path.to_path_buf().with(first).with(second)

Even if it's a bit wordy.

shepmaster commented 1 year ago

What I really wanted in my last example was to use +, akin to what you can do with String:

let final_path =  config_dir + database_name + metadata;
// maybe in other cases
let final_path =  config_dir.clone() + database_name + metadata;

I assume that this has been discussed before?

I think this has the benefit of helping people fall into the "pit of success", as I bet other people want to use + for this as well.

ChrisDenton commented 1 year ago

The closest I could find was discussion on implementing Add for OsString, see #2020, which was rejected. There was some scepticism about using + with String even.

GoldsteinE commented 1 year ago

Even if it's a bit wordy.

path.join(first).with(second)

has exactly the same length as two joins.

What I really wanted in my last example was to use +, akin to what you can do with String

Most languages that have “join paths” operator use / for it, although it would be kinda weird to implement Div on a Path. Operators are also kinda weird wrt ownership (which is why there’s no &str + &str, for example).

pitaj commented 1 year ago

What about impl From<(T...)> for PathBuf:

let p = PathBuf::from((first, second));
cuviper commented 1 year ago

We don't have variadic types, so you'd have to expand that tuple manually. (Which we do in some places...)

pitaj commented 1 year ago

Yeah that's what I meant to imply, I think going up to 12 like many other places would be fine.

joshtriplett commented 1 year ago

We talked about this in a libs meetup today. It sounds like there are several existing ways to do this: push, extend, from_iter, etc. There are also external crates like path_macro to make it even briefer. Given that, we'd rather not add this method.

Side note, if we ever make a new Path API, we should make push return &mut Self to allow chaining it.

tgross35 commented 1 year ago

I recently came across the need to join paths in a platform-compatible way, and have come to dislike the required intermediate binding - it's quite clunky.

Would a trait-based change to the existing join function be considered, rather than a separate new method? I did a quick demo here https://github.com/rust-lang/rust/pull/112496

pub fn join<P: JoinPath>(&self, path: P) -> PathBuf;

let base = Path::new("foo");
let bat = foo.join("bar");        // current use
let baz = foo.join(("bar", "baz"));

(wish implementing for both T and IntoIterator<Item=T> was possible without specialization...)