chipsenkbeil / typed-path

Provides typed variants of Path and PathBuf for Unix and Windows
38 stars 6 forks source link

Converting User-Created Paths is a Hazard #22

Closed nathaniel-daniel closed 7 months ago

nathaniel-daniel commented 7 months ago

This is related to #19, but this focuses more on usage within the library itself instead of from the outside. See https://github.com/rust-lang/rust/issues/59726 for more info.

Essentially, pushing a non-relative path to a path can cause the path to be replaced. This can be hazardous when a user controls the input path, like a webserver.

Here's a simple example:

use typed_path::{Utf8Path, Utf8UnixEncoding, Utf8WindowsEncoding};

fn main() {
    let windows_path = Utf8Path::<Utf8UnixEncoding>::new("/tmp/d:/foo");
    dbg!(windows_path);
    let unix_path = windows_path.with_encoding::<Utf8WindowsEncoding>();
    dbg!(unix_path);
}

Outputs:

[src\main.rs:5] windows_path = Utf8Path {
    _encoding: "unix",
    inner: "/tmp/d:/foo",
}
[src\main.rs:7] unix_path = Utf8PathBuf {
    _encoding: "windows",
    inner: "d:foo",
}

Converting between path formats should not use join to build paths, or at least return some kind of error instead of overwriting the prior path.

chipsenkbeil commented 7 months ago

Hm, it sounds like the discussion in the issue for Rust was suggesting some sort of join_safely method. Is the issue that push can both overwrite the path when given an absolute path or contain a prefix like a drive letter? If you can help me list out the cases, that will make implementing a fix faster.

My first thought is to support a strict series of methods to push paths, join paths, and instantiate paths. Each of these can fail and thereby return a result. Thoughts?

use typed_path::{Utf8Path, Utf8UnixEncoding, Utf8WindowsEncoding};

fn main() {
    // If the path itself contains something invalid ??
    let windows_path = Utf8Path::<Utf8UnixEncoding>::new_strict("/tmp/d:/foo").unwrap();
    dbg!(windows_path);

    // If the conversion would lead to pushing absolute paths, this would fail
    let unix_path = windows_path.with_strict_encoding::<Utf8WindowsEncoding>().unwrap();
    dbg!(unix_path);

    // Imagine there would also be these methods
    // unix_path.push_strict("...").unwrap();
    // unix_path.join_strict("...").unwrap();
}
nathaniel-daniel commented 7 months ago

Hm, it sounds like the discussion in the issue for Rust was suggesting some sort of join_safely method. Is the issue that push can both overwrite the path when given an absolute path or contain a prefix like a drive letter? If you can help me list out the cases, that will make implementing a fix faster.

19 looks like its covering the more general case of this issue. Yes, the issue is with an absolute or prefix path overwriting the entire path. In many contexts it is a security hazard, like pushing url components to a path in a static file server or creating an output path in a file archive extractor. See https://github.com/tower-rs/tower-http/pull/204 for a good example of the kinds of issues it can cause.

However, I opened this issue to cover, specifically, the conversions between Unix and Windows path types with the with_encoding function. This is because the with_encoding method uses join internally and suffers from the same issues as join. Even if a fix like #20 lands, I don't think this will fix the with_encoding method.

For my use case, I am interested in converting an untrusted Unix-style path into a Windows-style path. I want a with_encoding like function that returns an error if the conversion could not be performed while maintaining the path semantics.

My first thought is to support a strict series of methods to push paths, join paths, and instantiate paths. Each of these can fail and thereby return a result. Thoughts?

I think that is a good solution for the general case. I think the wrapper approach in #20 is overkill. I think this will also fix my use case as well.

// If the path itself contains something invalid ?? let windows_path = Utf8Path::::new_strict("/tmp/d:/foo").unwrap();

The issue is a slightly deeper than that. ":" is valid within file names on Unix, but not on Windows; the above path is a valid Unix path. On Windows, ":" is usually used for drive prefixes. This means converting this path will completely change its meaning, with no way to properly round-trip the path. As a side note, since I haven't tested, how does this crate handle the other banned Windows file name chars?

I'm not sure if its a good idea to keep with_encoding though, since it seems like a pretty big footgun to pretend that you can round-trip paths, and allows users to easily create a path that overwrites its prior components which can be a hazard if the path source is untrusted. Then again, it also already silently strips prefixes during conversions.

chipsenkbeil commented 7 months ago

I think that is a good solution for the general case. I think the wrapper approach in https://github.com/chipsenkbeil/typed-path/pull/20 is overkill. I think this will also fix my use case as well.

I'll take a stab at implementing this soon. Possibly today or this weekend if I have time. Will share the PR and let you take a look to see if it addresses what you'd expect.

I'm not sure if its a good idea to keep with_encoding though, since it seems like a pretty big footgun to pretend that you can round-trip paths, and allows users to easily create a path that overwrites its prior components which can be a hazard if the path source is untrusted. Then again, it also already silently strips prefixes during conversions.

For my personal use case, I want a conversion that does not require unwrapping or handling an error, but I can see why that may not be desired by many. As part of the above, I can offer a with_strict_encoding that will try to convert and fail if it produces an invalid path, if I'm able to detect it.

Alternatively, given this library is still being developed and not at a 1.0 release, I could rename the pre-existing encoding conversions functions to something like with_unchecked_encoding to make it more clear that those functions do nothing more than blindly convert between path formats.

chipsenkbeil commented 7 months ago

@nathaniel-daniel can you check if version 0.8.0 solves your needs for encoding change protection? Leave a comment here confirming or reopen this task and we can figure out next steps.

nathaniel-daniel commented 7 months ago

Thanks, that works from my tests. I'll open a new issue if I find any more problems.