Byron / gitoxide

An idiomatic, lean, fast & safe pure Rust implementation of Git
Apache License 2.0
8.84k stars 301 forks source link

Improve path join test assertion messages and fix minor test bugs #1528

Closed EliahKagan closed 1 month ago

EliahKagan commented 1 month ago

This fixes inaccurate assertion messages in the path-joining tests, improves some messages that were okay but capable of being clarified, and renames a couple of variables to avoid making it seem like relative paths are absolute. Most of the changes are to the Windows tests. This relates to https://github.com/Byron/gitoxide/pull/1374#pullrequestreview-2244385346.

It also corrects a few tests where the same thing was asserted multiple times with a different message rather than asserting the subtly different claim that the message showed was intended, or where it tested paths with a different number of backslashes than the context and message show were meant.

Finally, though this is only a refactoring, it changes all string literals with backslash-containing paths to be raw string literals. I think this makes it easier and also faster to understand what paths are being tested, and generally improves readability a bit.

(Sometimes it makes sense to write all Windows paths as raw string literals, even those without backslashes, but I didn't do that, because in this case I think it would produce an effect that, one way or another, would be confusing or at least feel strange, when comparing corresponding tests in the Windows and non-Windows build configurations.)

Several assertion messages had contained expressions of surprise at the effects. When rewording made this clear, I removed such wording. Elsewhere I kept it. In one place, I even added it: I do not know why concatenating \ and \\localhost on Windows produces \localhost, and it seems like the intent had even been to assert that it produces \\localhost (because it characterized the result as being UNC). But it does produce \localhost for some reason. Hopefully the message there can be improved further, but unless more is known about that now, I think my characerization as "strangely, single-backslashed host" may still be an improvement.

Some more details are available in commit messages and, in what may be an easier form to work with, in review comments I've posted below on particular changes.

EliahKagan commented 1 month ago

Thanks a lot for digging into this again and for the clarifications.

I had not mentioned why I did not also add a clarifying comment for this (which was present before, not introduced here, but which I had considered clarifying here):

https://github.com/Byron/gitoxide/blob/c3f173c94968907ecc685c5383f2fe24c02dcfbf/gix-fs/tests/stack/mod.rs#L1

I had originally planned to comment that this was a workaround for https://github.com/rust-lang/rust-clippy/issues/12244. This would also add another source of information about how paths that start with / or \ on Windows are not really absolute, except when they start with \\. Therefore I didn't further clarify that in any other ways, except through small adjustments to the custom assertion messages.

I didn't include this because the clippy::join_absolute_paths suppression does not have an affect on that, though it might once have, in an earlier draft. So it would be wrong, or at least misleading, to say it is a workaround for that clippy issue.

In the tests, the leading-/ and leading-\ paths on Windows are placed in variables:

https://github.com/Byron/gitoxide/blob/c3f173c94968907ecc685c5383f2fe24c02dcfbf/gix-fs/tests/stack/mod.rs#L38

https://github.com/Byron/gitoxide/blob/c3f173c94968907ecc685c5383f2fe24c02dcfbf/gix-fs/tests/stack/mod.rs#L43

These variables were originally named absolute and bs_absolute, which was misleading since the paths are not absolute on Windows, and I wonder if the reason for putting them in variables had been that clippy diagnostic. But with the new variable names looks_absolute and bs_looks_absolute, I think it is clearer even than the use of literal strings for them.

I think that diagnostic is only for literals. Commenting out the suppression on Windows does not seem to give any new cargo clippy warnings. So at least with clippy 0.1.80, this seems to have an affect only on non-Windows systems, since in the non-Windows tests, some such literals are used (for comparison to the Windows tests).

Because the tests are supposed to speak for themselves, I am reluctant to add a big commented explanation about these issues to the file. But if I can think of something that would improve clarity without making the code more cumbersome to read or work with, then I'll open another PR.

Indeed, these tests only exists to 'test' the std::path::Path implementation and get a feeling what happens in varying scenarios. Hopefully having written these down gives enough confidence that joining two relative paths won't accidentally create an absolute path, which could certainly be exploited.

Some relative paths would of course also be exploitable, such as if permitted paths could produce a relative path with a .. component in it, or a more dotted component like ... in it in a scenario where such a component performs upward traversal, or a relative path like C: or C:foo that specifies a drive, or a relative path like / or /foo or \ or \foo that is from the root of a drive, or a path that contains a symlink component that is used in a way that traverses symlinks, or almost any path coupled with the ability to create new hard links to existing files or (on Windows) other non-symlink reparse points such as junctions. I think we cover all that though, except the latter which we need not cover in paths because we never provide that ability.

For paths like .. as a tree with that name, #1374 covers that explicitly. For paths like ... I think we do not ever use them in a way that treats that specially on any operating system; they may also be directly blocked in some situations. For paths like C: and C:foo, I believe we are disallowing them since #1374 in the same way we disallow NTFS streams (alternate data streams, NTFS directory streams, etc.), which also use a : notation. For paths like / and /foo represented in the expected way as trees, we were already blocking that before CVE-2024-35186 was fixed in #1374. For paths like /, /foo, as well as \ and \foo on Windows, represented as single filenames, as well as x/../y and all its variations represented as single filenames, this was another aspect of CVE-2024-35186 that #1374 targeted explicitly.

With all that said, on Windows, you can join two relative paths to produce an absolute path, because C:, /, and \ are all relative paths on Windows, but C:/ and C:\ are absolute paths. Consider this program:

use std::path::Path;

fn main() {
    let drive = Path::new("C:");
    let slash = Path::new("/");
    let backslash = Path::new(r"\");
    let drive_slash = drive.join(slash);
    let drive_backslash = drive.join(backslash);

    for path in [drive, slash, backslash, drive_slash.as_path(), drive_backslash.as_path()] {
        let path_str = path.to_str().expect("valid Unicode");
        println!("{:3}  relative? {:5}  absolute? {}", path_str, path.is_relative(), path.is_absolute());
    }
}

On Windows, that outputs:

C:   relative? true   absolute? false
/    relative? true   absolute? false
\    relative? true   absolute? false
C:/  relative? false  absolute? true 
C:\  relative? false  absolute? true

Conceptually (and informally), what is happening is that C: is relative because only the drive it specifies is absolute, and / and \ are relative on Windows because only the drive-relative location they specify is absolute. So C:/ (or C:\) is absolute because the relative piece in each of C: and / (or \) has been resolved using corresponding absolute information from the other.

But because we are not allowing paths like C:, and not allowing paths like / or \, I think we are still okay.

Byron commented 1 month ago

Thanks for sharing! Paths are…fascinating :).

Because the tests are supposed to speak for themselves, I am reluctant to add a big commented explanation about these issues to the file. But if I can think of something that would improve clarity without making the code more cumbersome to read or work with, then I'll open another PR.

Since these tests are merely for documenting current behaviour, I think making it more obvious what they are testing (or what one should or could learn) is in their interest. Please feel free to open a PR with additional remarks, it's valuable to write it down while it's in active memory.