rust-lang / rust

Empowering everyone to build reliable and efficient software.
https://www.rust-lang.org
Other
95.1k stars 12.27k forks source link

Tracking issue for Path Prefix Remapping #41555

Closed michaelwoerister closed 6 years ago

michaelwoerister commented 7 years ago

This feature was originally requested in https://github.com/rust-lang/rust/issues/38322. The PR that implements this in its unstable form is https://github.com/rust-lang/rust/pull/41508. This will eventually become stable if the testing phase goes well.

brson commented 7 years ago

What's the next step to stabilize this? Looks like it probably won't be stable for 1.20?

michaelwoerister commented 7 years ago

Good question. I did not heard any complaints about the way things currently work -- which I interpret as everybody being happy :).

Some of the people initially interested in the feature were @jmesmon, @infinity0, @sanxiyn, and @jsgf. Do you have any feedback?

Other than that, two open topics are:

I personally would be fine with either of the first two options.

cc @rust-lang/dev-tools

infinity0 commented 7 years ago

I haven't tested this yet but I'm still interested. When I get a chance, I'll test it with 1.19 in Debian and see if we get a reproducible build.

oli-obk commented 7 years ago

We use it in clippy for making our tests' stderr output OS independent. Works like a charm: https://github.com/Manishearth/rust-clippy/blob/master/tests/examples.rs#L29

infinity0 commented 7 years ago

Is the duplication of the path with / and \ really necessary? I thought the stripping algorithm worked on whole-path components, so that trailing directory separators shouldn't be necessary.

oli-obk commented 7 years ago

Maybe it changed. But I had to implement it this way, otherwise it wouldn't work

michaelwoerister commented 7 years ago

I thought the stripping algorithm worked on whole-path components, so that trailing directory separators shouldn't be necessary.

The current implementation works on plain strings. It does not try to interpret them as paths.

infinity0 commented 7 years ago

Oh right. I was going by the code at the bottom of this older comment, I didn't notice you had gone back to plain string matching. What was the reason? I know GCC also do plain string matching, but I actually prefer path-level matching because it would be easier to use. This cross-platform duplication issue is a further example in that area.

jsgf commented 7 years ago

@brson, @michaelwoerister I just hacked it up in buck and it all looks good. Once it's stable (at least, we have decided on a stable command line option), I can do a real diff.

Either -Cremap-path-prefix or --remap-path-prefix is fine by me.

jsgf commented 7 years ago

@michaelwoerister Do you think it's possible to stabilize this in the 1.21 timeframe?

michaelwoerister commented 7 years ago

@infinity0 I'm divided on whether path matching should be string- or component-based. Component-based is a bit more ergonomic in the cross-platform case. I'm not sure though if it can be somewhat less predictable in corner cases (e.g. strange Windows path prefixes).

@jsgf That should be doable.

I'm nominating this for the next @rust-lang/dev-tools meeting to discuss string matching and which form the CLI should take -- and then we can let the appropriate subteams vote on it.

@rust-lang/core: Which teams need to sign off on stabilizing this? Quick recap: This feature allows to remap file paths (in output messages and output artifacts) to be remapped. This is necessary for reproducible builds and helps with custom debugging setups.

alexcrichton commented 7 years ago

@michaelwoerister @rust-lang/dev-tools sounds good to me as a team for signing off!

codyps commented 7 years ago

On matching mechanism: an alternate to paths & strings is to use regexes. Strings works fine for me though.

michaelwoerister commented 6 years ago

This has been discussed in the @rust-lang/dev-tools meeting and the conclusion was that we want to stabilize in the following form:

@rfcbot fcp merge

rfcbot commented 6 years ago

Team member @michaelwoerister has proposed to merge this. The next step is review by the rest of the tagged teams:

No concerns currently listed.

Once these reviewers reach consensus, this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

sophiajt commented 6 years ago

@rfcbot reviewed

infinity0 commented 6 years ago

We change parameter passing to the GCC way: the option takes a single string argument of the form from=to. This means that the new prefix (the to part) cannot contain an equals sign, but otherwise this has the advantage of being shorter than the current unstable form and is close to prior art.

So in other words you would split on the final =, right? I prefer that to the current GCC behaviour, which splits on the initial = so that the from part cannot contain an equals sign.

michaelwoerister commented 6 years ago

@infinity0 Yes, at least that's how I imagined it. The "to" part is more likely under the user's control where they can avoid = characters.

killercup commented 6 years ago

@rfcbot reviewed

jsgf commented 6 years ago

Yay!

this has the advantage of being shorter than the current unstable form

I don't see why that's a useful advantage, given that these options will almost certainly be generated by some build system.

The split from/to argument form did always seem pretty awkward to me, but preferable to either adding arbitrary restrictions on what characters can be in paths, or introducing a quoting system.

That said, I think splitting on last = will work fine for now, though I'll double-check that = isn't a problem for us. If it ends up being a problem in the future, then we may end up with an option like --remap-path-prefix-separator to allow = to be overridden.

(Edit: no problem - we always map to to "")

japaric commented 6 years ago

@rfcbot reviewed

rfcbot commented 6 years ago

:bell: This is now entering its final comment period, as per the review above. :bell:

johnthagen commented 6 years ago

I was hoping this would help solve the privacy issues brought up in #40552/#40374/https://github.com/rust-lang/rfcs/issues/1417, but after trying it out, am I correct that this isn't designed to solve that problem?

I checked out a sample rust project, https://github.com/johnthagen/rust-belt, and tried to manually change the prefix on the strings, but that resulted in a compile failure.

$ rustc +nightly --version
rustc 1.22.0-nightly (368122087 2017-09-06)

$ RUSTFLAGS="-Zremap-path-prefix-from=/Users -Zremap-path-prefix-to=/Hello" cargo +nightly build
   Compiling piston-texture v0.5.0
   Compiling byteorder v1.0.0
   Compiling either v1.1.0
   Compiling core-foundation-sys v0.4.1
   Compiling odds v0.2.25
   Compiling lzw v0.10.0
   Compiling current v0.1.2
   Compiling unicode-xid v0.0.4
error[E0583]: file not found for module `tables`
  --> /Hello/hagen/.cargo/registry/src/github.com-1ecc6299db9ec823/unicode-xid-0.0.4/src/lib.rs:57:5
   |
57 | mod tables;
   |     ^^^^^^
   |
   = help: name the file either tables.rs or tables/mod.rs inside the directory "/Hello/hagen/.cargo/registry/src/github.com-1ecc6299db9ec823/unicode-xid-0.0.4/src"

error: aborting due to previous error

Would anyone be able to share the intended use case/setup/example for this feature is? Is it more packaging-related than privacy?

Doc link for feature

jsgf commented 6 years ago

Hm, that may be a bug - remapping should only affect paths as they're output by the compiler (either in messages, debug info, or in filename macros). But on the other hand, it could genuinely have been missing, and the compiler is showing us the remapped path, not the path it actually tried to use. Without seeing the actual path it passed to the syscall, its a bit hard to tell.

After having a closer look, I'm wondering if tables is being generated by a build script, and the path remapping is confusing where the generated source is actually put, or where it should be read from.

@michaelwoerister is probably the best person to diagnose this one way or the other.

michaelwoerister commented 6 years ago

Yes, this feature is supposed to help with the privacy issues mentioned.

As @jsgf says, this is either a bug in the implementation or you're running into a cumbersome corner case here. But it looks more like a bug to me.

michaelwoerister commented 6 years ago

Yes, it's a bug. The parser uses the remapped path when it tries to locate the tables.rs file on disk. I'll try to find some time to fix this next week. However, if someone wants to give fixing this a try, or implementing the stable version of the CLI, I'd be happy to mentor.

johnthagen commented 6 years ago

Here is another simplified example, this time on Windows.

Environment variable set: RUSTFLAGS=-Zremap-path-prefix-from=C:\ -Zremap-path-prefix-to=Z:\

>rustc +nightly --version
rustc 1.22.0-nightly (d93036a04 2017-09-07)
# Cargo.toml
[package]
name = "rust-test"
version = "0.1.0"

[dependencies]
reqwest = "0.7.3"
// main.rs
extern crate reqwest;

fn main() {
    let x = reqwest::get("https://google.com");
}
C:/Users/User/.cargo/bin/cargo.exe +nightly run --release
   Compiling safemem v0.2.0
   Compiling cfg-if v0.1.2
   Compiling matches v0.1.6
   Compiling libc v0.2.30
error[E0583]: file not found for module `macros`
  --> Z:\Users\User\.cargo\registry\src\github.com-1ecc6299db9ec823\libc-0.2.30\src\lib.rs:97:18
   |
97 | #[macro_use] mod macros;
   |                  ^^^^^^
   |
   = help: name the file either macros.rs or macros\mod.rs inside the directory "Z:\\Users\\User\\.cargo\\registry\\src\\github.com-1ecc6299db9ec823\\libc-0.2.30\\src"

error: aborting due to previous error

   Compiling language-tags v0.2.2
error: Could not compile `libc`.
warning: build failed, waiting for other jobs to finish...
error: build failed

Process finished with exit code 101
johnthagen commented 6 years ago

And here is another test case, using the same simple reqwest project as in the previous comment, but building on macOS.

$ RUSTFLAGS="-Zremap-path-prefix-from=/Users -Zremap-path-prefix-to=/Hello" cargo +nightly build
   Compiling dtoa v0.4.2
   Compiling adler32 v1.0.2
   Compiling scoped-tls v0.1.0
   Compiling semver v0.1.20
   Compiling matches v0.1.6
   Compiling futures v0.1.15
   Compiling httparse v1.2.3
   Compiling num-traits v0.1.40
error[E0583]: file not found for module `diyfp`
  --> /Hello/hagen/.cargo/registry/src/github.com-1ecc6299db9ec823/dtoa-0.4.2/src/lib.rs:11:18
   |
11 | #[macro_use] mod diyfp;
   |                  ^^^^^
error[E0583]: file not found for module `version`
   --> /Hello/hagen/.cargo/registry/src/github.com-1ecc6299db9ec823/semver-0.1.20/src/lib.rs:173:5
    |
173 | mod version;
    |     ^^^^^^^
   |
    |
   = help: name the file either diyfp.rs or diyfp/mod.rs inside the directory "/Hello/hagen/.cargo/registry/src/github.com-1ecc6299db9ec823/dtoa-0.4.2/src"
    = help: name the file either version.rs or version/mod.rs inside the directory "/Hello/hagen/.cargo/registry/src/github.com-1ecc6299db9ec823/semver-0.1.20/src"

error[E0583]: file not found for module `identities`
  --> /Hello/hagen/.cargo/registry/src/github.com-1ecc6299db9ec823/num-traits-0.1.40/src/lib.rs:32:9
   |
32 | pub mod identities;
   |         ^^^^^^^^^^
error: aborting due to previous error
   |

error: aborting due to previous error

   = help: name the file either identities.rs or identities/mod.rs inside the directory "/Hello/hagen/.cargo/registry/src/github.com-1ecc6299db9ec823/num-traits-0.1.40/src"

error: aborting due to previous error

error[E0583]: file not found for module `poll`
   --> /Hello/hagen/.cargo/registry/src/github.com-1ecc6299db9ec823/futures-0.1.15/src/lib.rs:173:5
    |
173 | mod poll;
    |     ^^^^
    |
    = help: name the file either poll.rs or poll/mod.rs inside the directory "/Hello/hagen/.cargo/registry/src/github.com-1ecc6299db9ec823/futures-0.1.15/src"

error: aborting due to previous error

error: Could not compile `dtoa`.
warning: build failed, waiting for other jobs to finish...
error: Could not compile `semver`.
warning: build failed, waiting for other jobs to finish...
error: Could not compile `num-traits`.
warning: build failed, waiting for other jobs to finish...
error: Could not compile `futures`.
warning: build failed, waiting for other jobs to finish...
error[E0583]: file not found for module `iter`
  --> /Hello/hagen/.cargo/registry/src/github.com-1ecc6299db9ec823/httparse-1.2.3/src/lib.rs:20:5
   |
20 | mod iter;
   |     ^^^^
   |
   = help: name the file either iter.rs or iter/mod.rs inside the directory "/Hello/hagen/.cargo/registry/src/github.com-1ecc6299db9ec823/httparse-1.2.3/src"

error: aborting due to previous error

error: Could not compile `httparse`.
warning: build failed, waiting for other jobs to finish...
error: build failed
rfcbot commented 6 years ago

The final comment period is now complete.

jsgf commented 6 years ago

Does this need release notes? Is this where that gets tagged?

johnthagen commented 6 years ago

I can confirm that #44940 fixed the compile issues I was seeing when using this feature and a simple reqwest application on macOS and Windows.

Thoughts on privacy

I think this feature possesses the mechanics to more broadly address privacy issues for closed-source code. This may be out of scope for the current issue, but I wanted to bring it up now since this feature hasn't been fully implemented yet.

The current reality for closed source code is that any time a panic!() is used, the module name, line number, and column number will be leaked into the binary to support unwinding. This is true even if panic = 'abort' is specified. The root issue is that panic!() calls file!(). This could leak private, perhaps proprietary or sensitive, implementation details that the company may not want to be included in released binaries.

Example

Let's say a company creates a video game with some hidden content. If the source for that module included a panic!() (and note, even things like unreachable!() call panic!()), the module name could be leaked as a string:

src/hidden_dlc/extra_boss_fight/victors_betrayal.rs

Using this feature, the company now has a manual way to suppress this at compile time (cool!):

RUSTFLAGS=-Zremap-path-prefix-from=src/hidden_dlc/extra_boss_fight/victors_betrayal.rs -Zremap-path-prefix-to=

But note that this is very manual (need a -from/-to for each module), and you'd need to make sure you didn't forget modules.

Straw Man Proposal

If rustc let you suppress all uses of file!() at compile time, you could allow such companies to keep these sensitive implementation details out of the binaries. This also helps save on binary size if these strings are not needed.

RUSTFLAGS=-Zremap-path-prefix-from=* -Zremap-path-prefix-to=
luser commented 6 years ago

What's the current status of this? It looks like it's been approved for stabilization, but the plan is to change the option to a single --remap-path-prefix. Does that change just need to be made, and then this can ship on stable?

jsgf commented 6 years ago

I've been using the -Zremap-path-prefix- form in our build environment (I failed to guess when they'd stabilize) for a few months now, and it's been working well.

michaelwoerister commented 6 years ago

Yes, it's just waiting to be implemented. I've just been busy with other things. I'm glad to take PRs though.

jsgf commented 6 years ago

OK, I'm looking at it now.

kpcyrd commented 6 years ago

As a minor note, it would be interesting to have both $HOME and $PWD remapped to standard values for release builds. This is currently the only thing that I have to adjust manually for reproducible builds (besides #47135).

jsgf commented 6 years ago

They would only affect the build if you're explicitly referencing them with env!("HOME"), right? You could just do --remap-path-prefix=$PWD=./.

kpcyrd commented 6 years ago

@jsgf the binary includes my $CARGO_HOME into the binary by default, if I don't have that variable set it defaults to $HOME/.cargo. :)

I think my project root and my explicit or implicit cargo home should be remapped.

jsgf commented 6 years ago

@kpcyrd Well I guess that's an issue to be addressed in cargo.

kpcyrd commented 6 years ago

@jsgf you're right, I've opened an issue for that: https://github.com/rust-lang/cargo/issues/5505

XAMPPRocky commented 6 years ago

Closing as this was stabilised in 1.26.0 and there doesn't seem to be any leftover features to be implemented.