rust-lang / cargo

The Rust package manager
https://doc.rust-lang.org/cargo
Apache License 2.0
12.6k stars 2.39k forks source link

Access to compiler artifact notifications messages #13672

Open pacak opened 5 months ago

pacak commented 5 months ago

Problem

You can pass --json=artifacts to rustc and it will emit json messages like {"$message_type":"artifact","artifact":"/path/to/target/release/deps/libsample-fb5d74408737397e.rlib","emit":"link"} to stderr.

cargo then parses them and mostly ignores: https://github.com/rust-lang/cargo/blob/a59aba136aab5510c16b0750a36cbd9916f91796/src/cargo/core/compiler/mod.rs#L1803-L1810

later emitting its own version: https://github.com/rust-lang/cargo/blob/a59aba136aab5510c16b0750a36cbd9916f91796/src/cargo/core/compiler/mod.rs#L563

I'm extending --json=artifacts in rustc to include files produced by --emit (https://github.com/rust-lang/rust/pull/122597) to avoid doing fragile guesswork in cargo-show-asm, but right now this information won't be accessible when rustc is invoked via cargo.

Proposed Solution

cargo should collect this information when present and output it as part of its own notifications, including cases where rustc wasn't actually invoked if source files staid the same.

Notes

The alternative is for me to invoke rustc directly - fragile or to do nothing - currently I have to pass codegen-units=1 - potentially confusing behavior.

If approved - I can look into implementing required change myself.

pacak commented 5 months ago

FWIW https://github.com/rust-lang/rust/pull/122597 is implemented and approved, awaiting on FCP to how exactly to get it merged.

epage commented 5 months ago

cargo show-asm is calling cargo rustc, passing --emit to rustc to render this information to users. They want to get the json messages for these emitted files so they can look up directly, rather than inferring.

For me, the main question is "why was the filtering done in the first place?"

The code in question was changed in

pacak commented 4 months ago

FCP for my change was approved :tada:

Any ideas how to best implement it in cargo?

epage commented 3 months ago

We talked about this in today's cargo team meeting.

The biggest challenge we have with exposing these json messages is that they expose implementation details of cargo

We came up with three options but no real consensus on which route to go

  1. Add cargo rustc --emit=asm flag that passes the right flags to rustc, copies the artifacts into --out-dir, and then emits rewritten json messages with the new path
  2. Assume cargo rustc is "low level" already and pass along all messages and just say caveat emptor
  3. Link against cargo and make the relevant changes so you can do this yourself
pacak commented 3 months ago

The directory structure of target/ is mostly unspecified. I'm assuming people generally work around this today by mucking with the directory which already causes problems. The distinction would be that we would be formalizing and approving of it.

The way I see working with target/ directory is that you, as a developer want to access some info from there that can be produced by rustc or by cargo. https://github.com/rust-lang/rust/pull/122597 was merged, so starting from 1.90 if you ask rustc to produce anything with --emit - it will tell where that file is, along with the type. Cargo is doing something similar. Instead of formalizing target/ structure we can formalize this query mechanism and structure can remain unspecified.

  1. Add cargo rustc --emit=asm flag that passes the right flags to rustc, copies the artifacts into --out-dir, and then emits rewritten json messages with the new path.

It's asm, but also several other types - at least llvm-ir, mir and obj, so realistically this means parsing it the same way, deciding what to copy. I'm not sure why is the copying needed though. At least for asm/llvm/obj file will remain in the same place if there's no changes to it and will be copied from incremental cache if needed.

  1. Assume cargo rustc is "low level" already and pass along all messages and just say caveat emptor

Possibly, this means cargo-metadata must be updated to handle all the messages.

  1. Link against cargo and make the relevant changes so you can do this yourself

I tried doing that. This works, but adds unnecessary coupling between cargo-show-asm and cargo so figuring out which version of rustc changed the output might require several version of cargo-show-asm as well. Plus linking cargo caused some ssh related linking issues on MacOS for users so I'd rather not go this route.

I think out of those options 2 is the easiest way forward if we wrap every rustc json message into { rustc: { /* original json message goes here */ }}.

epage commented 3 months ago

Oh, and one terrible workaround I forgot to mention is to explore using a rustc wrapper.

epage commented 3 months ago

Cargo is doing something similar. Instead of formalizing target/ structure we can formalize this query mechanism and structure can remain unspecified.

We'd be pointing to people to files in internal locations. We'd want to be directing people to files in official, public locations.

I'm not sure why is the copying needed though

To get it out of the private location

pacak commented 3 months ago

Oh, and one terrible workaround I forgot to mention is to explore using a rustc wrapper.

That's horrifying.

We'd want to be directing people to files in official, public locations. To get it out of the private location

Is it documented somewhere?

For anything that depends on a compilation unit - asm, object or llvm files, each file gets a suffix that depends on codegen unit somehow, making public locations for those means either documenting this suffix or inventing a new mapping structure.

bjorn3 commented 3 months ago

You may want to get artifacts for non-local dependencies of a crate. For those there is no official, public location to copy the artifacts too.

pacak commented 3 months ago

You may want to get artifacts for non-local dependencies of a crate. For those there is no official, public location to copy the artifacts too.

I guess that makes it harder to go route (1) but okay with (2).

epage commented 3 months ago

For future reference, we discussed this more in Office Hour last week and another potential route is to see if rustc is open to expanding their CLI for overriding where to emit to from taking a file path to also taking a directory root.