rust-lang / cargo

The Rust package manager
https://doc.rust-lang.org/cargo
Apache License 2.0
12.7k stars 2.41k forks source link

Proc macros can cause non-json output in stdout despite message-format=json #8179

Open roblabla opened 4 years ago

roblabla commented 4 years ago

Problem

When proc macros print to stdout, the output is reflected in the cargo output. For instance, the following proc macro will cause Hello to be printed in the compiler output, even when message-format=json is passed, even though it's not a valid JSON object. This will cause various tools (such as cargo-metadata, rust-analyzer, jq...) to fail to parse the message, and give up.

extern crate proc_macro;
use proc_macro::TokenStream;

#[proc_macro_attribute]
pub fn hello(_attr: TokenStream, item: TokenStream) -> TokenStream {
    println!("Hello");
    item
}

I'm not sure if this is considered a cargo or a rustc bug ^^'.

Steps

  1. Install jq
  2. git clone https://github.com/roblabla/cargo-proc-macro-print-bug
  3. cargo build --message-format=json | jq .

This will cause jq to error with "parse error: Invalid numeric literal at line 3, column 0", which is caused by the string "Hello", printed by the proc macro, to appear in stdout.

Possible Solution(s) I imagine the proc macro output could be:

Notes

cargo 1.42.0 (86334295e 2020-01-31) rustc 1.42.0 (b8cedc004 2020-03-09)

ehuss commented 4 years ago

Tools should only interpret a line as JSON if it starts with {. I thought that was documented, but it is not (as you know, we are working on the docs on the cargo_metadata side in https://github.com/oli-obk/cargo_metadata/pull/106).

Otherwise, there is nothing that can be done for now. Proc-macros run unrestricted within the rustc process, and can do whatever they want. There are experiments like watt for running proc-macros in an isolated environment, but I think that is very very far off, and may not be flexible enough in some cases.

roblabla commented 4 years ago

Tools should only interpret a line as JSON if it starts with {. I thought that was documented, but it is not.

In fact, documentation explicitly says otherwise:

When passing --message-format=json, Cargo will output the following information during the build:

  • compiler errors and warnings,
  • produced artifacts,
  • results of the build scripts (for example, native dependencies).

The output goes to stdout in the JSON object per line format. The reason field distinguishes different kinds of messages.

The way I understand this is that stdout will only contain JSON. It'd definitely be worth it to make the phrasing a lot less ambiguous here.


Wouldn't it be possible to redirect stdout to stderr when running proc-macros, and back when exiting it? I know it's possible on unix with dup2, and on windows with some mix of SetStdHandle and _open_osfhandle. It's not a great solution, but it'd be a pretty good stopgap :shrug:.

dnagir commented 9 months ago

I was very surprised by this behaviour and I read the documentation exactly as @roblabla suggested - the output will only contain JSON.

But, as I understand, this doesn't just apply to a macros. See a very simple repro of a brand new project printing non-JSON output:

$ cargo new msgfmt && cd msgfmt
$ cargo test --message-format=json-diagnostic-short 2> /dev/null
{"reason":"compiler-artifact","package_id":"msgfmt 0.1.0 (path+file:///playground/msgfmt)","manifest_path":"/playground/msgfmt/Cargo.toml","target":{"kind":["bin"],"crate_types":["bin"],"name":"msgfmt","src_path":"/playground/msgfmt/src/main.rs","edition":"2021","doc":true,"doctest":false,"test":true},"profile":{"opt_level":"0","debuginfo":2,"debug_assertions":true,"overflow_checks":true,"test":true},"features":[],"filenames":["/playground/msgfmt/target/debug/deps/msgfmt-c30df441d5f4cc37"],"executable":"/playground/msgfmt/target/debug/deps/msgfmt-c30df441d5f4cc37","fresh":true}
{"reason":"build-finished","success":true}

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s

Not sure if there is an option to stop printing non-JSON but this makes it quite hard to integrate with scripts. Reliable output is very important for any tooling to work correctly.

This makes it hard to even use basic jq for processing the data as the test failures may appear in between the valid JSON entries.

ben0x539 commented 9 months ago

Cargo could capture output of subprocesses and manually strip non-json lines, yeah? I understand the cited docs only say "When passing --message-format=json, Cargo will output [...] in the JSON object per line format" and it doesn't say "Cargo may not output non-JSON things, or, through inaction, allow non-JSON things to be output", but even weakly advertising JSON output capabilities will induce people to rely on an imagined only-JSON-ever promise when integrating cargo in scripts.