Tracking issue for libtest JSON output #49359

Open Gilnaa opened 6 years ago

Gilnaa commented 6 years ago

Added in https://github.com/rust-lang/rust/pull/46450. Available in nightly behind a -Z flag.
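For anyone landing here, on a recent nightly the invocation and output look roughly like this (abridged; full samples appear later in this thread):

```
$ cargo test -- -Z unstable-options --format json
{ "type": "suite", "event": "started", "test_count": 1 }
{ "type": "test", "event": "started", "name": "tests::test_true" }
{ "type": "test", "name": "tests::test_true", "event": "ok" }
{ "type": "suite", "event": "ok", "passed": 1, "failed": 0, "allowed_fail": 0, "ignored": 0, "measured": 0, "filtered_out": 0 }
```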

johnterickson commented 5 years ago

@nrc You mentioned "The long-term solution is going to be deeply intertwined with pluggable test runners" in #46450. Does that mean that JSON won't be stabilized until there are pluggable test runners?

I'm trying to figure out if it makes sense for me to continue looking at https://github.com/rust-lang/rust/issues/51924 right now, or if I should hold off for the time being. For example, I'm looking at adding test durations to the JSON (https://github.com/rust-lang/rust/compare/master...johnterickson:testjson) and converting the output to JUnit (https://github.com/johnterickson/cargo2junit/).

djrenren commented 5 years ago

Yeah, I think we should stabilize something like this independently of custom test frameworks. It makes sense that the included test framework would have programmatically readable output. Libtest's conceptual model is pretty simple and most likely won't undergo any drastic changes, so I'm all for stabilizing something. I'd prefer to have a JSON Schema that describes the structure before stabilization. We'd also need to audit it to ensure it's resilient to minor changes or additions over time.
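To make that concrete, here's a sketch of what a schema entry for the test events might cover (illustrative only, inferred from the current output, not an agreed schema):

```json
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "title": "libtest 'test' event (sketch)",
  "type": "object",
  "required": ["type", "event", "name"],
  "properties": {
    "type": { "const": "test" },
    "event": { "enum": ["started", "ok", "failed", "ignored"] },
    "name": { "type": "string" },
    "stdout": { "type": "string" }
  }
}
```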

@alexcrichton thoughts?

alexcrichton commented 5 years ago

Seems like a plausible addition to me! So long as it's designed carefully I think it'd be good to go.

epage commented 5 years ago

From the experience of creating naive serde structs / enums for libtest's output, here are my thoughts from translating it exactly as it is written in libtest (rather than inventing my own schema that happens to be compatible):

My PR, where you can see the data structures I created: https://github.com/crate-ci/escargot/pull/24/files

Ideas to consider

andoriyu commented 5 years ago

I think it's worth adding things like the package and the type of test. Type of test: unit test, doc test, integration test.

The package is something like: the module for a unit test, the resource path for a doc test (crate::foo::Bar), and the filename for an integration test.
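For illustration, with hypothetical test_type and package fields (names made up here), events might look like this:

```
{ "type": "test", "event": "ok", "name": "tests::it_works", "test_type": "unit", "package": "mycrate::tests" }
{ "type": "test", "event": "ok", "name": "src/lib.rs - Bar (line 46)", "test_type": "doc", "package": "crate::foo::Bar" }
```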

epage commented 5 years ago

Since libtest tracks test times (at minimum, reporting if a test is taking too long), it would be nice if that got included in the programmatic output so we can report it to the user, e.g. in JUnit. Alternatively, we'd have to timestamp the events as we read them.
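A consumer-side sketch of that fallback, assuming the events are flushed as they happen: pipe the output in on stdin and timestamp each line on arrival (uses serde_json; error handling kept minimal):

```rust
use std::collections::HashMap;
use std::io::{self, BufRead};
use std::time::Instant;

fn main() {
    // Maps test name -> time we saw its "started" event.
    let mut started: HashMap<String, Instant> = HashMap::new();
    for line in io::stdin().lock().lines() {
        let line = line.expect("stdin read failed");
        // Skip non-JSON lines like "Running target/debug/deps/...".
        let v: serde_json::Value = match serde_json::from_str(&line) {
            Ok(v) => v,
            Err(_) => continue,
        };
        if v["type"] == "test" {
            let name = v["name"].as_str().unwrap_or("").to_owned();
            match v["event"].as_str() {
                Some("started") => {
                    started.insert(name, Instant::now());
                }
                Some("ok") | Some("failed") => {
                    if let Some(t0) = started.remove(&name) {
                        eprintln!("{} took {:?}", name, t0.elapsed());
                    }
                }
                _ => {}
            }
        }
    }
}
```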

johnterickson commented 5 years ago

@epage / @andoriyu Is one of you planning on proposing a better schema? I like what you're proposing.

rbtcollins commented 5 years ago

If I may add a semi-random thing: https://github.com/testing-cabal/subunit is a testing protocol that's been around for a while (one of a number) - the V2 event model is very similar in nature to your proposed JSON output schema. There is a Rust implementation of the binary serialisation at https://github.com/mtreinish/subunit-rust and I'm interested in / planning on doing a JSON variant of V2 at some point for use with web stacks.

All of which is to say two things, I guess: consider thinking about chunking stdout/err, allowing additional streams and tags, and possibly, if it's interesting, consider adopting the same model - I think it would be a superset of libtest's current needs.

Of course, doing a new schema and having folk write adapters is completely fine too; I just saw this ticket and thought I'd offer this kibitz :)

Alxandr commented 5 years ago

I'd also love to see improvements in this. Also, there currently seems to be a bug in the json output: https://github.com/johnterickson/cargo2junit/pull/9

janderholm commented 5 years ago

I would also like durations to be part of the output. Furthermore, when running multiple suites at the same time there's no way of knowing which tests belong to which suite. It would be nice to name the suite somehow. Possibly connect it to the source file relative to the workspace root?

Here's a slimmed down example:

     Running target/debug/deps/rustrunner-e965cfdb4a6bd87d
{ "type": "suite", "event": "started", "test_count": 2 }
{ "type": "test", "event": "started", "name": "tests::leak" }
{ "type": "test", "event": "started", "name": "tests::test_true" }
{ "type": "test", "name": "tests::leak", "event": "ok" }
{ "type": "test", "name": "tests::test_true", "event": "ok" }
{ "type": "suite", "event": "failed", "passed": 2, "failed": 0, "allowed_fail": 0, "ignored": 0, "measured": 0, "filtered_out": 0 }
     Running target/debug/deps/subcrate-2da6bff97f5f00cf
{ "type": "suite", "event": "started", "test_count": 2 }
{ "type": "test", "event": "started", "name": "tests::it_works" }
{ "type": "test", "event": "started", "name": "tests::test_true" }
{ "type": "test", "name": "tests::it_works", "event": "ok" }
{ "type": "test", "name": "tests::test_true", "event": "ok" }
{ "type": "suite", "event": "ok", "passed": 2, "failed": 0, "allowed_fail": 0, "ignored": 0, "measured": 0, "filtered_out": 0 }
   Doc-tests subcrate
{ "type": "suite", "event": "started", "test_count": 0 }
{ "type": "suite", "event": "ok", "passed": 0, "failed": 0, "allowed_fail": 0, "ignored": 0, "measured": 0, "filtered_out": 0 }

Since both the parent crate and the subcrate have test modules called "tests" and a test called "test_true", there's no way of knowing which one failed.

gilescope commented 4 years ago

Are people happy with cargo test -- -Zunstable-options --format json? Should we be looking to stabilise this argument? (I notice rustc calls the same argument --message-format.)

Gilnaa commented 4 years ago

Is anyone actually using it?

I made the PR 2.5 years ago, and at the time there was talk about custom testing harnesses that would make this feature obsolete.

I did not see any feedback (positive or negative), so I assumed it was just unused.

Anyway, one problem with the current format is that it doesn't output any timing information.

gilescope commented 4 years ago

I think the custom test harness work has focused on alternative test runners; I can't see much sign of movement on the custom reporting side. To be honest, standardised reporting across different custom runners is what I would expect all CIs to want, so to me a json test reporting format feels front and center.

From my side, I'm looking at translating these into TeamCity service messages so that the web UI can show in real time how many tests are done and how long each one took.
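For context, the target format is TeamCity's service messages, roughly like this (test names taken from the samples above; duration is in milliseconds, and attribute values need TeamCity's own escaping):

```
##teamcity[testStarted name='tests::test_true']
##teamcity[testFinished name='tests::test_true' duration='42']
##teamcity[testFailed name='tests::leak' message='assertion failed']
```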

I was thinking the duration could be calculated, if need be, from when the started / finished messages arrive, but having tried running two tests where one has a sleep in it, both seem to only report back in the json once they're both finished (even with --test-threads=2). This prevents retrospectively calculating the duration. I'm assuming the tests are running in parallel and it's the json reporting that's not streaming them back as they happen?

So timing info would be good, but even better would be getting the test-finished events output closer to when the test actually finishes. (Who doesn't like real-time feedback and status bars showing how long before all the tests are likely to be finished?)


UPDATE I think I had some very recursive tests (that happened to be calling cargo test themselves) - it works as expected if I ignore those tests.

Gilnaa commented 4 years ago

I actually get the expected behavior, but I'll see if I can add timestamps trivially

Gilnaa commented 4 years ago

Oh, there's already a flag that prints the elapsed time: --report-time
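With that flag the json events should carry the time as well; if I remember right it looks something like this (the exact field name and representation may have changed between nightlies):

```
{ "type": "test", "name": "tests::test_true", "event": "ok", "exec_time": 0.204 }
```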

gilescope commented 4 years ago

When I was asking for this kind of stuff in '17 it was because I had a dream that one day I might be a full-time rust dev and be in teams using teamcity for development. At the time it was purely speculative that getting this kind of integration ready would be helpful.

Fast forward a few years and sometimes dreams (with a bit of hard work) do come true and we have teams using rust and python at work. I'm trying to see if we can get the rust integration to be at parity with the python integration. It's a good problem to have!

gilescope commented 4 years ago

Ok, that's all working perfectly for me -- many thanks! Time for me to write that converter to TeamCity service messages...

epage commented 4 years ago

Are people happy with cargo test -- -Zunstable-options --format json? Should we be looking to stabilise this argument? (I notice rustc calls the same argument --message-format.)

No, the json output is not great to work with, in my experience. See my post earlier in this thread, from maintaining a Rust API over tests.

https://github.com/rust-lang/rust/issues/49359#issuecomment-467994590

You can find the crate for this over at https://crates.io/crates/escargot

andoriyu commented 4 years ago

Are people happy with cargo test -- -Zunstable-options --format json? Should we be looking to stabilise this argument? (I notice rustc calls the same argument --message-format.)

I'm using it in my cargo-suity, which I use to generate JUnit reports. Obviously, I'd rather eliminate my shim and just have cargo generate JUnit output, but json output is an "okay, I guess" alternative. That said, the current output is not easy to consume for my needs.

Gilnaa commented 4 years ago

I actually opened a PR some time ago implementing a basic JUnit output. I just did that against the wrong repo.

https://github.com/rust-lang/libtest/pull/16

gilescope commented 4 years ago

To me the one thing that feels a little odd is that the test suite doesn't have a name attribute. If we're running lots of integration tests and tests against various binaries (--all-targets), it would be good to group the results together under a test suite name.
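For illustration, a hypothetical name field on the suite events would be enough to group results (the field is made up here; the value is the test binary from the samples above):

```
{ "type": "suite", "event": "started", "name": "rustrunner-e965cfdb4a6bd87d", "test_count": 2 }
```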

A stretch goal is that consumers can do more with a comparison failure - i.e. if we had expected and actual values in the json, that would enable downstream tools to do cool diffs etc. I don't know how hard that is.

janderholm commented 4 years ago

I use it at work to translate the output from cargo to an XML-based report format (that of Check, a unit testing framework for C) used by our CI framework. So very similar to cargo2junit mentioned above.

We don't exactly depend on there being JSON output, but we do need a way of generating custom output; any machine-readable output would be just as good.

gilescope commented 4 years ago

@janderholm is there information you would need that isn't present in the json output?

janderholm commented 4 years ago

@gilescope No. I was about to say the duration for each test, but --report-time mentioned above does what I want.

gilescope commented 4 years ago

@epage I felt your pain also with the nested enums of type="" and event="" - I don't think serde_json handles that easily - I switched back to using serde_json::Value. But I don't think there's anything wrong with the way it's currently done - it seems perfectly valid json. We just need better parsing libs :-).
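For anyone who wants typed parsing anyway, here's a minimal sketch using nested internally tagged serde enums (the fields are a subset inferred from the samples in this thread, not a guaranteed schema):

```rust
use serde::Deserialize;

// Outer tag selects on "type", inner tag on "event".
#[derive(Debug, Deserialize)]
#[serde(tag = "type", rename_all = "lowercase")]
enum Record {
    Suite(SuiteEvent),
    Test(TestEvent),
}

#[derive(Debug, Deserialize)]
#[serde(tag = "event", rename_all = "lowercase")]
enum SuiteEvent {
    Started { test_count: usize },
    Ok { passed: usize, failed: usize, ignored: usize, filtered_out: usize },
    Failed { passed: usize, failed: usize, ignored: usize, filtered_out: usize },
}

#[derive(Debug, Deserialize)]
#[serde(tag = "event", rename_all = "lowercase")]
enum TestEvent {
    Started { name: String },
    Ok { name: String },
    Failed { name: String, stdout: Option<String> },
    Ignored { name: String },
}

fn main() {
    let line = r#"{ "type": "test", "name": "tests::leak", "event": "ok" }"#;
    let record: Record = serde_json::from_str(line).unwrap();
    println!("{:?}", record);
}
```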

So outstanding issues:

Anything I've missed? Will give bench a go and see what issues I find integrating that into TC.

Alxandr commented 4 years ago

Has the invalid JSON been fixed already?

epage commented 4 years ago

@epage I felt your pain also with the nested enums of type="" and event="" - I don't think serde_json handles that easily - I switched back to using serde_json::Value. But I don't think there's anything wrong with the way it's currently done - it seems perfectly valid json. We just need better parsing libs :-).

Let me take a step back. What are we standardizing? As far as I'm aware, libtest's json output is not formalized into a schema that gives people expectations to program against. We have to infer intent from behavior, and hope that we didn't infer the wrong intent and won't be broken as the implementation changes. Defining that schema will, I think, make it easier to have conversations about aspects of the format, like how enums are handled.

gilescope commented 4 years ago

I'm going to code against it as is, since if I'm not reading in json then I'm regexing the output of cargo test, which is definitely worse. I'm cool with it being something that is subject to change - I don't think we should standardize what's not heavily used. I'm just trying to use it as is and get a feedback loop going.

@Alxandr no - I just spotted this today:

 { "type": "test", "name": "src\lib.rs - (line 46)", "event": "ok" }

It's pretty uncontentious that this needs fixing, so I'll roll a fix so that strings are escaped (e.g. \ is escaped as \\) (unless anyone else wants to?).

edit: I'm also a little suspicious of the test name choice there -- ah it's a doc test.

Gilnaa commented 4 years ago

It's pretty uncontentious that this needs fixing, so I'll roll a fix so that strings are escaped (e.g. \ is escaped as \\) (unless anyone else wants to?).

edit: I'm also a little suspicious of the test name choice there -- ah it's a doc test.

Odd, it should be handled: https://github.com/rust-lang/rust/blob/master/library/test/src/formatters/json.rs#L192

Alxandr commented 4 years ago

Wouldn't it be better to just use a json library to write valid json, instead of writing an ad-hoc json serializer that's not well tested?
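E.g. a sketch with serde_json's json! macro, which gets the escaping right by construction:

```rust
fn main() {
    // serde_json escapes strings for us (e.g. the backslash in the
    // doc test path that bit us above).
    let event = serde_json::json!({
        "type": "test",
        "name": r"src\lib.rs - (line 46)",
        "event": "ok",
    });
    println!("{event}");
    // {"event":"ok","name":"src\\lib.rs - (line 46)","type":"test"}
}
```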

Gilnaa commented 4 years ago

Back then we tried to use serde but it caused some weird bootstrap issues.

It might not be the case anymore

Gilnaa commented 4 years ago

Nothing wrong with the escaping function. It's just we didn't call it for name, which is understandable. (We could do with a doc test test - does the test lib have tests or is that too recursive?)

Sure, it's tests all the way down.

https://github.com/rust-lang/rust/blob/master/library/test/src/tests.rs#L643

gilescope commented 4 years ago

I see ignored tests can have a comment against them:

#[ignore = "not yet implemented"]

Would there be any objection in us exposing that in the json for an ignored test?

Currently it looks like this:

{ 
   "type": "test", 
   "name": "tests::this_is_an_example_ignored_test", 
   "event": "ignored" 
}

Could we amend it to something like this:

{ 
   "type": "test", 
   "name": "tests::this_is_an_example_ignored_test", 
   "event": "ignored", 
   "reason": "not yet implemented" 
}

(remembering of course to escape it :-) )

gilescope commented 4 years ago

Btw, when we're running cargo bench we report that we're ignoring all the other tests:

{ "type": "test", "name": "tests::test", "event": "ignored" }

I'm not sure that's that helpful... finding an ignored benchmark is probably like finding a needle in a haystack...

mexus commented 3 years ago

Is there anything an interested-in-the-feature person can help with to drive the stabilization process further?

gilescope commented 3 years ago

More people using it on nightly is needed, I think. @mexus what's your integration use case that you're keen on?

mexus commented 3 years ago

what's your integration use case that you're keen on?

Well, I'd love to use it with GitLab CI to have an instant overview of failed/ignored tests without having to "parse" raw output with the naked eye.

mexus commented 3 years ago

More people using it on nightly is needed, I think

Well... it might not be that easy to achieve.

andoriyu commented 3 years ago

More people using it on nightly is needed, I think. @mexus what's your integration use case that you're keen on?

I don't think it's possible. It's such a niche feature. I only use it because I want to generate JUnit reports, simply because those reporters are easily integrated into CI pipelines. If cargo were able to produce a JUnit report directly, I wouldn't even use the json output.

Gilnaa commented 3 years ago

@andoriyu

I tried adding a basic JUnit formatter 2 years ago; I just opened the PR against the wrong repo:

https://github.com/rust-lang/libtest/pull/16

If someone wants to take the time to update it, it should be possible to retarget it against the real repo.

andoriyu commented 3 years ago

@andoriyu

I tried adding a basic JUnit formatter 2 years ago; I just opened the PR against the wrong repo:

https://github.com/rust-lang/libtest/pull/16

If someone wants to take the time to update it, it should be possible to retarget it against the real repo.

Should this go through RFC first?

Gilnaa commented 3 years ago

The original JSON output RFC PR was closed with "just send an impl" iirc, but this was some years ago, so I can't tell.

spencergilbert commented 3 years ago

We'd love to help or champion getting JSON output stabilized (wanting to get JUnit-compatible output). Is there anything we can do to help push this forward?

gilescope commented 3 years ago

Use it on projects and report back with issues? Ideally there would be a way to use this on stable, but it's behind a -Z flag… at the moment I feel like we're in a catch-22: it's not going forward because not enough people are using it, and not enough people use it because it's not available on stable.

jonhoo commented 3 years ago

Anecdotally, see this discussion and its derivatives on testing unstable features on stable.

gilescope commented 3 years ago

@Gilnaa the reason not to output JUnit is that we'd be forcing other testing frameworks to also output JUnit. We should do it, but as an adapter of libtest's json output that lives in cargo.

While we might not be able to stabilise the json we have today, I don't see any downside to stabilising the cargo JUnit output, because that's already a mature (de facto) standard. (Then in 202_, once we've agreed on the json format, we can stabilise that too.)

This seems like a good way of getting around the stability bind that we're in at the moment?

gilescope commented 3 years ago

I have a cargo PR that executes test crates in parallel (https://github.com/rust-lang/cargo/pull/10052). Alex points out that there's no jobserver control mechanism between cargo and the test crates to constrain parallelism. If we want to be able to execute test crates in parallel, then in an ideal world we need the ability to pass 'tokens' via stdin/stdout to the test frameworks, in the same manner as cargo communicates with rustc (https://github.com/alexcrichton/jobserver-rs). I mention this here as it would obviously have to be part of the JSON output.
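For reference, the consuming side of that token protocol with the jobserver crate looks roughly like this (a sketch; it assumes the test binary inherits a jobserver from its environment, which is exactly the part that doesn't exist for libtest yet):

```rust
use std::thread;

fn main() {
    // Inherit whatever jobserver cargo/make configured for this process.
    // `from_env` is unsafe because it trusts file descriptors named in
    // the environment.
    let client = unsafe { jobserver::Client::from_env() }
        .expect("no jobserver in the environment");

    // One token per concurrently running test; released on drop.
    let token = client.acquire().expect("failed to acquire a token");
    let worker = thread::spawn(move || {
        let _token = token; // hold the token while the test runs
        // ... run one test here ...
    });
    worker.join().unwrap();
}
```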

(Ironically, I thought I'd see how far that PR could get without depending on the json test output, as that is currently unstable, but it seems that to do a 'proper job' we need json output and opt-in jobserver support from the test runner.)

Yuri6037 commented 2 years ago

I have a bunch of pipelines on GitLab for my open source organization (BlockProject 3D) which use the json formatting for cargo bench, cargo test and cargo clippy.

Some of these pipelines run every day.

I've built a tool to integrate with cargo, based on rust-gitlab-ci (https://gitlab.com/TobiP64/rust-gitlab-ci). My tool is named cargo-gitlab and automatically runs cargo test, cargo bench, cargo audit and cargo clippy. I've even built an HTML generator out of the cargo clippy output. Link to my tool: https://gitlab.com/bp3d/cargo-gitlab.

This json format is very important in my use case because it's much easier and much faster to see test results and other reports directly in the MR UI and the Pipeline UI. It's especially useful when checking reports from a mobile phone, where it's harder to read the entire build log.

pnkfelix commented 2 years ago

Visiting for T-compiler backlog bonanza.

We're not sure this belongs in our bucket. Kicking it over to T-libs-api instead.

@rustbot label: -T-compiler +T-libs-api

gilescope commented 2 years ago

Belongs to the test team, but alas that doesn't exist yet.

— Reply to this email directly, view it on GitHub https://github.com/rust-lang/rust/issues/49359#issuecomment-1139696732, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAGEJCCRBZYNSWDTR5THJSLVMDO7XANCNFSM4EXISSUA . You are receiving this because you were mentioned.Message ID: @.***>