Gilnaa opened 6 years ago
@nrc You mentioned "The long-term solution is going to be deeply intertwined with pluggable test runners" in #46450. Does that mean that JSON won't be stabilized until there are pluggable test runners?
I'm trying to figure out if it makes sense for me to continue looking at https://github.com/rust-lang/rust/issues/51924 right now, or if I should hold off for the time being. For example, I'm looking at adding test durations to the JSON (https://github.com/rust-lang/rust/compare/master...johnterickson:testjson) and converting the output to JUnit (https://github.com/johnterickson/cargo2junit/).
Yeah I think we should stabilize something like this independently of custom test frameworks. It makes sense that the included test framework would have a programmatically readable output. Libtest's conceptual model is pretty simple and most likely won't undergo any drastic changes, so I'm all for stabilizing something. I'd prefer to have a JSON Schema that describes the structure before stabilization. We'd also need to audit it to ensure it's resilient to any minor changes or additions over time.
@alexcrichton thoughts?
Seems like a plausible addition to me! So long as it's designed carefully, I think it'd be good to go.
From the experience of creating naive serde structs / enums for libtest's output, here are my thoughts from translating it exactly how it is written in libtest (rather than inventing my own schema that happens to be compatible):

- The `type` field conflates the suite/test being finished and why it finished. This is annoying if you want to get information regardless of the completion status.
- The `bench` event is too narrowly focused on the output of the current implementation and not on what might be wanted.

My PR where you can see the data structures I created: https://github.com/crate-ci/escargot/pull/24/files

Ideas to consider:

- Swap the `event` and `type` fields. Unsure of the value of this.
- Split the `type` field into `type` and `status` and define more fields as being available for all types. For example, it could be useful to programmatically report `stdout` for a successful test and leave it to the rendering engine to decide whether to ignore it or not. This doesn't mean all test implementations need to report all of the fields, but define it as possible.
- Work with `bench` implementations, like criterion, to review the `bench` schema.
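As a rough sketch of what those naive structs / enums can look like when they mirror the current type/event layout (the variant and field names below are inferred from the output examples later in this thread, not the escargot PR's actual types):

```rust
use serde::Deserialize;

// Sketch only: mirrors libtest's current json lines with nested internally
// tagged enums - `type` picks the outer variant, `event` the inner one, which
// is exactly the nesting complained about later in this thread.
// Requires serde (with the "derive" feature) and serde_json. The bench variant
// and counters such as allowed_fail/measured/filtered_out are omitted here;
// serde ignores unknown fields by default.
#[derive(Deserialize, Debug)]
#[serde(tag = "type", rename_all = "snake_case")]
enum Record {
    Suite(SuiteEvent),
    Test(TestEvent),
}

#[derive(Deserialize, Debug)]
#[serde(tag = "event", rename_all = "snake_case")]
enum SuiteEvent {
    Started { test_count: usize },
    Ok { passed: usize, failed: usize },
    Failed { passed: usize, failed: usize },
}

#[derive(Deserialize, Debug)]
#[serde(tag = "event", rename_all = "snake_case")]
enum TestEvent {
    Started { name: String },
    Ok { name: String },
    Failed { name: String, stdout: Option<String> },
    Ignored { name: String },
}

fn main() {
    let line = r#"{ "type": "test", "event": "started", "name": "tests::leak" }"#;
    let record: Record = serde_json::from_str(line).unwrap();
    println!("{:?}", record);
}
```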
I think it's worth adding things like package and type of test. Type of test: unit test, doc test, integration test.
The package is something like: module for unit test, resource path for doc test (crate::foo::Bar), and filename for integration test.
Since libtest tracks test times (at minimum, reporting if it's taking too long), it would be nice if that got included in the programmatic output so we can report it to the user, e.g. in JUnit. Alternatively, we'd have to timestamp the events as we read them.
@epage / @andoriyu Is one of you planning on proposing a better schema? I like what you're proposing.
If I may add a semi-random thing; https://github.com/testing-cabal/subunit is a testing protocol that's been around for a while (one of a number) - the V2 event model is very similar in nature to your proposed JSON output schema. There is a rust implementation of the binary serialisation, https://github.com/mtreinish/subunit-rust, and I'm interested in / planning on doing a JSON variant of V2 at some point for use with web stacks.
All of which is to say two things, I guess - consider thinking about chunking stdout/err, allowing additional streams and tags, and possibly, if it's interesting, adopting the same model - I think it would be a superset of libtest's current needs.
Of course, doing a new schema and having folk write adapters is completely fine too; I just saw this ticket and thought I'd offer this kibbitz :)
I'd also love to see improvements in this. Also, there currently seems to be a bug in the json output: https://github.com/johnterickson/cargo2junit/pull/9
I would also like durations as part of the output. Furthermore, when running multiple suites at the same time there's no way of knowing which tests belong to which suite. It would be nice to name the suite somehow. Possibly connect it to the source file relative to the workspace root?
Here's a slimmed down example:
Running target/debug/deps/rustrunner-e965cfdb4a6bd87d
{ "type": "suite", "event": "started", "test_count": 2 }
{ "type": "test", "event": "started", "name": "tests::leak" }
{ "type": "test", "event": "started", "name": "tests::test_true" }
{ "type": "test", "name": "tests::leak", "event": "ok" }
{ "type": "test", "name": "tests::test_true", "event": "ok" }
{ "type": "suite", "event": "failed", "passed": 2, "failed": 0, "allowed_fail": 0, "ignored": 0, "measured": 0, "filtered_out": 0 }
Running target/debug/deps/subcrate-2da6bff97f5f00cf
{ "type": "suite", "event": "started", "test_count": 2 }
{ "type": "test", "event": "started", "name": "tests::it_works" }
{ "type": "test", "event": "started", "name": "tests::test_true" }
{ "type": "test", "name": "tests::it_works", "event": "ok" }
{ "type": "test", "name": "tests::test_true", "event": "ok" }
{ "type": "suite", "event": "ok", "passed": 2, "failed": 0, "allowed_fail": 0, "ignored": 0, "measured": 0, "filtered_out": 0 }
Doc-tests subcrate
{ "type": "suite", "event": "started", "test_count": 0 }
{ "type": "suite", "event": "ok", "passed": 0, "failed": 0, "allowed_fail": 0, "ignored": 0, "measured": 0, "filtered_out": 0 }
Since both the parent crate and the subcrate have test modules called "tests" and a test called "test_true", there's no way of knowing which one failed.
Are people happy with `cargo test -- -Zunstable-options --format json`? Should we be looking to stabilise this argument? (I notice `rustc` calls the same argument `--message-format`.)
Is anyone actually using it?
Is anyone actually using it?
I made the PR 2.5 years ago, and at the time there was talk of custom testing harnesses that would make this feature obsolete.
I did not see any feedback (neither positive nor negative), so I assumed it was just unused.
Anyway, one problem with the current format is that it doesn't output any timing information.
I think the custom test harness work has focused on alternative test runners; I can't see much sign of activity on the custom reporting side. To be honest, standardised reporting across different custom runners is what I would expect all CIs to want, so to me a json test reporting format seems front and center.
From my side I'm looking at translating these into teamcity service messages so that the web UI can be realtime driven as to how many tests are done and how long each one took.
I was thinking the duration can be calculated if need be based on when the started / stopped message happens, but having tried running two tests where one has a sleep in it, both seem to only report back in the json once they're both finished (even with --test-threads=2). This prevents retrospectively calculating the duration. I'm assuming the tests are running in parallel and it's the json reporting that's not streaming them back as they happen?
So timing info would be good, but even better would be getting the test finished events to output closer to when the test actually finishes. (Who doesn't like real time feedback and status bars showing how long before all the tests are likely to be finished?)
UPDATE: I think I had some very recursive tests (that happened to be calling cargo test themselves) - it works as expected if I ignore those tests.
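As a rough illustration of that approach (a sketch, not part of libtest or cargo: it assumes the json events are piped in line by line and simply timestamps them as they arrive; field names are taken from the examples earlier in the thread):

```rust
use std::collections::HashMap;
use std::io::{self, BufRead};
use std::time::Instant;

// Sketch: pipe `cargo test -- -Zunstable-options --format json` into this and
// it derives per-test durations from when the "started" and terminal events
// arrive. Only useful if events are actually streamed as tests finish.
// Requires the serde_json crate.
fn main() -> io::Result<()> {
    let mut started: HashMap<String, Instant> = HashMap::new();
    for line in io::stdin().lock().lines() {
        let line = line?;
        // Skip non-JSON lines such as "Running target/debug/deps/...".
        let ev: serde_json::Value = match serde_json::from_str(&line) {
            Ok(v) => v,
            Err(_) => continue,
        };
        if ev["type"] != "test" {
            continue;
        }
        let name = ev["name"].as_str().unwrap_or("").to_string();
        match ev["event"].as_str() {
            Some("started") => {
                started.insert(name, Instant::now());
            }
            Some("ok") | Some("failed") | Some("ignored") => {
                if let Some(t0) = started.remove(&name) {
                    println!("{} finished in {:?}", name, t0.elapsed());
                }
            }
            _ => {}
        }
    }
    Ok(())
}
```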
I actually get the expected behavior, but I'll see if I can add timestamps trivially
Oh, there's already a flag that prints the elapsed time: --report-time
When I was asking for this kind of stuff in '17 it was because I had a dream that one day I might be a full time rust dev and be in teams using TeamCity for development. At the time it was purely speculative that getting this kind of integration ready would be helpful.
Fast forward a few years and sometimes dreams (with a bit of hard work) do come true and we have teams using rust and python at work. I'm trying to see if we can get the rust integration to be at parity with the python integration. It's a good problem to have!
Ok, that's all working perfectly for me -- many thanks! Time for me to write that converter to TeamCity service messages...
Are people happy with `cargo test -- -Zunstable-options --format json`? Should we be looking to stabilise this argument? (I notice `rustc` calls the same argument `--message-format`.)
No, the json output is not great to work with, in my experience. See my post earlier in this thread, from my experience maintaining a Rust API over tests.
https://github.com/rust-lang/rust/issues/49359#issuecomment-467994590
You can find the crate for this over at https://crates.io/crates/escargot
Are people happy with `cargo test -- -Zunstable-options --format json`? Should we be looking to stabilise this argument? (I notice `rustc` calls the same argument `--message-format`.)
I'm using it in my cargo-suity, which I use to generate JUnit reports. Obviously, I'd rather eliminate my shim and just have cargo generate JUnit output, but the json output is an "okay, I guess" alternative. That said, the current output is not easy to consume for my needs.
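To give a sense of the kind of mapping such a shim performs, here is a minimal sketch; the `TestResult` type and its fields are hypothetical stand-ins for whatever gets parsed out of the json events, and XML escaping is omitted for brevity:

```rust
// Minimal sketch of emitting JUnit-style XML from already-parsed results.
// `TestResult` is a hypothetical stand-in for data extracted from the json
// events; a real converter (e.g. cargo2junit) would also escape XML.
struct TestResult {
    name: String,
    seconds: f64,
    failure: Option<String>, // failure message, if the test failed
}

fn to_junit(suite: &str, results: &[TestResult]) -> String {
    let failures = results.iter().filter(|r| r.failure.is_some()).count();
    let mut xml = format!(
        "<testsuite name=\"{}\" tests=\"{}\" failures=\"{}\">\n",
        suite,
        results.len(),
        failures
    );
    for r in results {
        xml.push_str(&format!(
            "  <testcase name=\"{}\" time=\"{:.3}\">",
            r.name, r.seconds
        ));
        if let Some(msg) = &r.failure {
            xml.push_str(&format!("<failure message=\"{}\"/>", msg));
        }
        xml.push_str("</testcase>\n");
    }
    xml.push_str("</testsuite>\n");
    xml
}

fn main() {
    let results = vec![
        TestResult { name: "tests::test_true".into(), seconds: 0.002, failure: None },
        TestResult { name: "tests::leak".into(), seconds: 1.250, failure: Some("assertion failed".into()) },
    ];
    println!("{}", to_junit("rustrunner", &results));
}
```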
I actually opened a PR some time ago implementing a basic JUnit output. I just did that against the wrong repo.
To me the one thing that feels a little odd is that the test suite doesn't have a name attribute. If we're running lots of integration tests and tests against various binaries (--all-targets), it would be good to group the results together under a test suite name.
A stretch goal is that consumers can do more if it's a comparison failure - i.e. if we had `expected` and `actual` outputs in the json, that would enable downstream to do cool diffs etc. Idk how hard that is?
I use it at work to translate the output from cargo to an XML based report format (Check | Unit testing framework for C) used by our CI framework. So very similar to cargo2junit mentioned above.
We don't exactly depend on there being JSON output. But we do need a way of generating custom outputs, or generating machine readable output, that would be just as good.
@janderholm is there any information you would need that isn't present in the json output?
@gilescope No. I was about to say the duration for each test, but --report-time mentioned above does what I want.
@epage I felt your pain also with the nested enums of type="" and event="" - I don't think serde_json handles that easily - I switched back to using serde_json::Value. But I don't think there's anything wrong with the way it's currently done - it seems perfectly valid json. We just need better parsing libs :-)
So outstanding issues:
Anything I've missed? Will give bench a go and see what issues I find integrating that into TC.
Has the invalid JSON been fixed already?
@epage I felt your pain also with the nested enums of type="" and event="" - I don't think serde_json handles that easily - I switched back to using serde_json::Value. But I don't think there's anything wrong with the way it's currently done - it seems perfectly valid json. We just need better parsing libs :-) .
Let me take a step back. What are we standardizing? As far as I'm aware, libtest's json output is not formalized into a schema for people to have expectations to program against. We have to infer intent from behavior and hope that we didn't infer the wrong intent and get broken as the implementation changes. Defining that schema, I think, will make it easier to have conversations about aspects of the format, like how enums are handled.
I'm going to code against it as is, because if I'm not reading json then I'm regexing the output of cargo test, which is definitely worse. I'm cool with it being something that is subject to change - I don't think we should standardize what's not heavily used. I'm just trying to use it as is and get a feedback loop going.
@Alxandr no - I just spotted this today:
{ "type": "test", "name": "src\lib.rs - (line 46)", "event": "ok" }
It's pretty uncontentious that this needs fixing so I'll roll a fix for that so that strings are escaped (e.g. `\` is escaped as `\\`)
(unless anyone else wants to?)
edit: I'm also a little suspicious of the test name choice there -- ah it's a doc test.
Odd, it should be handled https://github.com/rust-lang/rust/blob/master/library/test/src/formatters/json.rs#L192
Wouldn't it be better to just use a json library to write valid json? Instead of writing an ad-hoc json serializer that's not well tested?
Back then we tried to use serde but it caused some weird bootstrap issues.
It might not be the case anymore
Sure, it's tests all the way down.
https://github.com/rust-lang/rust/blob/master/library/test/src/tests.rs#L643
On Tue, Oct 13, 2020, 10:08 Squirrel notifications@github.com wrote:
Nothing wrong with the escaping function. It's just we didn't call it for name, which is understandable. (We could do with a doc test test - does the test lib have tests or is that too recursive?)
I see ignored tests can have a comment against them:
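(A minimal example of such a comment, assuming the optional reason string on the `#[ignore]` attribute is what's meant here:)

```rust
// Hypothetical test matching the json example below; the string after
// `#[ignore]` is the comment/reason in question.
#[test]
#[ignore = "not yet implemented"]
fn this_is_an_example_ignored_test() {
    unimplemented!();
}
```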
Would there be any objection to us exposing that in the json for an ignored test?
Currently it looks like this:
{
"type": "test",
"name": "tests::this_is_an_example_ignored_test",
"event": "ignored"
}
Could we amend it to something like this:
{
"type": "test",
"name": "tests::this_is_an_example_ignored_test",
"event": "ignored",
"reason": "not yet implemented"
}
(remembering of course to escape it :-) )
Btw, when we're running `cargo bench` we report that we're ignoring all the other tests:
{ "type": "test", "name": "tests::test", "event": "ignored" }
I'm not sure that's that helpful... finding an ignored benchmark is probably like finding a needle in a haystack...
Is there anything an interested-in-the-feature person can help with to drive the stabilization process further?
More people using it in nightly I think is needed. @mexus what's your integration use case that you're keen on?
what's your integration use case that you're keen on?
Well, I'd love to use it with GitLab CI to have an instant overview of failed/ignored tests without having to "parse" raw output with the naked eye.
More people using it in nightly I think is needed
Well... it might not be that easy to achieve.
More people using it in nightly I think is needed. @mexus what's your integration use case that you're keen on?
I don't think it's possible. It's such a niche feature. I only use it because I want to generate JUnit reports, simply because those reporters are easily integrated into CI pipelines. If cargo were able to produce a JUnit report instead, I wouldn't even use the json output.
@andoriyu
I tried adding a basic JUnit formatter 2 years ago, I just opened a PR against the wrong repo:
https://github.com/rust-lang/libtest/pull/16
If someone wants to take the time and update it, it should be possible to retarget it against the real repo
Should this go through RFC first?
The original JSON output RFC PR was closed with "just send an impl" iirc, but this was some years ago, so I can't tell.
We'd love to help or champion getting JSON output stabilized (wanting to get JUnit compatible output). Is there anything we can do to help push this forward?
Use it on projects and report back with issues? Ideally there should be a way to use this with stable, but it's behind a -Z flag… at the moment I feel like we are in a catch-22: it's not going forward because not enough people are using it, and not enough people use it because it's not available on stable?
Anecdotally, see this discussion and its derivatives on testing unstable features on stable.
@Gilnaa the reason not to output JUnit is that we'd be forcing other testing frameworks to also output JUnit. We should do it, but as an adapter of libtest's json output that lives in cargo.
While we might not be able to stabilise the json we have today, I don't see any downside to stabilising the cargo JUnit output, because that's already a mature (de facto) standard. (Then in 202_ once we've agreed on the json format we can stabilise that too.)
This seems like a good way of getting around the stability bind that we're in at the moment?
I have a cargo PR that executes test crates in parallel ( https://github.com/rust-lang/cargo/pull/10052 ). Alex points out that there's no jobserver control mechanism between cargo and the test crates to constrain parallelism. If we want to be able to execute test crates in parallel, then in an ideal world, we need to include the ability to pass 'tokens' via stdin/stdout to the test frameworks in the same manner as cargo communicates with rustc ( https://github.com/alexcrichton/jobserver-rs ). I mention this here as it would obviously have to be part of the JSON output.
(Ironically I thought I'd see how far that PR could get without depending on the json test output, as that is currently unstable, but it seems that to do a 'proper job' we need json output and opt-in jobserver support by the test runner.)
I have a bunch of pipelines on GitLab for my open source organization (BlockProject 3D) which use the json formatting for `cargo bench`, `cargo test` and `cargo clippy`. Some of these pipelines run every day.
I've built a tool to integrate with cargo, based on rust-gitlab-ci: https://gitlab.com/TobiP64/rust-gitlab-ci. My tool is named cargo-gitlab and automatically runs `cargo test`, `cargo bench`, `cargo audit` and `cargo clippy`. I've even built an HTML generator out of `cargo clippy`. Link to my tool: https://gitlab.com/bp3d/cargo-gitlab.
This json format is very important in my use case because it's much easier and much faster to see test results and other reports directly on the MR UI and the Pipeline UI. It's especially useful when checking reports from a mobile phone, on which it may be harder to check the entire build log.
Visiting for T-compiler backlog bonanza.
We're not sure this belongs in our bucket. Kicking it over to T-libs-api instead.
@rustbot label: -T-compiler +T-libs-api
Belongs to the test team but alas that doesn’t exist yet.
Added in https://github.com/rust-lang/rust/pull/46450. Available in nightly behind a `-Z` flag.