mknyszek commented 1 year ago

Currently the go test -json provides no way to disambiguate output produced by (*testing.T).Error{,f} and (*testing.T).Fatal{,f} from other test logging. This fact makes it difficult for CI systems to identify what part of the output is most pertinent for constructing summaries of test failures.

Consider a test that failed, producing a lot of log messages as a result. Reporting the full log of the test can often be distracting and overly specific, resulting in many CI systems (such as LUCI) not constructing test failure summaries for Go tests at all.

Heuristics such as taking the last line logged before a test fails work often, but not often enough to fully solve the problem. Enough tests call (*testing.T).Error{,f} and fail later (after more log output) that this heuristic is unreliable. This works even less well when considering tests that crash or panic: that last line is often going to be some random line in the stack trace.

I propose adding a boolean field to "output" actions in the JSON produced by cmd/test2json called "Error". This field would be "true" for any "output" actions that were produced via (*testing.T).Error{,f} and (*testing.T).Fatal{,f}. This field would be also be "true" for the first few lines of a crash (but notably not the stack trace itself).

For example:

{"Time":"2023-09-19T15:19:12.447816654Z","Action":"output","Package":"golang.org/x/exp/trace","Test":"TestHeap","Output":"--- FAIL: TestHeap (0.00s)\n","Error":false}
{"Time":"2023-09-19T15:19:12.449904543Z","Action":"output","Package":"golang.org/x/exp/trace","Test":"TestHeap","Output":"panic: ahhh [recovered]\n","Error":true}
{"Time":"2023-09-19T15:19:12.449923813Z","Action":"output","Package":"golang.org/x/exp/trace","Test":"TestHeap","Output":"\tpanic: ahhh\n","Error":true}
{"Time":"2023-09-19T15:19:12.449928103Z","Action":"output","Package":"golang.org/x/exp/trace","Test":"TestHeap","Output":"\n","Error":false}
{"Time":"2023-09-19T15:19:12.449932593Z","Action":"output","Package":"golang.org/x/exp/trace","Test":"TestHeap","Output":"goroutine 6 [running]:\n","Error":false}
{"Time":"2023-09-19T15:19:12.449936283Z","Action":"output","Package":"golang.org/x/exp/trace","Test":"TestHeap","Output":"testing.tRunner.func1.2({0x51ad20, 0x577910})\n","Error":false}
{"Time":"2023-09-19T15:19:12.449940423Z","Action":"output","Package":"golang.org/x/exp/trace","Test":"TestHeap","Output":"\t/usr/local/google/home/mknyszek/work/go-1/src/testing/testing.go:1545 +0x24a\n","Error":false}
{"Time":"2023-09-19T15:19:12.449953903Z","Action":"output","Package":"golang.org/x/exp/trace","Test":"TestHeap","Output":"testing.tRunner.func1()\n","Error":false}

Which lines of a crash get marked would be entirely up to cmd/test2json. Because it lives in the standard library, it can reasonably be coupled (with tests) to the crash output that the Go runtime produces.

Testing binaries' -test.v=test2json output would need to be adjusted such thatcmd/test2json could interpret errors correctly. For this, I propose the use of a new marker character (I am not picky as to which) at the beginning of a frame (delineated currently by ^V (0x16)).

bcmills commented 1 year ago

cmd/test2json consumes the stdout from a test process as its input. Presumably the testing package would need to add some kind of new framing to communicate whether the text that follows indicates an error?

mknyszek commented 1 year ago

Yup! I don't know the history of how ^V (0x16) was chosen as the current framing marker, but we would need another one probably. I'm not really sure how to pick another one, though. I assume it should probably be invisible again? I'm open to suggestions.

patrickmeiring commented 1 year ago

The proposal looks good to me from a LUCI perspective.

Implementation-wise, with the frame markers - is there any concern about race conditions when tests are run in parallel? Or maybe this is the best that is available with the current implementation.

Beyond this proposal - if I can offer a CI perspective on what sort of structured data may be useful for a language like golang to generate -:

Structured reasons for build failures. I note there is an existing proposal in this regard: https://github.com/golang/go/issues/23037
Stack traces for test failures (avoiding the need to scrape logs, e.g. if the user provides an error directly to t.Error() or if the test panics). We do not use these in LUCI Analysis yet, but there have been requests along this line. Having the stack traces seperated out can be useful for presentation reasons, to cluster them for test bug triage, and/or to match stack traces in seen test failures with stack traces seen out in the wild from crash telemetry.

One of the best test frameworks I have come across so far in terms of structured result output is Tast [1], the integration-testing framework for ChromeOS. It captures the time, reason, file and stack trace for all errors that cause a test to fail [2].

[1] https://chromium.googlesource.com/chromiumos/platform/tast/ [2] https://source.chromium.org/chromium/infra/infra/+/main:go/src/infra/tools/result_adapter/tast.go;l=63

patrickmeiring commented 1 year ago

A few further thoughts:

Where long output is produced, is it guaranteed that all output to one .Fatal() or .Error() call will be output on one test2json output line? I was under the impression that test2json pulls from stdout which is buffered. If one error may be split over two or more test2json output lines, could this affect the ability to recover the original errors? (observing one test may produce multiple errors)
I note the proposal does not specify if the error is fatal or not (i.e. comes from .Fatal(...) or .Error(...)). For failure reasons in LUCI Analysis, we typically capture the first fatal error, or the first error otherwise: https://source.chromium.org/chromium/infra/infra/+/main:go/src/go.chromium.org/luci/resultdb/proto/v1/failure_reason.proto;l=30?q=resultdb%20failure%20reason
- I think this is because fatal errors are generally seen as more interesting (panics, etc.)
- But you could always go with the first error if you prefer. The main thing would be to be consistent with the uploaded failure reason.

rsc commented 1 year ago

This proposal has been added to the active column of the proposals project and will now be reviewed at the weekly proposal review meetings. — rsc for the proposal review group

rsc commented 1 year ago

This seems reasonable, and following the existing framing (finding another little-used control character) seems fine too.

rsc commented 1 year ago

Have all remaining concerns about this proposal been addressed?

rsc commented 12 months ago

It looks like ^N and ^O are available and make a reasonable amount of sense, as these things go. ^N means start error text ("shift in") and ^O means back to regular text ("shift out").

rsc commented 12 months ago

@patrickmeiring For LUCI, I think first error makes the most sense and is the most consistent. Fatal is defined as Error+FailNow, and some programs or library code do call FailNow themselves. Code should not distinguish Error-generated messages from Fatal-calling-Error-generated messages.

rsc commented 12 months ago

@mknyszek I'm not sure about the part where the first few lines of a crash are tagged "Error":true. It seems clearer conceptually and certainly easier implementation-wise to only tag messages from t.Fatal/t.Error with "Error":true. To tag runtime crashes you'd have to make the runtime emit the ^N and ^O or else have some easily-fooled pattern matching in test2json.

61839 is (now) about reporting a non-zero exit status in -json output. Perhaps that would be enough for crashes? You'd see the exit status and click on the full output and see the crash text.

patrickmeiring commented 12 months ago

@rsc

Re: Selecting first error - that sounds fine to me.
Re: test crashes - extracting the reasons for these would be useful as this is one of the main ways tests fail. Is there no standard panic recovery code path used in go test that we can hook into?
Re: Have all remaining concerns about this proposal been addressed? I am seeking clarification on the behaviour if the program emits multiple errors, e.g.
```
t.Error(strings.Repeat("a", 1_000_000))
t.Error(strings.Repeat("b", 1_000_000))
```
Will it be possible to identify there were two errors, and to distinguish error a (with possibly long output) from error b? See comment about buffering above. @mknyszek

rsc commented 12 months ago

Will it be possible to identify there were two errors, and to distinguish error a (with possibly long output) from error b?

Yes, there will be two ^N...^O sequences in the output.

patrickmeiring commented 12 months ago

Yes, there will be two ^N...^O sequences in the output.

How will these sequences be distinguished in cmd/test2json output?

It looks to me like cmd/test2json is stripping out those control characters and slicing the sequences into an arbitrary number of JSON messages.

E.g. the following example output from comment 1:

{"Time":"2023-09-19T15:19:12.449904543Z","Action":"output","Package":"golang.org/x/exp/trace","Test":"TestHeap","Output":"panic: ahhh [recovered]\n","Error":true}
{"Time":"2023-09-19T15:19:12.449923813Z","Action":"output","Package":"golang.org/x/exp/trace","Test":"TestHeap","Output":"\tpanic: ahhh\n","Error":true}

Looks like it is all the same error based on the "Output" field.

But this is structurally the same as the output I would expect from two errors (aaaaaaa...) and (bbbbb.....):

{"Time":"2023-09-19T15:19:12.449904543Z","Action":"output","Package":"golang.org/x/exp/trace","Test":"TestHeap","Output":"aaaaaaaaaaa","Error":true}
{"Time":"2023-09-19T15:19:12.449923813Z","Action":"output","Package":"golang.org/x/exp/trace","Test":"TestHeap","Output":"bbbbbbbbbb","Error":true}

aclements commented 11 months ago

To tag runtime crashes you'd have to make the runtime emit the ^N and ^O or else have some easily-fooled pattern matching in test2json.

Re: test crashes - extracting the reasons for these would be useful as this is one of the main ways tests fail. Is there no standard panic recovery code path used in go test that we can hook into?

I think we could easily tag panics that are recovered and re-panicked by the testing package itself. That doesn't capture runtime fatal panics or panics from other goroutines, but I imagine that would get the vast majority of test panics and should be easy to implement in the testing package.

rsc commented 11 months ago

@patrickmeiring I see what you mean. We need two different ways to flag an output line: "start of error" and "continuation of error". Perhaps we should use

"OutputType": "error" "OutputType": "error-continue"

and that leaves the door open for other kinds of OutputType as well, like we could have "OutputType": "frame" for all the framing text ("=== RUN" and so on) so test systems can decide to gray it out.

?

rsc commented 11 months ago

Have all remaining concerns about this proposal been addressed?

In go test -json output, the "Action":"output" lines would be annotated with an optional new field "OutputType" giving the kind of output included in the entry. The initial kinds would be:

“error”, meaning text produced by t.Error/t.Fatal
“error-continue”, meaning a continuation line from the previous “error”
“frame”, meaning framing text like “=== RUN TestCase” that is already expressed more directly in other events

The specific implementation of “error” and “error-continue” would be that in test2json mode the test would print error text wrapped in ^N...^O, and test2json would recognize and remove these, the same as it does today for ^V around framing.

patrickmeiring commented 11 months ago

Have all remaining concerns about this proposal been addressed?

The only open question I can see is whether to capture panics on other (non-test) goroutines and report them as errors. My take here is:

If a panic on another goroutine crashes the go test process instead of just failing a test, I think it is fine to not frame them with ^N and ^O (although it may still be useful to do so).
On the other hand, if a panic on an other goroutine is attributed to a test and ends up failing that test, then it would be better to frame the error so it can be reported as the test failure reason.

Either way, the resolution of this issue need not block the proposal as even a partial ability to extract failure reasons will be a significant improvement over the status quo.

rsc commented 11 months ago

Based on the discussion above, this proposal seems like a likely accept. — rsc for the proposal review group

In go test -json output, the "Action":"output" lines would be annotated with an optional new field "OutputType" giving the kind of output included in the entry. The initial kinds would be:

“error”, meaning text produced by t.Error/t.Fatal
“error-continue”, meaning a continuation line from the previous “error”
“frame”, meaning framing text like “=== RUN TestCase” that is already expressed more directly in other events

The specific implementation of “error” and “error-continue” would be that in test2json mode the test would print error text wrapped in ^N...^O, and test2json would recognize and remove these, the same as it does today for ^V around framing.

rsc commented 11 months ago

No change in consensus, so accepted. 🎉 This issue now tracks the work of implementing the proposal. — rsc for the proposal review group

In go test -json output, the "Action":"output" lines would be annotated with an optional new field "OutputType" giving the kind of output included in the entry. The initial kinds would be:

“error”, meaning text produced by t.Error/t.Fatal
“error-continue”, meaning a continuation line from the previous “error”
“frame”, meaning framing text like “=== RUN TestCase” that is already expressed more directly in other events

The specific implementation of “error” and “error-continue” would be that in test2json mode the test would print error text wrapped in ^N...^O, and test2json would recognize and remove these, the same as it does today for ^V around framing.

firelizzard18 commented 2 months ago

@rsc Should OutputType be empty/omitted for regular output events? As in if I t.Log("foo") will the event for that have an OutputType?

gopherbot commented 2 months ago

Change https://go.dev/cl/601535 mentions this issue: testing: annotate output text type

firelizzard18 commented 1 week ago

Design update

^N means start error text ("shift in") and ^O means back to regular text ("shift out").

^N is shift out and ^O is shift in, respectively, so I'm using ^O for start of error and ^N for end of error.
I added ^[ (escape) to escape control/framing characters (^V, ^O, ^N, and ^[ itself). @adonovan asked about escaping ^O within error messages and I discovered that the current use of ^V does not include escaping which leads to faulty -json output in the test2json tests, so I added escaping to all output when framing is enabled.

golang / go

testing: annotate output text type #62728

61839 is (now) about reporting a non-zero exit status in -json output. Perhaps that would be enough for crashes? You'd see the exit status and click on the full output and see the crash text.

Design update