chloestefantsova opened this issue 2 years ago
@munificent I think you wrote the current compiler result parsing code.
I agree, the name of the expectation is really confusing in this case.
The set of test expectations and their names grew organically over time. Back in the day, when Dart was a scripting language with an optional type system, the test runner was very focused on the idea that "compile-time error" == "test failed". So there is an expectation named `CompileTimeError` that means "this test failed". But now we have thousands of tests whose job is to validate the behavior of the compiler's error reporting. When one of those tests fails, it's confusing to use `CompileTimeError`, because that implicitly means "a compile-time error occurred *when none were expected*".
At some point, someone added `MissingCompileTimeError` to handle that. It means "this test failed because it should have reported a compile-time error but none were reported". That's the status used for the old negative tests and still today for multitests. That made sense at the time because literally the only thing those tests were able to validate was "did any compile error get reported?"
When I added support for static error tests, I gave the test runner the ability to much more precisely detect how the compile-time errors reported by a test differed from what was expected. But I didn't add a new expectation for that because I didn't want to change the outcome of all of the existing tests, and I thought it would be even more confusing to have yet another outcome.
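The comparison described here, between the errors a test expects and the errors the compiler actually reports, can be sketched as a simple set difference. This is a hypothetical illustration (the function name and error strings are made up, and the real test runner is written in Dart, not Python), not the actual test_runner implementation:

```python
# Sketch: diff the expected static errors against the reported ones.
# "Missing" errors were expected but never reported; "unexpected"
# errors were reported but never expected. Either kind of divergence
# means the test failed its compile-time expectations.

def diff_errors(expected, actual):
    """Return (missing, unexpected) given two sets of error descriptions."""
    missing = expected - actual       # expected but not reported
    unexpected = actual - expected    # reported but not expected
    return missing, unexpected

# The situation from this issue: every expected error is reported,
# plus some extra ones.
missing, unexpected = diff_errors(
    {"E1: undefined name", "E2: type mismatch"},
    {"E1: undefined name", "E2: type mismatch", "E3: dead code"},
)
# missing is empty; unexpected contains only "E3: dead code".
```

Under this framing, the case reported in this issue is one where `missing` is empty but `unexpected` is not, which is why a status named `MissingCompileTimeError` reads oddly for it.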
If it were up to me and I had the time (which I definitely don't), I would dramatically simplify the statuses down to:
- `Pass`: The test met all of its compile-time and runtime expectations.
- `CompileTimeFailure`: The test failed to meet a compile-time expectation. This could mean it shouldn't have had a compile-time error and did, it should have had a compile-time error and did not, or the actual compile-time errors were not the same as the expected ones.
- `RuntimeFailure`: The test failed to meet a runtime expectation. This means it was expected to run to completion but threw an unexpected exception.

(And then a few other statuses for other stuff like truncated output, timeouts, etc.)
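The collapse being proposed here can be sketched as follows. This is only an illustration of the proposal (the status names come from the list above; the `classify` function and its parameters are invented for this sketch, not part of any existing API):

```python
# Sketch of the simplified status scheme proposed above: every
# compile-time divergence (missing, unexpected, or mismatched errors)
# maps to the single CompileTimeFailure status.
from enum import Enum

class Status(Enum):
    PASS = "Pass"
    COMPILE_TIME_FAILURE = "CompileTimeFailure"
    RUNTIME_FAILURE = "RuntimeFailure"

def classify(missing_errors, unexpected_errors, runtime_ok):
    """Collapse all compile-time divergences into one status."""
    if missing_errors or unexpected_errors:
        return Status.COMPILE_TIME_FAILURE
    return Status.PASS if runtime_ok else Status.RUNTIME_FAILURE

# Missing errors and unexpected errors land on the same status; the
# detailed divergence is left to the test output, per the proposal.
assert classify(["E1"], [], True) is Status.COMPILE_TIME_FAILURE
assert classify([], ["E3"], True) is Status.COMPILE_TIME_FAILURE
```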
I think it's needlessly confusing to try to encode the test's expectation into the status itself. If we have separate statuses for "expected no compile-time error and got some" versus "expected a compile-time error and got none/the wrong ones", then when a test fails, users have to do some weird mental triple negation to actually understand what happened. It's better to just say "the test didn't do what it was supposed to do at compile time" and then leave it up to the output to describe the divergence.
(I will note that I think the test output is pretty clear here. It talks in terms of "failure", which is clearly relative to the test's expectation, instead of "error", which you need to know the context to interpret. And for each compile error mismatch, it explains what's counter to the test's expectations.)
@munificent wrote:

> At some point, someone added `MissingCompileTimeError` to handle that. It means "this test failed because it should have reported a compile time error but none were reported".

But that wasn't the actual behavior, as far as I understood: the test run did report every expected compile-time error. The only unexpected behavior was that several additional compile-time errors were reported. So that's a failure to do what is expected, but it doesn't seem to match `MissingCompileTimeError`.
> users have to do some weird mental triple negation to actually understand what happened

As a user, I would prefer a more fine-grained status than just the three proposed general ones. With the latter, I would have to go and investigate each failing test case one by one to prioritize the work on fixing them.
I don't think we can put more information into the test status, so I agree with Bob that reducing to just a compile-time error status might be best. I think users need to look at the test log for tests that report compile-time errors to see what is actually failing. To give more detailed errors as test results, the different lines with errors would have to be reported as individual subtests, with a test-result line for each of them.
One simple change might be to base whether a failing static test is reported as `CompileTimeError` or `MissingCompileTimeError` on whether it has unexpected static errors: if it has unexpected ones, it would never be a `MissingCompileTimeError`.
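The rule proposed here is small enough to sketch directly. This is a hypothetical illustration of the proposal only (the function is invented for this sketch and assumes it is called only for tests that have already failed their compile-time expectations):

```python
# Sketch of the proposed rule: a failing static-error test is
# MissingCompileTimeError only when nothing unexpected was reported;
# any unexpected error makes it a plain CompileTimeError instead.

def failing_status(missing, unexpected):
    """Pick a status for a test that failed its compile-time expectations."""
    if unexpected:
        # Unexpected errors were reported, so "missing" is misleading.
        return "CompileTimeError"
    # Only expected errors are absent.
    return "MissingCompileTimeError"

# The grammar_A01_t02 case: nothing missing, three unexpected errors.
assert failing_status(missing=[], unexpected=["E1", "E2", "E3"]) == "CompileTimeError"
```

Under this rule, the test case that prompted this issue would stop being reported as `MissingCompileTimeError`.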
> But that wasn't the actual behavior, as far as I understood: The test run did report every expected compile-time error. The only unexpected behavior was that several additional compile-time errors were reported.

Yes, that's the behavior now. But at the point in time that `MissingCompileTimeError` was added to the test runner, it only meant "expected some kind of compile-time error and none were reported". When I added support for static error tests, I generalized that status to mean "the compile-time errors reported were not what was expected". The result is that the name is now confusing in cases like this.
I encountered an interesting case of test status reporting. In the test case `co19/LanguageFeatures/Enhanced-Enum/grammar_A01_t02`, the CFE reports all of the expected errors and three unexpected ones, but the testing infrastructure reports the test as `MissingCompileTimeError`. This status seems confusing, and I think it would be better to report something like `UnexpectedCompileTimeError` in cases where all of the discrepancies are due only to unexpected errors.

/cc @eernstg