ErichDonGubler closed this 3 months ago
I don't think the CLI should expose this implementation detail; we're no longer "updating" the `expected` property except to delete things, so it doesn't feel appropriate to expose this functionality there.
It seems like the implementation of the fix is simpler than documenting it clearly. For what it's worth, here's my understanding of what's going on:
When you bring in a new version of the CTS and do a try push, two things will generally happen:

1. Some tests will have been deleted, added, or renamed.
2. The tests that weren't deleted have some outcome like `PASS`, `FAIL`, `TIMEOUT`, `SKIP`, and so on. Some of these may differ from what's recorded in our expectations metadata.
When a test is renamed, we would like to move its expectations in the metadata files from its old name to its new name. But if you simply run `update-expected`, then the resulting diff is the composition of both of those effects, and discerning renames requires close reading.
The `moz-webgpu-cts` tool could make this easier if it had a mode that compared the existing expectation metadata files with a set of test run reports and processed only additions and removals (test/subtest names that appeared or disappeared) without updating metadata about any test outcomes. This would produce a diff that simply deleted and inserted test/subtest entries, which would be much easier for a human to review and to distinguish genuine deletions and additions from renames. Renames would appear as a deletion with a corresponding insertion, but pairing them up requires human judgment.
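At its core, the structure-only comparison described above is a set difference between the test names present in the metadata and the test names present in the reports. A minimal sketch in Rust (function and variable names are invented here for illustration; the actual `moz-webgpu-cts` types differ):

```rust
use std::collections::BTreeSet;

/// Hypothetical sketch of a structure-only comparison: given the test names
/// currently present in expectation metadata and those observed in a set of
/// test run reports, report which names disappeared and which appeared,
/// ignoring test outcomes entirely.
fn structural_diff(
    metadata_tests: &BTreeSet<String>,
    report_tests: &BTreeSet<String>,
) -> (Vec<String>, Vec<String>) {
    // Names in metadata but not in reports were deleted (or renamed away).
    let removed = metadata_tests.difference(report_tests).cloned().collect();
    // Names in reports but not in metadata were added (or renamed to).
    let added = report_tests.difference(metadata_tests).cloned().collect();
    (removed, added)
}

fn main() {
    let metadata = BTreeSet::from(["a/old.html".to_string(), "a/kept.html".to_string()]);
    let reports = BTreeSet::from(["a/new.html".to_string(), "a/kept.html".to_string()]);
    let (removed, added) = structural_diff(&metadata, &reports);
    assert_eq!(removed, ["a/old.html"]);
    assert_eq!(added, ["a/new.html"]);
    println!("removed: {removed:?}\nadded: {added:?}");
}
```

A human reviewer would then scan the `removed`/`added` pairs for plausible renames; the tool itself doesn't try to guess.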
When the human did decide that a test was renamed, they could re-add the test/subtest's expectation metadata under its new name. Then, running the usual `update-expected` command could adjust outcome metadata with the renames taken into account, making it easier to identify regressions (and producing a much clearer diff as well).
While working on a new iteration of re-vendoring the WebGPU CTS, I noticed that a lot of test paths had changed, without obvious correlations between moved tests, added tests, and removed tests. It was (and is) very tedious to understand upstream changes in test structure, and `update-expected` adds significant noise when it also updates properties. I expect most people dealing with this would find such a problem difficult to reason about without some automated help. The only reason I was able to do it myself was that I happened to notice it was easier to tell what had changed if I left off `expected` property updates after running `update-expected --preset reset-all …`. Implementing that insight fully made things tractable again.

I'd like to make something that facilitates changing only the structure of metadata in response to reports whose test structure may differ from what's in the current metadata. In terms of the current implementation, this would be another preset in the current report-processing logic, where we simply keep old `expected` metadata (where tests and subtests are still present) or insert new test paths into metadata that assume `PASS`, ignoring actual test outcomes from processed reports (and thereby omitting the `expected` property altogether).
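The merge step of such a preset might look roughly like the following sketch (the types and names here are hypothetical stand-ins, not the tool's real data model): rebuild the metadata from the test paths seen in reports, carrying over any existing `expected` metadata and never consulting the outcomes the reports actually recorded.

```rust
use std::collections::BTreeMap;

/// `None` stands for "no `expected` property written", which the metadata
/// format treats as the default expectation (i.e., `PASS`). These names are
/// hypothetical, not moz-webgpu-cts's actual types.
type Expected = Option<String>;

/// Sketch of a structure-only preset: keep `expected` metadata for test paths
/// that still appear in the reports, insert newly seen paths with no
/// `expected` property, and drop paths that disappeared. Report outcomes are
/// ignored entirely.
fn restructure(
    old_metadata: &BTreeMap<String, Expected>,
    report_tests: &[String],
) -> BTreeMap<String, Expected> {
    report_tests
        .iter()
        .map(|path| (path.clone(), old_metadata.get(path).cloned().flatten()))
        .collect()
}

fn main() {
    let old = BTreeMap::from([
        ("a/kept.html".to_string(), Some("FAIL".to_string())),
        ("a/deleted.html".to_string(), Some("TIMEOUT".to_string())),
    ]);
    let reports = ["a/kept.html".to_string(), "a/added.html".to_string()];
    let merged = restructure(&old, &reports);
    assert_eq!(merged["a/kept.html"], Some("FAIL".to_string())); // carried over
    assert_eq!(merged["a/added.html"], None); // new path: no `expected` property
    assert!(!merged.contains_key("a/deleted.html")); // dropped
    println!("{merged:?}");
}
```

The resulting diff would then contain only structural insertions and deletions, with the existing `expected` values left untouched.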