KyleJu opened this issue 4 years ago
The portable expectations in TestExpectations should meet the following criteria:
- There is a bug number;
- Tags are [], [ Linux ], [ Release ], or any combination of these;
- The test name starts with external/wpt/ (excluding expectations for runtime flags);
- Results are not [ Skip ] (maybe ideally [ Failure ] only? Exclude flaky tests);
- The same expectation exists and fails on wpt.fyi (mismatches can arise from orphaned expectations, outdated WPT versions, or outdated Chrome versions).
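The criteria above could be checked mechanically. Below is a minimal sketch of such a filter; the regular expression and the assumed line shape (`<bug> [ <tags> ] <test> [ <results> ]`) are simplifications for illustration, not the actual Chromium TestExpectations parser:

```python
import re

# Assumed (simplified) TestExpectations line shape:
#   crbug.com/<n> [ <tags> ] <test path> [ <results> ]
LINE_RE = re.compile(
    r'^(?P<bug>crbug\.com/\d+)\s+'
    r'(?:\[\s*(?P<tags>[^\]]*?)\s*\]\s+)?'
    r'(?P<test>\S+)\s+'
    r'\[\s*(?P<results>[^\]]+?)\s*\]\s*$'
)

# Tags must be empty, [ Linux ], [ Release ], or a combination.
ALLOWED_TAGS = {'Linux', 'Release'}

def is_portable(line: str) -> bool:
    """Apply the portability criteria from this issue to one line."""
    m = LINE_RE.match(line.strip())
    if not m:
        return False  # no bug number, or malformed line
    tags = set(m.group('tags').split()) if m.group('tags') else set()
    if not tags <= ALLOWED_TAGS:
        return False  # platform/config tags other than Linux/Release
    if not m.group('test').startswith('external/wpt/'):
        return False  # only imported WPT tests are portable
    results = set(m.group('results').split())
    # Not [ Skip ]; [ Failure ] (and per later comments [ Timeout ]) only.
    return results <= {'Failure', 'Timeout'}
```

The last criterion (the expectation also fails on wpt.fyi) would need a separate query against wpt.fyi results and is omitted here.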
According to failing-tests, TestExpectations records tests that cannot be rebaselined. I suspect Chrome-specific failures can go unnoticed during the import process.
> Results is not [ Skip ] (maybe ideally [ Failure ] only? Exclude flaky tests);

[ Timeout ] also seems like it would be useful? Flakes may also be useful, but are obviously harder to track.
TestExpectations has been ported to WPT Metadata in https://github.com/web-platform-tests/wpt-metadata/pull/278. The selection criteria are the ones mentioned in the comment above, but only [ Failure ] and [ Timeout ] are included. Flaky tests are not ported at this point.
Per our discussions offline, the TestExpectations file only records reference test failures, flaky tests, and non-deterministic tests. New (or Chrome-specific) failures could most likely go unnoticed during the WPT import process, as they are rebaselined automatically. As a result, WPT Metadata isn't a subset of TestExpectations, and TestExpectations is not the single source of truth for Chrome test failures.
Going forward, we should figure out a way to continuously port candidates from TestExpectations to WPT Metadata, e.g. via a bot.
A list of portable NeverFixTests candidates has been identified using similar criteria. However, none of them are Chrome-specific failures. I will circle back to this issue when we expand our data ingestion to include all Chrome failures.
@stephenmcgruer FYI, since this issue was raised by the Layout team. Happy to prioritize it if necessary.
(Why is this issue in wpt.fyi not wpt-metadata?)
I've been playing again with importing TestExpectations via https://github.com/web-platform-tests/wpt-metadata/pull/473, after questions from folks who reasonably don't want to retriage tests they have already marked in TestExpectations. From that PR, I have uploaded data for all of css/css-* as PRs:
I then went through the newly linked bugs for problematic cases. Here are the general problems I found:
Also, as a final note, if we do make a regular thing of importing from TestExpectations, we will likely also need to answer:
Thoughts on how we might resolve these are welcome; the answer may be that we need to clean up TestExpectations first.
> (Why is this issue in wpt.fyi not wpt-metadata?)
Good point. I've moved the issue to wpt-metadata. (Old issue links should continue to work.)
Per Dave Tapuska's suggestions, WPT Metadata should be a subset of TestExpectations, but with a focus on Chrome-specific test failures. The triage information in TestExpectations should be ported continuously to WPT Metadata.
For now, we can write an ad hoc script to port the existing triage information to WPT Metadata, and design an automated approach (e.g. a bot) going forward.
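As a rough illustration of what such a port could emit: wpt-metadata stores triage links in per-directory META.yml files (`links` entries with a `product`, a bug `url`, and per-test `results`). The helper name and the Failure/Timeout status mapping below are assumptions for this sketch, not the actual import script:

```python
import os

# Assumed mapping from TestExpectations results to wpt-metadata statuses.
STATUS_MAP = {'Failure': 'FAIL', 'Timeout': 'TIMEOUT'}

def to_meta_yml(bug: str, test_path: str, result: str) -> tuple[str, str]:
    """Return (META.yml path, YAML snippet) for one triaged expectation.

    test_path is relative to external/wpt/, e.g. 'css/css-grid/foo.html'.
    """
    directory, test = os.path.split(test_path)
    yaml = (
        'links:\n'
        '  - product: chrome\n'
        f'    url: https://{bug}\n'
        '    results:\n'
        f'      - test: {test}\n'
        f'        status: {STATUS_MAP[result]}\n'
    )
    return os.path.join(directory, 'META.yml'), yaml

path, snippet = to_meta_yml('crbug.com/1234', 'css/css-grid/foo.html', 'Failure')
```

A real script would also need to merge into existing META.yml files rather than overwrite them, which is where most of the complexity would live.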