web-platform-tests / wpt-metadata

Out-of-tree metadata for wpt

Porting TestExpectations to WPT Metadata #481

Open KyleJu opened 4 years ago

KyleJu commented 4 years ago

Per Dave Tapuska's suggestions, WPT Metadata should be a subset of TestExpectations, but with a focus on Chrome-specific test failures. The triage information in TestExpectations should be ported continuously to WPT Metadata.

For now, we can write an ad hoc script to port the existing triage information to WPT Metadata, and then think of an automated way to do this going forward (e.g. through a bot).

KyleJu commented 4 years ago

The portable expectations in TestExpectations should meet the following criteria:

KyleJu commented 4 years ago

According to failing-tests, TestExpectations records tests that cannot be rebaselined. I suspect that Chrome-specific failures can go unnoticed during the import process.

stephenmcgruer commented 4 years ago

Results is not [ Skip ] (maybe ideally [ Failure ] only? Exclude flaky tests);

[ Timeout ] also seems like it would be useful?

Flakes may also be useful, but are obviously harder to track.

KyleJu commented 4 years ago

TestExpectations has been ported to WPT Metadata in https://github.com/web-platform-tests/wpt-metadata/pull/278. The selection criteria are described in the comment above, but only [ Failure ] and [ Timeout ] expectations are included. Flaky tests are not ported at this point.
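
For illustration, the selection criteria could be sketched roughly as below. This is a minimal sketch, not the script used in the PR: it assumes each line has the shape `crbug.com/N [ Modifier ] external/wpt/path [ Result ... ]`, that `#` starts a comment, and that more than one expected result means the test is flaky. The names `parse_line`, `is_portable` and `portable_entries` are hypothetical.

```python
import re

# Results considered portable, per the criteria discussed above.
PORTABLE = {"Failure", "Timeout"}
BRACKETS = re.compile(r"\[([^\]]*)\]")


def parse_line(line):
    """Return (bug, test, results) for one expectation line, or None."""
    # Assumption: '#' starts a comment and never appears in a test path.
    line = line.split("#", 1)[0].strip()
    if not line:
        return None
    groups = BRACKETS.findall(line)
    if not groups:
        return None
    # Assumption: the last bracket group holds the expected results.
    results = set(groups[-1].split())
    tokens = BRACKETS.sub(" ", line).split()
    bugs = [t for t in tokens if t.startswith("crbug.com/")]
    tests = [t for t in tokens if not t.startswith("crbug.com/")]
    if len(tests) != 1:
        return None
    return (bugs[0] if bugs else None, tests[0], results)


def is_portable(results):
    # Exactly one expected result, and it is Failure or Timeout.
    # Several results (e.g. "Failure Pass") are treated as flaky and skipped.
    return len(results) == 1 and results <= PORTABLE


def portable_entries(lines, prefix="external/wpt/"):
    """Collect (wpt_path, bug, result) for lines that meet the criteria."""
    out = []
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        bug, test, results = parsed
        if bug and test.startswith(prefix) and is_portable(results):
            out.append((test[len(prefix):], bug, next(iter(results))))
    return out
```

Entries without a linked bug are dropped here because they carry no triage information to port; that choice is also an assumption of the sketch.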

Per our offline discussions, the TestExpectations file only records reference test failures, flaky tests and non-deterministic tests. New (or Chrome-specific) failures are likely to go unnoticed during the WPT import process because they are rebaselined automatically. As a result, WPT Metadata isn't a subset of TestExpectations, and TestExpectations is not the single source of truth for Chrome test failures.

Going forward, we should figure out a way to port candidates from TestExpectations to WPT Metadata continuously, e.g. via a bot.

KyleJu commented 4 years ago

A list of portable NeverFixTests candidates has been identified using similar criteria. However, none of them are Chrome-specific failures. I will circle back to this issue when we expand our data ingestion to include all Chrome failures.

@stephenmcgruer FYI, since this issue was raised by the Layout team. Happy to prioritize it if necessary.

stephenmcgruer commented 4 years ago

(Why is this issue in wpt.fyi not wpt-metadata?)

I've been playing again with importing TestExpectations via https://github.com/web-platform-tests/wpt-metadata/pull/473, after questions from folks who reasonably don't want to retriage tests they have already marked in TestExpectations. From that PR, I have uploaded data for all of css/css-* as PRs:

I then went through the newly linked bugs for problematic cases. Here are the general problems I found:

  1. Tests linked to generic 'import this WPT directory' bugs, which run counter to what wpt.fyi triage is trying to achieve (useful triage data).
  2. Tests linked to bugs that are run_web_tests.py-specific (e.g. lack of fuzzy-reftest support)
  3. Tests generally linked to bugs that are marked Fixed (or duplicates of Fixed bugs), despite the fact that test failures are still linked to them.
    • In a few cases this turns out to be deliberate, though arguably the test should be in NeverFixTests then.

Also, as a final note, if we do make a regular thing of importing from TestExpectations, we will likely also need to answer:

  1. How to deal with 'retriages', aka when a test failure changes crbug in TestExpectations (or is removed entirely!). We don't look at the Chromium commit diffs here, just the file, so we don't know when this happened.
    • Perhaps we could mark entries as 'came from TestExpectations' and then remove them if they no longer exist there? Hacky!
  2. How to deal with tests that pass on wpt.fyi but are in TestExpectations. So far we've just been ignoring that; maybe it's fine to do so.
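
The 'came from TestExpectations' idea in item 1 could be sketched as a comparison between what was previously imported and the current file snapshot. This is a rough sketch under stated assumptions, not an existing tool: `reconcile` is a hypothetical helper, and both inputs are assumed to be plain mappings from test path to bug URL.

```python
def reconcile(imported, current):
    """Compare previously imported entries with the current snapshot.

    Both arguments map a test path to the bug it is linked to.
    `imported` holds only entries that were marked as coming from
    TestExpectations on an earlier run (a hypothetical marker).
    """
    # Entries that disappeared from the file entirely.
    removed = sorted(t for t in imported if t not in current)
    # Entries still present but now linked to a different bug.
    retriaged = sorted(
        t for t in imported if t in current and imported[t] != current[t]
    )
    # Entries that are new since the last import.
    added = sorted(t for t in current if t not in imported)
    return {"removed": removed, "retriaged": retriaged, "added": added}
```

Because only the two snapshots are compared, this needs no Chromium commit history, but it also cannot tell when or why a change happened.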

Thoughts on how we might resolve these are welcome; the answer may be that we need to clean up TestExpectations first.

Hexcles commented 4 years ago

(Why is this issue in wpt.fyi not wpt-metadata?)

Good point. I've moved the issue to wpt-metadata. (Old issue links should continue to work.)