web-platform-tests / results-collection


Defer results validation #531

Closed jugglinmike closed 6 years ago

jugglinmike commented 6 years ago

We have not published results for the Edge browser in two weeks. This is because the new parallel results collection system enforces "completeness" in a way that is subtly different from the previous system, and the new criterion is stricter. The technical details are outlined in this pull request's commit message.

In some cases, Edge result sets fail validation due to intermittent network failures. These can be corrected by manually re-triggering the build for the offending "chunk" via the Buildbot web interface.

However, there are currently two chunks for which Edge consistently produces incomplete results (chunk numbers 19 and 52). I have previously attempted to address this issue in a generic way via a patch to the WPT CLI, but that patch was not accepted.

This patch side-steps the issue by deferring results validation until all chunks are complete. The goal is to publish marginally incomplete results, as we were doing at the beginning of March. With this in place, we can evaluate whether we want to invest time correcting the underlying problems (work which may not be relevant when/if we transition away from Sauce Labs).

Commit message:

Defer results validation until upload

This project enforces a 98% "completeness" threshold for result sets. (Any dataset containing results for fewer than 98% of the tests expected to be run in a given revision of WPT is not published online.)

Previously, when test results were collected as part of a single process, this condition was enforced once for the full data set.

When the system was distributed across many machines, the condition was enforced for each subset of tests. This change made the criterion for uploading stricter because failure in a single subset could block uploading of the entire result set, even if the overall data set still contained at least 98% of the expected results.

Re-implement the criteria to once again be enforced a single time, in terms of the final result set.
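
For illustration only, the difference between the two enforcement strategies can be sketched as follows. This is not the project's actual code: the function names, the `(actual, expected)` chunk representation, and the sample numbers are all hypothetical, and only the 98% threshold comes from the description above.

```python
# Completeness threshold from the commit message, as a whole-number percentage.
THRESHOLD_PERCENT = 98


def is_complete(actual, expected):
    """Return True when results exist for at least 98% of the expected tests.

    Integer arithmetic avoids floating-point rounding at the boundary.
    """
    return actual * 100 >= THRESHOLD_PERCENT * expected


def validate_per_chunk(chunks):
    """Stricter behavior: every chunk must meet the threshold on its own."""
    return all(is_complete(actual, expected) for actual, expected in chunks)


def validate_deferred(chunks):
    """Deferred behavior: apply the threshold once, to the combined totals."""
    total_actual = sum(actual for actual, _ in chunks)
    total_expected = sum(expected for _, expected in chunks)
    return is_complete(total_actual, total_expected)


# Hypothetical data: 100 chunks of 300 expected tests each.
# Two chunks fall well short; the remaining 98 are complete.
chunks = [(300, 300)] * 98 + [(200, 300), (250, 300)]

print(validate_per_chunk(chunks))  # False: two chunks are below 98%
print(validate_deferred(chunks))   # True: 29850 of 30000 is 99.5%
```

With per-chunk validation, the two short chunks block the whole result set; with deferred validation, the same data passes because the combined total is still above the threshold.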

rwaldron commented 6 years ago

WFM

jugglinmike commented 6 years ago

Thanks, Rick! We'll deploy this in time for the next round of Edge trials.

In the meantime, there are two result sets that satisfy the 98% criterion*, but that have been blocked because of those two problematic "chunks". I've manually triggered the upload script for those result sets. They're now available online:

Note that the reported date reflects the time of upload, not the time of the WPT commit.