Provide an API for listing a series of known good runs

foolip commented 1 year ago

Analysis done outside of wpt.fyi using runs from /api/runs usually ends up needing to filter the returned runs to get sensible results. For example, https://github.com/web-platform-tests/results-analysis/blob/main/bad-ranges.js filters out known bad ranges of runs.

In code that I've written in the past, I've also needed to handle time periods where the browser version was flip-flopping between N and N+1, either because two configurations were running at the same time, or because there were regressions leading to pinning of the browser version, and later unpinning.

Here are the guarantees that I think wpt.fyi could usefully provide for a series of runs for a single browser:

Known bad runs are excluded
The start time of the run is monotonically increasing
The browser version is monotonically increasing (given flip-flopping, ideally it would pick the breakpoint that gives the shortest gap between runs, or perhaps the largest total number of runs)
The OS version is monotonically increasing (same considerations as above)
The browser channel only "increases" (in case we switch the default Chrome config from dev to canary)
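
As a rough illustration of the first three guarantees, here is a minimal Python sketch that drops known bad runs, orders the rest by start time, and then keeps the largest subset whose browser version never decreases (the "largest total number of runs" option). The function names, the shape of bad_ranges, and the version parsing are assumptions made for this example; the run dicts are only assumed to carry time_start and browser_version fields as UTC ISO 8601 strings in one consistent format, and nothing here is an existing wpt.fyi API.

```python
import re


def version_key(version):
    """Turn '91.0.4472.19' into (91, 0, 4472, 19) so versions compare numerically.

    Simplification: only the digit groups are used, so suffixes such as
    'a1' or 'dev' are not ordered in any special way.
    """
    return tuple(int(part) for part in re.findall(r"\d+", version))


def monotonic_series(runs, bad_ranges=()):
    """Return the largest subset of runs with a non-decreasing browser version.

    runs: dicts with 'time_start' and 'browser_version' (assumed field names).
    bad_ranges: hypothetical (start, end) timestamp pairs; a run whose
    time_start falls in [start, end) is treated as known bad.
    Timestamps are compared as strings, which assumes one consistent
    UTC ISO 8601 format.
    """
    def is_bad(run):
        return any(start <= run["time_start"] < end for start, end in bad_ranges)

    good = sorted((r for r in runs if not is_bad(r)), key=lambda r: r["time_start"])
    if not good:
        return []

    keys = [version_key(r["browser_version"]) for r in good]
    n = len(good)
    best = [1] * n   # best[i]: length of the longest valid series ending at run i
    prev = [-1] * n  # prev[i]: index of the previous run in that series
    for i in range(n):
        for j in range(i):
            if keys[j] <= keys[i] and best[j] + 1 > best[i]:
                best[i] = best[j] + 1
                prev[i] = j

    # Walk back from the end of the longest series.
    i = max(range(n), key=lambda k: best[k])
    series = []
    while i != -1:
        series.append(good[i])
        i = prev[i]
    series.reverse()
    return series
```

The quadratic loop is fine for a few thousand runs. The same pass could be repeated with the OS version or channel as the key, at the cost of dropping more runs.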

This does not necessarily need to be a single API or a new parameter for /api/runs; it might be several.

It would furthermore be useful to be able to get multiple series of runs together, with the aligned parameter being respected.

This is always the first problem I have to solve when doing any kind of time series analysis, so it would be great to solve it in one place 😄

foolip commented 1 year ago

Just came across https://github.com/web-platform-tests/results-analysis/pull/186, which shows another difficulty with aligned series of runs that could possibly be handled in a wpt.fyi API instead. I'll add one more guarantee that I think would simplify the problem of hash-aligned but date-misaligned runs:

The WPT commit date is monotonically increasing (we don't have this information in wpt.fyi now)

This way, whether aligned is used or multiple series are fetched and aligned by the client, there are a few simple options that are less heuristic-y than https://github.com/web-platform-tests/results-analysis/pull/186:

Just use the commit date
Use the earliest start time of the runs
Use the latest start time of the runs
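
To make the three options concrete, here is a small Python sketch that picks a single date for one group of aligned runs. The function name, the strategy strings, and the commit_date argument are hypothetical; since wpt.fyi does not have the commit date today, the caller is assumed to supply it. time_start values are assumed to be UTC ISO 8601 strings in one consistent format, so plain string comparison orders them.

```python
def aligned_date(runs, strategy="commit", commit_date=None):
    """Pick one date to represent a group of runs aligned on the same WPT commit.

    strategy: 'commit', 'earliest' or 'latest' (names chosen for this sketch).
    commit_date: the WPT commit date, required only for the 'commit' strategy.
    """
    if strategy == "commit":
        if commit_date is None:
            raise ValueError("commit_date is required for the 'commit' strategy")
        return commit_date

    starts = [run["time_start"] for run in runs]
    if strategy == "earliest":
        return min(starts)
    if strategy == "latest":
        return max(starts)
    raise ValueError(f"unknown strategy: {strategy!r}")
```

Whichever option is used, applying it consistently to every aligned group keeps the resulting series comparable.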

cc @gsnedders

gsnedders commented 10 months ago

> The OS version is monotonically increasing (same considerations as above)

This is not necessarily something we want to strictly guarantee; it should definitely broadly be true, but we have previously reverted OS upgrades.

Plus one could imagine running roughly the same configuration in different CI systems (as we previously had with the Bocoup-maintained Buildbot and Azure Pipelines for macOS), at different frequencies, which may alter selection.

foolip commented 10 months ago

> > The OS version is monotonically increasing (same considerations as above)
>
> This is not necessarily something we want to strictly guarantee; it should definitely broadly be true, but we have previously reverted OS upgrades.
>
> Plus one could imagine running roughly the same configuration in different CI systems (as we previously had with the Bocoup-maintained Buildbot and Azure Pipelines for macOS), at different frequencies, which may alter selection.

These are the cases I had in mind that an API should handle. The version bump should happen once without flip-flopping, by filtering out some runs. The logic for which runs to filter out is an interesting question without an obvious best answer. Perhaps multiple strategies are valid.
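
As one possible strategy, here is a Python sketch of the "shortest gap" breakpoint for a series that flip-flops between exactly two versions: it tries every split point, keeps only the old version before it and only the new version after it, and picks the split with the smallest time gap across the bump, breaking ties by the number of runs kept. The function names and the time_start and browser_version fields are assumptions for illustration, and the input is assumed to be sorted by start time and to contain only those two versions.

```python
from datetime import datetime


def parse_time(value):
    # datetime.fromisoformat() before Python 3.11 rejects a trailing "Z".
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def shortest_gap_breakpoint(runs, old, new):
    """Split a time-sorted series once: `old` runs first, then `new` runs.

    Returns (kept_runs, gap), where gap is the timedelta between the last
    kept `old` run and the first kept `new` run. If no split keeps at least
    one run of each version, the runs are returned unchanged with gap None.
    """
    best = None
    for k in range(len(runs) + 1):
        before = [r for r in runs[:k] if r["browser_version"] == old]
        after = [r for r in runs[k:] if r["browser_version"] == new]
        if not before or not after:
            continue
        gap = parse_time(after[0]["time_start"]) - parse_time(before[-1]["time_start"])
        # Smaller gap wins; on a tie, the split that keeps more runs wins.
        score = (gap, -(len(before) + len(after)))
        if best is None or score < best[0]:
            best = (score, before + after)
    if best is None:
        return list(runs), None
    return best[1], best[0][0]
```

A "keep the most runs" strategy would simply swap the two elements of the score, which is part of why several strategies could reasonably be offered.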

web-platform-tests / wpt.fyi

Provide an API for listing a series of known good runs #3508