web-platform-tests / results-analysis

Metrics generation for wpt.fyi

Interop score for interop-2023-motion [stable] seems nonsense #201

Open gsnedders opened 11 months ago

gsnedders commented 11 months ago

Currently the Motion Path [stable] graph looks like:

[Image: Screenshot 2023-11-15 at 11 50 18]

At the beginning of the year, the Interop score was higher than the Firefox or Chrome score, which seems impossible: 54.7% of tests cannot pass in all three browsers if one browser passes only 44.3%.
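To make the bound concrete, here is a minimal sketch under simplifying assumptions: each test is a plain pass/fail (the real scoring also weighs subtests), every score uses the same test count as its denominator, and the function name and test data are made up for illustration. Under those assumptions the interop score can never exceed the lowest browser score, because the tests passing everywhere are a subset of the tests passing in any one browser.

```python
from typing import Dict, Set, Tuple


def browser_and_interop_scores(
    passing: Dict[str, Set[str]], total_tests: int
) -> Tuple[Dict[str, float], float]:
    """Return per-browser scores and the interop score over one shared test set.

    `passing` maps a browser name to the set of tests it passes.
    (Hypothetical data shape, not the format used by results-analysis.)
    """
    browser_scores = {b: len(tests) / total_tests for b, tests in passing.items()}
    # A test counts toward interop only if every browser passes it.
    passing_everywhere = set.intersection(*passing.values())
    interop_score = len(passing_everywhere) / total_tests
    return browser_scores, interop_score


# Made-up example: 5 tests in the focus area.
passing = {
    "chrome": {"t1", "t2", "t3"},
    "firefox": {"t1", "t2"},
    "safari": {"t1", "t2", "t3", "t4"},
}
browser_scores, interop_score = browser_and_interop_scores(passing, total_tests=5)

# With a shared denominator, interop is bounded by the weakest browser.
assert interop_score <= min(browser_scores.values())
```

A graph that violates this bound suggests the scores being compared were not computed over the same set of tests.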

DanielRyanSmith commented 9 months ago

I just did some investigating on this. It is caused by the number of tests in the focus area changing over time (presumably new tests were added later in the year). The experimental chart shows the same strange interop score, which corrects itself sometime in May. Before then, the focus area was scored on 73 individual tests; after May, it was scored on 93 tests, which is nearly the number used today.

The likely way to circumvent this is to calculate the interop scores at the end of the entire yearly calculation, normalizing by the number of tests present in the final runs rather than by the number of tests present in that day's runs.
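The suggestion above could be sketched roughly as follows. This is an illustration only, not the code in results-analysis: the data shape, the function names, and the pass/fail-per-test simplification are all assumptions, and the sample runs are invented.

```python
from typing import Dict, List, Set

# Hypothetical shape: one dict per day, mapping test name -> browsers passing it.
DailyRun = Dict[str, Set[str]]
BROWSERS = {"chrome", "firefox", "safari"}


def interop_passes(run: DailyRun) -> int:
    """Count the tests that pass in every browser on a given day."""
    return sum(1 for passing in run.values() if BROWSERS <= passing)


def scores_per_day_denominator(runs: List[DailyRun]) -> List[float]:
    """Divide each day's count by the tests present on that day."""
    return [interop_passes(run) / len(run) for run in runs]


def scores_final_denominator(runs: List[DailyRun]) -> List[float]:
    """Divide each day's count by the tests present in the final run.

    This needs the whole series first, so it runs after the yearly calculation.
    """
    final_count = len(runs[-1])
    return [interop_passes(run) / final_count for run in runs]


# Invented data: 4 tests early in the year, 5 tests in the final run.
early = {
    "a": set(BROWSERS),
    "b": set(BROWSERS),
    "c": {"chrome"},
    "d": {"safari"},
}
final = {**early, "c": set(BROWSERS), "e": {"chrome", "safari"}}
runs = [early, final]

per_day = scores_per_day_denominator(runs)     # 2/4, then 3/5
normalized = scores_final_denominator(runs)    # 2/5, then 3/5
```

With a fixed denominator, every point in the series is measured against the same test count, so an early run with fewer tests cannot inflate its score relative to later runs.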