Closed · bobholt closed this 6 years ago
Looks great! Did we really have a PR that took 7 days to get started?
That's on my branch, and it's because I restarted that job over and over again for a week. Travis CI re-uses job and build IDs (which we share), which will skew the percentage down in cases where builds have been re-triggered. But, as we discussed, that's a good thing when builds are re-triggered because of CI errors. I'm probably going to file an enhancement issue to move to our own IDs so that we can track initial builds vs. rebuilds.
On the metrics question, I love that "Jobs completed in under 30 minutes" maps directly to our OKR scoring, and I'd like to land that metric first and keep it at least until Q3 is over. But I also think that as we improve, the number is going to creep closer to 100%, and it'll cease to be meaningful for setting goals. Could you experiment with calculating the 50th percentile (median) latency and the 90th percentile latency?
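For reference, the p50/p90 idea is a one-liner with the standard library. This is just a sketch; the function name and the assumption that durations arrive as minutes are mine, not from the dashboard code:

```python
import statistics

def latency_percentiles(durations_min):
    """Return (p50, p90) of job durations, given in minutes.

    method='inclusive' does linear interpolation between data points,
    matching the common "percentile" definition.
    """
    # quantiles(n=10) returns the 9 decile cut points:
    # index 4 is the median (p50), index 8 is the 90th percentile.
    q = statistics.quantiles(sorted(durations_min), n=10, method="inclusive")
    return q[4], q[8]
```

For example, `latency_percentiles(range(1, 10))` gives a median of 5.0 and a p90 of 8.2.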
@Hexcles, do you have any thoughts on how to set and score OKRs based on latency from import/export OKR planning?
That looks quite neat!
@foolip Regarding latency-based KRs, we currently use two variants in Q3:
Some thoughts:
I haven't materialized the actual numbers for import/export yet, but will propose some next week.
@Hexcles, thanks, that's very helpful. I think the "all jobs under X min" variant better captures what we're aiming for with import, export, and PR results: something highly reliable, where people can count on the delay we aim for. However, jobs will usually be much faster than that bound, so it wouldn't tell people what to expect in the typical case.
Since we're not really free to change the shape of the delay distribution however we like, a single metric that we make slightly aggressive compared to past performance is probably OK. @bobholt, WDYT?
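To make the "single metric, slightly aggressive vs. past performance" idea concrete, the OKR score under the threshold variant is just the fraction of jobs finishing within the bound. A minimal sketch (the function name and signature are illustrative, not from the PR):

```python
def okr_score(durations_min, threshold=30):
    """Fraction of jobs completing within `threshold` minutes.

    Returns a value in [0, 1] suitable for direct use as an OKR score.
    """
    if not durations_min:
        return 0.0
    return sum(1 for d in durations_min if d <= threshold) / len(durations_min)
```

With `okr_score([10, 20, 40])`, two of three jobs beat the 30-minute bound, giving a score of about 0.67. Tightening `threshold` quarter over quarter is one way to keep the metric "slightly aggressive" as the distribution improves.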
@jgraham is OOO, @gsnedders or I will review this when done.
@bobholt, I saw you pushed some changes, is this ready for review?
I have added a bunch to this PR:
There's a lot there now. You can check it all out at https://pulls-staging.web-platform-tests.org/performance?start=2017-08-15&end=2017-09-30. This is data dumped from the production database yesterday morning, and should serve as a fairly accurate representation of what it will look like in production.
cc @foolip
This adds a performance-tracking page at https://pulls.web-platform-tests.org/performance to demonstrate that pull requests are being tested in a timely manner. This is to aid in calculating the Google OKR that PRs are tested in each browser within 30 minutes, but is generally useful information for verifying timeliness of the CI process.
Features:
- The default date range depends on whether `today` falls in the first half or the last half of a quarter.
- Jobs with `CREATED`, `QUEUED`, `STARTED`, and `ERRORED` statuses are automatically scored as 0; jobs with `PASSED`, `FAILED`, and `FINISHED` statuses are scored on completion time.

This also includes some cleanup of templates and config I encountered while testing.