web-platform-tests / results-collection

No Edge or Safari stable results in >24 hours #607

Closed foolip closed 6 years ago

foolip commented 6 years ago

https://wpt.fyi/test-runs has not seen results for Edge or Safari since 2018-09-24T01:15:29.790252Z and 2018-09-24T03:17:40.904069Z, respectively. That is now more than 24 hours ago, while the full runs take less than 12 hours, so something is probably wrong.
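The 24-hour check above can be sketched in a few lines. This is only an illustration: the helper names `parse_created_at` and `is_stale` and the value of `now` are assumptions, not part of any existing tooling.

```python
from datetime import datetime, timedelta, timezone


def parse_created_at(stamp):
    """Parse a created_at value such as 2018-09-24T01:15:29.790252Z."""
    # Split off the fractional seconds so that fractions with fewer than
    # six digits (e.g. ".9721") are handled as well.
    whole, _, fraction = stamp.rstrip("Z").partition(".")
    parsed = datetime.strptime(whole, "%Y-%m-%dT%H:%M:%S")
    micros = int(fraction.ljust(6, "0")[:6]) if fraction else 0
    return parsed.replace(microsecond=micros, tzinfo=timezone.utc)


def is_stale(latest_created_at, now, limit=timedelta(hours=24)):
    """Return True when the newest run is older than the limit."""
    return now - parse_created_at(latest_created_at) > limit


# Hypothetical "now", chosen only for illustration.
now = datetime(2018, 9, 25, 6, 0, tzinfo=timezone.utc)
print(is_stale("2018-09-24T01:15:29.790252Z", now))  # Edge
print(is_stale("2018-09-24T03:17:40.904069Z", now))  # Safari
```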

@jugglinmike @mariestaver FYI

jugglinmike commented 6 years ago

Results are now available for 2018-09-24, without my intervention. The delay may have been caused by gh-605, which I only took steps to resolve yesterday afternoon.

To be honest, though, Sauce Labs builds are so irregular that I don't have a good sense for when they typically complete. I wrote a script (included below) to get some data:

$ curl 'https://wpt.fyi/api/runs?product=edge&product=safari&labels=buildbot,stable&max-count=20' --silent | python experiments/2018-09-25-timing.py 
edge | safari
-----|-------
2018-09-05T04:11:57.221838Z | missing
2018-09-06T06:54:56.079204Z | missing
2018-09-07T10:08:51.028023Z | 2018-09-07T09:23:52.450055Z
2018-09-08T09:41:39.443402Z | 2018-09-08T02:47:46.253521Z
2018-09-09T13:25:55.224959Z | 2018-09-09T18:23:05.655249Z
2018-09-10T08:45:00.339778Z | 2018-09-10T09:42:00.292008Z
2018-09-11T14:46:50.280199Z | 2018-09-11T09:07:52.436175Z
missing | 2018-09-12T10:56:38.05937Z
2018-09-13T06:39:05.531556Z | 2018-09-13T09:20:22.159312Z
2018-09-14T07:38:54.254088Z | 2018-09-14T10:06:03.158894Z
2018-09-15T06:49:07.095779Z | 2018-09-15T10:29:01.744232Z
2018-09-16T03:49:12.577577Z | 2018-09-16T06:39:20.445775Z
2018-09-17T03:15:25.279373Z | 2018-09-17T04:56:38.038864Z
2018-09-18T06:24:24.505601Z | 2018-09-18T09:53:21.336091Z
2018-09-19T06:08:01.782948Z | 2018-09-19T09:34:15.331304Z
2018-09-20T04:56:24.9721Z | 2018-09-20T08:10:08.294358Z
2018-09-21T15:07:03.912029Z | 2018-09-21T10:01:47.027585Z
2018-09-22T02:39:05.054698Z | 2018-09-22T04:44:20.707869Z
2018-09-23T01:17:10.551378Z | 2018-09-23T03:20:25.135589Z
2018-09-24T01:15:29.790252Z | 2018-09-24T03:17:40.904069Z
2018-09-25T15:43:57.817489Z | 2018-09-25T13:16:28.18537Z

I was surprised by the amount of variation until I remembered that prior to the resolution of gh-602, some Firefox builds were taking over 6 hours to complete. That's relevant due to the way workers are scheduled.

Workers may only make one collection attempt at a time. A worker that is actively testing Firefox may be assigned a Sauce Labs job, though. In that case, the job will be deferred until the worker is ready, even if other workers become available in the meantime.

If a worker takes 6 hours to complete a Firefox build, then any Sauce Labs build assigned to it will be severely delayed. We can't upload until all the jobs are complete, so this delay is apparent in the result's created_at date.
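The effect can be illustrated with a toy model. This is a sketch under simplifying assumptions, not the actual scheduler: the `completion_times` function, the worker names, and all durations are hypothetical. It only encodes the two rules described above: a worker runs one job at a time, and a job stays with the worker it was assigned to.

```python
def completion_times(assignments):
    """Simulate workers that each run one job at a time.

    assignments is a list of (worker, submitted_hour, duration_hours, name)
    tuples ordered by submission time. A job stays with its assigned worker
    and starts only once that worker is free.
    """
    free_at = {}   # worker -> hour at which it becomes idle
    finished = {}  # job name -> hour at which the job completes
    for worker, submitted, duration, name in assignments:
        start = max(submitted, free_at.get(worker, 0))
        free_at[worker] = start + duration
        finished[name] = free_at[worker]
    return finished


# All numbers are hypothetical. Worker A starts a 6-hour Firefox build at
# hour 0 and worker B starts a 1-hour build at the same time. A 3-hour
# Sauce Labs job is then assigned to worker A at hour 1.
jobs = [
    ("A", 0, 6, "firefox"),
    ("B", 0, 1, "other"),
    ("A", 1, 3, "sauce-labs"),
]
print(completion_times(jobs))
# {'firefox': 6, 'other': 1, 'sauce-labs': 9}
```

In this model the Sauce Labs job completes at hour 9, although worker B has been idle since hour 1 and could have finished it by hour 4.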

This is clearly a deficiency of the scheduling algorithm: I didn't consider the possibility of 6-hour jobs when building it. However, this is a failure condition. We want the system to be robust, of course, but in terms of current priorities, it's probably best for now to fix the underlying issue. I've filed gh-608 to track the enhancement.

2018-09-25-timing.py

```py
import json
import sys

from dateutil import parser

runs = json.loads(sys.stdin.read())


def compare(a, b):
    # Three-way comparison of two runs by creation time (not called below).
    a_created = parser.parse(a['created_at'])
    b_created = parser.parse(b['created_at'])
    return (a_created > b_created) - (a_created < b_created)


# Group runs by calendar date, keyed by browser name.
by_date = {}
for run in runs:
    date_str = run['created_at'].split('T')[0]
    if date_str not in by_date:
        by_date[date_str] = {}
    by_date[date_str][run['browser_name']] = run['created_at']

grouped = [by_date[date_str] for date_str in sorted(by_date.keys())]

# Emit a Markdown table with one row per date.
print('edge | safari')
print('-----|-------')
for day in grouped:
    print('%s | %s' % (day.get('edge', 'missing'), day.get('safari', 'missing')))
```