Closed DanielRyanSmith closed 11 months ago
Azure Pipelines seems to have been having problems; the test jobs have often been failing with the test agents going away or no longer responding. Not much we can do here.
As a heads-up, this is still a problem: no stable aligned runs (test results on the same hash for Chrome, Edge, Firefox, and Safari) have been produced since May 13th.
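To make "aligned" concrete, here is a minimal sketch of how one might check a list of run records for aligned revisions, i.e. commits where all four browsers have results. The record shape (`revision` and `browser_name` keys) and the function name are assumptions for illustration only, not a confirmed wpt.fyi schema.

```python
from collections import defaultdict

# Browsers that must all report results on the same commit for a run to be "aligned".
REQUIRED_BROWSERS = frozenset({"chrome", "edge", "firefox", "safari"})


def aligned_revisions(runs, required=REQUIRED_BROWSERS):
    """Return revisions that have a run for every required browser.

    `runs` is an iterable of dicts with hypothetical keys "revision" and
    "browser_name". Revisions are returned in first-seen order.
    """
    browsers_by_revision = defaultdict(set)
    for run in runs:
        browsers_by_revision[run["revision"]].add(run["browser_name"].lower())
    # A revision is aligned when the required set is a subset of what was seen.
    return [
        revision
        for revision, browsers in browsers_by_revision.items()
        if required <= browsers
    ]


if __name__ == "__main__":
    sample = [
        {"revision": "aaa111", "browser_name": "chrome"},
        {"revision": "aaa111", "browser_name": "edge"},
        {"revision": "aaa111", "browser_name": "firefox"},
        {"revision": "aaa111", "browser_name": "safari"},
        # No Safari result for this commit, so it is not aligned.
        {"revision": "bbb222", "browser_name": "chrome"},
        {"revision": "bbb222", "browser_name": "edge"},
        {"revision": "bbb222", "browser_name": "firefox"},
    ]
    print(aligned_revisions(sample))
```

With a check like this, a single browser failing to upload (Safari, in this case) is enough to drop a commit from the aligned set, which is why one flaky job can stall the aligned runs entirely.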
The most recent run is now from May 23.
Edge is failing too; I've filed https://github.com/web-platform-tests/wpt/issues/40300. But Edge failing shouldn't affect Safari results being uploaded, and vice versa.
Edge is fixed, but Safari is still broken to the point that it's effectively breaking Interop scoring (we're getting maybe one update a week). I've submitted https://github.com/web-platform-tests/wpt/pull/40362 to see if retries work/help, but in any case we need to address the underlying problem. Do we have contacts on the Azure side who could investigate?
We're getting a lot of:
[error]We stopped hearing from agent Azure Pipelines 10. Verify the agent machine is running and has a healthy network connection. Anything that terminates an agent process, starves it for CPU, or blocks its network access can cause this error. For more information, see: https://go.microsoft.com/fwlink/?linkid=846610
…and there's nothing to suggest that somehow we're actually stopping the agent from running, or somehow ending up with Safari spinning or something. So theoretically something could've changed such that we're now starving the agent process of CPU, but it seems unlikely for that to have suddenly started?
@mustjab can you help with a contact on the Azure Pipelines team if @gsnedders needs it for debugging this issue?
The best way to start would be to file an issue with the azure-pipelines-agent team: https://github.com/microsoft/azure-pipelines-agent/issues/new/choose. I've looked through their open issues and didn't find any recent Mac issues, but this might be a related one: https://github.com/microsoft/azure-pipelines-agent/issues/3994
Thanks @mustjab!
I've filed a new bug on Azure. Please let me know if I got any of the details wrong, or missed something important.
@jgraham Looks like they've asked us to open an issue with the agent team instead. Did you get a chance to file it?
https://github.com/microsoft/azure-pipelines-agent/issues/4313
@mustjab do you have any more context on the ongoing investigation that you can share?
Checked with the agent team and they haven't looked at your issue yet, but they mentioned that there was a similar report that they already resolved. Are you still seeing this happen in WPT runs?
It is still happening, here is one case from today.
I filed actions/runner-images#7754
From there, the internal issue has been resolved, and it seems like things have been much better over the last few days (comparable to where we were on the macos-12 images).
It's gone back to being less reliable, but as mentioned in the other issue:
fix is to be delivered around mid-august (reason for being better right now is not very clear)
The fix seems to be deployed and working reasonably well now, shall we close this?
Alas, I've still been kicking them manually a whole bunch, so I'm not sure it's working that well. I was planning on trying to follow up sometime soon.
See this filtered view of builds—there's still a fair bit of red (and white!) there, even since the new images went live.
@mustjab any further updates on the effort to fix this? The link from Sam's comment above still shows frequent failures.
@past A fair percentage of the failures are Edge, or caused by macOS bugs. https://wpt.fyi/runs?label=master&label=experimental&aligned and https://wpt.fyi/runs?label=master&label=stable&aligned both show plenty of recent aligned runs, so I'm not too concerned at this point. I vote we close this?
Sounds good to me, we can open a new one if needed.
No stable Safari results have been available since May 15th. wpt.fyi run status is not showing any recent invalid Safari runs. Not quite sure of the reason here. Creating this issue for visibility.
CC @gsnedders