d-netto closed 2 months ago
We've had that for months now
Any updates on this?
Saw it happening again on https://buildkite.com/julialang/julia-master/builds/37743#019056c0-be81-462a-8e83-bce634b93f28.
IIRC, @staticfloat and others have spent a lot of time looking into this, and so far we still don't know what the underlying problem is.
In the short term, the workaround is likely going to be to just manually retry that job when it fails.
Thanks for the clarification.
Another workaround that I think would be nice to implement:
If a Windows job fails, and the runtime of the job was <= 60 seconds, automatically retry the job, up to a maximum of N total tries (for a reasonable value of N). However, if a Windows job fails, and the runtime of the job was > 60 seconds, then don't retry the job.
The hard part (the part that I don't know how to implement) is gating the auto retry on the job duration, because we don't want to unconditionally retry all failed Windows jobs, just the short ones.
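For illustration, the retry rule described above can be sketched as a pure decision function. This is only a sketch: `JobResult`, `should_retry`, and the default of 3 total tries are hypothetical names and values, not anything from the Julia CI configuration, and it does not solve the open question of how to feed the job duration into the CI system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JobResult:
    """Hypothetical summary of one finished CI job attempt."""
    platform: str            # e.g. "windows" or "linux"
    passed: bool             # True if the job succeeded
    duration_seconds: float  # wall-clock runtime of this attempt
    attempt: int             # 1 for the first try, 2 for the first retry, ...


def should_retry(job: JobResult, max_tries: int = 3,
                 short_failure_seconds: float = 60.0) -> bool:
    """Return True only for short-lived Windows failures with tries left.

    max_tries is the "N" from the proposal; 3 is an assumed placeholder.
    """
    if job.passed:
        return False
    if job.platform.lower() != "windows":
        return False
    # Failures that ran longer than the threshold are never retried.
    if job.duration_seconds > short_failure_seconds:
        return False
    # Retry only while the total number of tries stays within max_tries.
    return job.attempt < max_tries
```

With these assumed defaults, a Windows job that fails after 12 seconds on its first attempt would be retried, while one that fails after 10 minutes would not.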
I don't know where this was written down, but the next step on this issue was to run peflags -v bash.exe on the .exe file in our Windows images and see if high-entropy-va is set.
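If peflags is not at hand, the same bit can probably be read straight from the PE header. The sketch below assumes that high-entropy-va corresponds to the `IMAGE_DLLCHARACTERISTICS_HIGH_ENTROPY_VA` bit (0x0020) of the `DllCharacteristics` field in the optional header; the function name `has_high_entropy_va` is made up for this example, and this is not a replacement for the peflags output itself.

```python
import struct

# Bit in the optional header's DllCharacteristics field (PE/COFF format).
IMAGE_DLLCHARACTERISTICS_HIGH_ENTROPY_VA = 0x0020


def has_high_entropy_va(pe_bytes: bytes) -> bool:
    """Report whether the high-entropy-VA bit is set in a PE image."""
    if pe_bytes[:2] != b"MZ":
        raise ValueError("not a PE file: missing MZ signature")
    # e_lfanew at offset 0x3C holds the file offset of the PE signature.
    (pe_offset,) = struct.unpack_from("<I", pe_bytes, 0x3C)
    if pe_bytes[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("not a PE file: missing PE signature")
    # The optional header follows the 4-byte signature and the
    # 20-byte COFF file header.
    optional_header = pe_offset + 4 + 20
    # DllCharacteristics sits at offset 70 in both PE32 and PE32+.
    (dll_characteristics,) = struct.unpack_from(
        "<H", pe_bytes, optional_header + 70)
    return bool(dll_characteristics
                & IMAGE_DLLCHARACTERISTICS_HIGH_ENTROPY_VA)
```

It could be called as `has_high_entropy_va(open("bash.exe", "rb").read())`, where the path is whatever location bash.exe has in the image being inspected.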
Ah, we did look into it. Should have been fixed by https://github.com/JuliaCI/rootfs-images/pull/250.
We still have more intermittent Windows issues, but let's open new issues for those to segregate failure logs after that change.
Saw this on https://buildkite.com/julialang/julia-master/builds/35570#018ee359-5c7a-43fc-8309-a24a571e8a38 and https://buildkite.com/julialang/julia-master/builds/35570#018ee395-c4e2-4153-9db3-9fe9822e3bff.
Not sure if it's transient.