Closed hannesm closed 1 year ago
Thank you for reporting this issue. I have made a preliminary investigation: `env` contains the environment variables and is extracted from the Docker base image by running `docker image inspect` and saving `.Config.Env` to a file. This file is missing because the worker exited with a fatal exception while the image was being extracted about half an hour earlier. Investigating that issue showed that the worker was running low on disk space and needed to prune virtually everything from the cache. The prune operation removed a cached layer that was a dependency of a running job. The delete cascaded to the child layers, one of which could not be removed because it was in use, which caused the exception. Items are selected for pruning by considering all cache layers that were last used more than 10 minutes ago, ordered by time last used. In this case, the 10-minute window was insufficient. I will look into this further tomorrow.
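To illustrate the failure mode described above, here is a minimal sketch in Python of an age-based prune selection that also refuses to pick a layer when any of its descendants is in use. This is not the obuilder implementation; the `Layer` type and the `descendants` and `prune_candidates` functions are hypothetical names invented for this example, and only the 10-minute threshold comes from the description above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, List, Optional, Set


@dataclass
class Layer:
    """Hypothetical cache layer record (not obuilder's real data model)."""
    id: str
    parent: Optional[str]
    last_used: datetime
    in_use: bool = False


def descendants(layers: Dict[str, Layer], root: str) -> Set[str]:
    """Return the ids of all layers built, directly or indirectly, on root."""
    children: Dict[str, List[str]] = {}
    for layer in layers.values():
        if layer.parent is not None:
            children.setdefault(layer.parent, []).append(layer.id)
    seen: Set[str] = set()
    stack = [root]
    while stack:
        current = stack.pop()
        for child in children.get(current, []):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def prune_candidates(
    layers: Dict[str, Layer],
    now: datetime,
    min_age: timedelta = timedelta(minutes=10),
) -> List[str]:
    """Select layers last used before the cutoff, least recently used first.

    A layer is skipped when it, or any descendant, is in use, because
    deleting it would cascade into a layer that cannot be removed.
    """
    cutoff = now - min_age
    old = sorted(
        (layer for layer in layers.values() if layer.last_used < cutoff),
        key=lambda layer: layer.last_used,
    )
    safe: List[str] = []
    for layer in old:
        subtree = descendants(layers, layer.id) | {layer.id}
        if any(layers[i].in_use for i in subtree):
            continue
        safe.append(layer.id)
    return safe
```

With a check like this, an old base layer whose child belongs to a running job would be left in place regardless of its age, so the cascade would never reach an in-use layer. Whether such a check fits the actual pruning code is an open question.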
FWIW I had a similar failure on my repo, and I worked around it by pushing an empty commit: `git commit -am "bump for ci" --allow-empty` (this still took advantage of most existing caching but got the broken worker out of this situation). Simply restarting builds didn't help; it kept failing with the same error (and yes, I did notice an out-of-space condition earlier, which affected both `opam` CI and `ocaml` CI).
I will close this issue, since there has been a commit to "ocurrent/obuilder" that may solve it.
As you mentioned in #858, the CI service is considered to be stable.
Now, I just observed some failure at: https://ocaml.ci.dev/github/robur-coop/albatross/commit/2f316d2e49866fe08b9e12c12194062bbbaa2329/variant/debian-12-5.0_opam-2.1 -- and I've seen similar logs before, so maybe there's a way to tackle the root cause.
Since the CI sometimes removes all the logs, I paste below the entire log from the link above:
And - as reported earlier, pasting from the Web UI is bad (it injects lots of newlines). I thought you had fixed that issue, but it looks like there's a regression.