Closed DhanshreeA closed 8 months ago
Hi @DhanshreeA, I am highlighting this model: https://github.com/ersilia-os/eos9taz/actions. I need it for chemsampler, but it is not passing either :)
@GemmaTuron, I'm aware of this (https://github.com/ersilia-os/eos9taz/issues/11). For now, I've pushed it manually to unblock you. However, there's a related issue that I'm on top of: #1068, which I opened specifically with this model in mind.
Root Cause: GitHub-hosted runners receive regular software updates. These can include any or all of the following: runner updates, runner provisioner updates, patches to the runner OS, and updates to the software bundled with the OS. Every such update is likely to eat into the available disk space. For example, the runner OS ships with many preinstalled tools that we do not need for our builds. So far there is no straightforward solution other than an aggressive disk cleanup, which has been implemented at the level of the eos-template repository. While this works for now, there is no guarantee that this issue will not come up again.
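To illustrate what an aggressive disk cleanup can look like, here is a minimal Python sketch. It is not the code used in eos-template. The function names `free_gib` and `clean_dirs` are hypothetical, and the paths in `CANDIDATE_DIRS` are assumptions based on directories commonly reported as large on Ubuntu hosted runners; check them against the actual runner image before deleting anything.

```python
import shutil
from pathlib import Path

# Assumed examples only: directories often reported as large on Ubuntu
# hosted runners. Verify against the real runner image before use.
CANDIDATE_DIRS = ["/usr/share/dotnet", "/usr/local/lib/android", "/opt/ghc"]


def free_gib(path="/"):
    """Return the free space, in GiB, on the filesystem containing `path`."""
    return shutil.disk_usage(path).free / 2**30


def clean_dirs(dirs):
    """Delete each directory in `dirs` that exists.

    Returns the list of paths that were actually removed. Missing
    directories are skipped, and directories that could not be fully
    deleted (for example, due to permissions) are not reported as removed.
    """
    removed = []
    for d in dirs:
        p = Path(d)
        if p.is_dir():
            shutil.rmtree(p, ignore_errors=True)
            if not p.exists():
                removed.append(str(p))
    return removed
```

In a workflow step, one could log `free_gib()` before and after calling `clean_dirs(CANDIDATE_DIRS)` to see how much space the cleanup recovers ahead of the Docker build. This does not prevent the problem from coming back after a future runner image update; it only makes the available space visible.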
Describe the bug.
A number of our model pipelines fail, typically in the "upload model to dockerhub" stage, because of "No space left on device". Exhibits:
Describe the steps to reproduce the behavior
Go to any one of the jobs above and re-run it. I tried running these jobs when no other jobs were running on our runners, and they still failed.
Expected behavior.
These jobs should pass.
Screenshots.
No response
Operating environment
Runner OS
Additional context
Potentially useful resources: