The persistent runner is out of disk space:
docker: failed to register layer: write /var/cuda-repo-ubuntu2004-12-2-local/nsight-systems-2023.2.3_2023.2.3.1001-1_amd64.deb: no space left on device.
Ideas:
Check the docker image size before and after this change
Check the docker image cache on the runner
Clear the docker image cache on the runner and hope multiple jobs don't redownload different images
Increase the disk size
See what else is taking up space on the runner (old run artifacts? install directory? benchmark model weights etc.?)
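For the "what else is taking up space" idea, a minimal sketch of a disk check that could be run on the runner. It uses only the Python standard library; the function names are made up for illustration, and the directories worth scanning depend on how the runner is set up, which is not known here.

```python
import os
import shutil


def free_space_gib(path="/"):
    """Return free space, in GiB, on the filesystem that contains `path`."""
    return shutil.disk_usage(path).free / 2**30


def dir_size_bytes(root):
    """Sum the sizes of regular files under `root` (symlinks are not followed)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                if not os.path.islink(full):
                    total += os.path.getsize(full)
            except OSError:
                # File vanished or is unreadable; skip it rather than abort.
                pass
    return total


def largest_subdirs(root, top=5):
    """Return up to `top` (size_bytes, path) pairs for the immediate
    subdirectories of `root`, largest first."""
    sizes = []
    with os.scandir(root) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                sizes.append((dir_size_bytes(entry.path), entry.path))
    return sorted(sizes, reverse=True)[:top]
```

Calling `largest_subdirs` on the runner's work directory (wherever that lives) would show whether old run artifacts or model weights dominate. Docker's own usage is probably easier to read from `docker system df`, since its data directory usually needs root to walk.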
The postsubmit test_nvidia_a100 job started failing after this was merged: https://github.com/iree-org/iree/actions/runs/9493590493/job/26189543324#step:8:60