nerc-project / operations

Issues related to the operation of the NERC OpenShift environment

Buildconfigs are failing on OCP prod #816

Closed: Milstein closed this issue 1 week ago

Milstein commented 1 week ago

Hello,

I have several BuildConfigs that were previously working but are now failing at the final push step with a 500 error.

Here's an example:

https://console.apps.shift.nerc.mghpcc.org/k8s/ns/project-robbie-6f75ac/builds/robbie-build-config-1.0.0-gpu-py3.10-torch2.2-ubuntu22.04-beta-6/logs

Please advise.

jtriley commented 1 week ago

Looks like the image-registry volume is full. It's currently at 300GiB. We either need to resize or ask folks to clean up images. Do they have old images they can remove to free up space in the meantime?

Milstein commented 1 week ago

@jtriley:

I deleted everything we weren't using and it is still failing.

- Matt

jtriley commented 1 week ago

Just noting @waygil would like to bump the image registry storage to 750GiB for now.
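
For reference, a minimal sketch of what that resize could look like. The thread does not name the claim, so the namespace `openshift-image-registry` and the PVC name `image-registry-storage` are assumptions (they are the usual defaults for the registry operator), and `build_resize_command` is a hypothetical helper. The script only builds and prints the `oc patch` command; it does not run it. Expansion would also only succeed if the backing storage class allows volume expansion.

```python
import json
import shlex
from typing import List

# Assumptions, not confirmed in this thread: the namespace and claim name
# below are the usual defaults for the OpenShift image registry.
NAMESPACE = "openshift-image-registry"
PVC_NAME = "image-registry-storage"


def build_resize_command(new_size: str) -> List[str]:
    """Return the argv for an `oc patch` call that requests a larger PVC."""
    # Merge patch that changes only the requested storage size.
    patch = {"spec": {"resources": {"requests": {"storage": new_size}}}}
    return [
        "oc", "patch", "pvc", PVC_NAME,
        "-n", NAMESPACE,
        "--type", "merge",
        "-p", json.dumps(patch),
    ]


if __name__ == "__main__":
    # Kubernetes writes 750GiB as the quantity "750Gi".
    print(shlex.join(build_resize_command("750Gi")))
```

Printing the command instead of executing it leaves room to review it, and to confirm the actual claim name, before anything is changed on the cluster.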