Consider this series of events:

1. The storage calculator pod starts running to check the storage of a service with a persistent storage volume, e.g. `solr`.
2. A Lagoon build is triggered, causing the `solr` pod to be redeployed.
3. The `solr` pod is scheduled onto a different Kubernetes node than the storage calculator.
4. The `solr` pod will now be stuck in `ContainerCreating` because it cannot bind the `RWO` volume, which is already attached to the storage calculator pod.

This will cause a Lagoon build failure that the customer cannot influence. In fact, all subsequent builds will fail until one of the following occurs:

* the storage calculator pod completes and exits, freeing the volume binding. This could take several hours.
* the `solr` pod happens to be scheduled onto the same node as the storage calculator.

Here's a screenshot of this problem occurring. The storage calculator has been running (and blocking any Lagoon builds) for over two hours at this point:

![screenshot_2023-12-18-102634](https://github.com/uselagoon/storage-calculator/assets/861778/f74fc87d-e6f6-4701-9df9-70174b35803d)
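
To make the sequence of events above easier to reason about, here is a minimal toy simulation of the conflict. It is only a sketch: `RWOVolume`, `simulate`, and the node names `node-a` and `node-b` are hypothetical names invented for illustration, and it does not use the Kubernetes or Lagoon APIs. It assumes the usual `ReadWriteOnce` behaviour, where a volume can be used from a single node at a time but may be shared by pods on that same node.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class RWOVolume:
    """Toy model of a ReadWriteOnce volume (hypothetical, not a real API)."""
    name: str
    attached_node: Optional[str] = None
    users: Set[str] = field(default_factory=set)

    def mount(self, pod: str, node: str) -> bool:
        # The mount succeeds if the volume is free or already on this node.
        if self.attached_node in (None, node):
            self.attached_node = node
            self.users.add(pod)
            return True
        return False

    def unmount(self, pod: str) -> None:
        # The volume is released once its last user goes away.
        self.users.discard(pod)
        if not self.users:
            self.attached_node = None


def simulate(solr_node: str, calculator_finished: bool) -> str:
    """Return the status the redeployed solr pod would end up in."""
    volume = RWOVolume("solr-data")
    # Step 1: the storage calculator starts on node-a and mounts the volume.
    volume.mount("storage-calculator", "node-a")
    if calculator_finished:
        volume.unmount("storage-calculator")
    # Steps 2-4: a build redeploys solr, which is scheduled onto solr_node.
    return "Running" if volume.mount("solr", solr_node) else "ContainerCreating"


print(simulate("node-b", calculator_finished=False))  # ContainerCreating
print(simulate("node-b", calculator_finished=True))   # Running
print(simulate("node-a", calculator_finished=False))  # Running
```

The first call corresponds to the stuck build; the other two correspond to the two ways out listed above (the calculator exits, or `solr` lands on the same node).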