On our Kubernetes clusters we have many namespaces, one for each customer. We deploy a postgres-operator in each namespace for compliance reasons, to keep each deployment completely separate. To remove a customer, we typically just delete the entire namespace. The problem is that the postgres-operator exits immediately and leaves all of the Postgres CRs stuck in the Terminating state, because they still carry the finalizer.db.movetokube.com finalizer.
Proposal:
The controller-runtime manager has a GracefulShutdownTimeout option that is not set in this operator (see code here). This likely means it defaults to time.Duration(0), which according to the docs means that graceful shutdown is disabled. If we set it to a reasonable value (say 30 seconds?), the operator would likely have enough time to notice the terminating CRs and clean up after itself before exiting.
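A minimal sketch of the proposed change, assuming the operator builds its manager through a controller-runtime version whose Options struct exposes GracefulShutdownTimeout as a *time.Duration and whose Start method takes a context. The 30-second value and the surrounding main function are illustrative, not the operator's actual code. It is also worth confirming what an unset value resolves to in the controller-runtime version the operator vendors.

```go
package main

import (
	"os"
	"time"

	ctrl "sigs.k8s.io/controller-runtime"
)

func main() {
	// Assumed value taken from the proposal; tune it to how long
	// finalizer cleanup realistically takes.
	gracefulShutdown := 30 * time.Second

	mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{
		// The field is a pointer, so take the address of a local variable.
		GracefulShutdownTimeout: &gracefulShutdown,
	})
	if err != nil {
		os.Exit(1)
	}

	// The operator's controllers would be registered with mgr here (omitted).

	// SetupSignalHandler returns a context that is cancelled on SIGTERM or
	// SIGINT, which is what starts the manager's shutdown sequence.
	if err := mgr.Start(ctrl.SetupSignalHandler()); err != nil {
		os.Exit(1)
	}
}
```

Note that the pod's terminationGracePeriodSeconds (30 seconds by default) would need to be at least as long as this timeout, otherwise the kubelet kills the container before the manager finishes shutting down.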