hashicorp / nomad

Nomad is an easy-to-use, flexible, and performant workload orchestrator that can deploy a mix of microservice, batch, containerized, and non-containerized applications. Nomad is easy to operate and scale and has native Consul and Vault integrations.
https://www.nomadproject.io/

Circular dependency namespace <-> quota during deletion #23672

Closed the-nando closed 1 month ago

the-nando commented 1 month ago

Nomad version

Nomad v1.8.2+ent
BuildDate 2024-07-16T09:08:42Z
Revision f675c14f9ea2fada1fd1e5f1f5e63ce5d07f8cd4

Issue

Before https://github.com/hashicorp/nomad/pull/23499 (tested on 1.7.6+ent) it was possible to delete a namespace that had an associated quota and then delete the quota itself. This is no longer possible: the namespace cannot be deleted while it has a quota, and the quota cannot be deleted while a namespace references it. This breaks setups that use Terraform to provision namespaces and quotas, because teardown now requires manual intervention.

Reproduction steps

Prep:

~ cat > quota.hcl <<EOF
name = "ns-1"
limit {
  region = "global"
  region_limit {
    cpu = 2500
  }
}
EOF
~ nomad quota apply quota.hcl
~ nomad namespace apply -description "My new namespace" -quota "ns-1" ns-1

Before:

~ nomad namespace delete ns-1    
Successfully deleted namespace "ns-1"!
~ nomad quota delete ns-1
Successfully deleted quota "ns-1"!
~

Now:

~ nomad namespace delete ns-1
Error deleting namespace: Unexpected response code: 500 (rpc error: 1 error occurred:
    * namespace "ns-1" has quotas associated with it: [us-east-1])
~ nomad quota delete my-quota
Error deleting quota: Unexpected response code: 500 (rpc error: Quota can't be removed since it is referenced by the following namespaces: [ns-1])
~

The only way to proceed with the deletion now is to unset the quota on the namespace first:

~ nomad namespace apply -quota "" ns-1             
Successfully applied namespace "ns-1"!
~ nomad namespace delete ns-1       
Successfully deleted namespace "ns-1"!
~ nomad quota delete ns-1         
Successfully deleted quota "ns-1"!
~
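
The three-step workaround above could be wrapped in a small script for automated teardown. This is only a minimal sketch: it assumes the `nomad` binary is on the PATH and already configured for the cluster, and `teardown_namespace` is a hypothetical helper name, not part of the Nomad CLI. It uses only the commands shown in the transcript above.

```shell
#!/usr/bin/env bash
# Sketch: delete a namespace and its quota in an order Nomad v1.8.2+ent accepts.
# teardown_namespace is a hypothetical helper, not a Nomad command.
teardown_namespace() {
  local namespace="$1" quota="$2"

  # 1. Detach the quota so the namespace no longer references it.
  nomad namespace apply -quota "" "$namespace" || return 1

  # 2. Delete the namespace, which now has no quota associated with it.
  nomad namespace delete "$namespace" || return 1

  # 3. Delete the quota, which is no longer referenced by the namespace.
  nomad quota delete "$quota" || return 1
}
```

It could be invoked as `teardown_namespace ns-1 ns-1` (namespace name first, then quota name). Each step stops the sequence on failure, so a quota still referenced by another namespace is left in place.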

jrasell commented 1 month ago

Hi @the-nando and thanks for raising this issue. I'll raise this internally to start some discussion as I believe this workflow was considered the correct approach, but we may not have considered the TF implications.

tgross commented 1 month ago

We've done a bit more investigation into this and feel pretty strongly that this is going to be best fixed in the Terraform provider itself. It's on our near-term roadmap to get fixed.