hashicorp / nomad

Nomad is an easy-to-use, flexible, and performant workload orchestrator that can deploy a mix of microservice, batch, containerized, and non-containerized applications. Nomad is easy to operate and scale and has native Consul and Vault integrations.
https://www.nomadproject.io/

Nomad-Consul sync for background SI token reconciliation #6719

Open shoenig opened 4 years ago

shoenig commented 4 years ago

As part of #6701, Nomad Servers will need to be able to perform background Service Identity (SI) token reconciliation.

There is an inherent race condition: a Nomad Server can successfully request a new SI token from Consul, then crash or lose leadership before it is able to persist the details of that token to the raft log. This isn't particularly bad for Nomad, which can simply retry the request later with a new leader, but it could cause orphaned tokens to accumulate in Consul. Since SI tokens are not periodic and have no TTL, such tokens would linger forever. To avoid that, Nomad Servers can periodically request from Consul a list of every token Consul knows about and compare it with every SI token the Nomad cluster knows about. If Consul holds any SI tokens generated for the Nomad cluster (identified by metadata stored in the token's Description field) that Nomad is not aware of, Nomad asks Consul to revoke them.
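
To make the comparison step concrete, here is a rough sketch in Go of how the orphan detection could look. It is not taken from the Nomad codebase: the `ConsulToken` type, the `OrphanedSITokens` function, and especially the `nomad-si:<cluster-id>:` Description prefix are hypothetical names chosen for illustration, and the issue does not specify the actual metadata format. Listing tokens from Consul and revoking them are left out; the sketch covers only the set difference.

```go
package main

import "strings"

// ConsulToken is a hypothetical, trimmed-down view of a Consul ACL token,
// holding only the fields the reconciliation needs.
type ConsulToken struct {
	AccessorID  string
	Description string
}

// siDescriptionPrefix is an ASSUMED marker written into the Description
// field when a token is minted. The real format is not given in the issue.
const siDescriptionPrefix = "nomad-si:"

// OrphanedSITokens returns the accessor IDs of tokens that were minted for
// the given cluster (judged by the Description prefix) but are absent from
// the set of accessor IDs the Nomad cluster has persisted.
func OrphanedSITokens(clusterID string, consulTokens []ConsulToken, known map[string]struct{}) []string {
	prefix := siDescriptionPrefix + clusterID + ":"
	var orphans []string
	for _, tok := range consulTokens {
		// Skip tokens that belong to other clusters or other purposes.
		if !strings.HasPrefix(tok.Description, prefix) {
			continue
		}
		// A token Nomad has no record of is a candidate for revocation.
		if _, ok := known[tok.AccessorID]; !ok {
			orphans = append(orphans, tok.AccessorID)
		}
	}
	return orphans
}
```

A leader could run something like this on a timer, feeding it the token list from Consul and the accessor IDs from its own state, then ask Consul to revoke each returned accessor ID.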

tgross commented 10 months ago

Starting in Nomad 1.7.0-beta.1 we've deprecated the workflow where the servers mint SI tokens and need highly-privileged Consul tokens. That workflow will be removed in Nomad 1.9.

Ideally by that point we'll have some Consul-side sync (as Consul has for k8s). I'm going to keep this issue open in the meantime but rename it slightly.