hashicorp / nomad

Nomad is an easy-to-use, flexible, and performant workload orchestrator that can deploy a mix of microservice, batch, containerized, and non-containerized applications. Nomad is easy to operate and scale and has native Consul and Vault integrations.
https://www.nomadproject.io/

Feature: Manually re-render templates #16271

Open EtienneBruines opened 1 year ago

EtienneBruines commented 1 year ago

Proposal

A way to re-render templates for a single allocation, probably using the API.

Use-cases

Cases that use HashiCorp Vault with the kv2 secrets engine while still needing quick updates. The interval at which consul-template currently re-renders those Vault kv2 secrets is too long. (My specific use-case is adding new certificates, which are stored in Vault, to haproxy.)

We can set up a service that listens for update-messages (e.g. listening on RabbitMQ or AWS SNS) and then at some point notify Nomad somehow to re-render the templates for some allocation (or for the "current" allocation, since I'd be deploying it as a sidecar).

The end-goal would be to have the Nomad templates re-rendered on-demand.

Attempted Solutions

The current "workaround" is waiting up to 5 minutes for consul-template (as compiled with Nomad) to automatically re-fetch those values. This delays the deployment of new certificates (e.g. for new domains) by up to 5 minutes, without knowing exactly when the instances are updated and ready-to-go.

Another workaround would be to use Vault kv1 instead of Vault kv2, since there we can configure a lease duration to speed things up.

Another workaround is using a haproxy-specific Runtime API to dynamically update the certificates, but I'd much rather stick to using Nomad templates for simplicity's sake.

tgross commented 1 year ago

Hi @EtienneBruines! The current behavior is a limitation of the Vault API, which doesn't support the same kind of blocking queries that Consul (or Nomad) does. So consul-template has to poll the Vault endpoint. There are definitely some related problems here with https://github.com/hashicorp/nomad/issues/10920 where the idea is to coordinate updates across a job. I'll mark this idea for roadmapping.
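
To illustrate why polling adds latency, here is a minimal sketch of a generic poll loop. It is not consul-template's actual implementation; `poll_for_change`, `fetch_version`, and `on_change` are hypothetical names, and the version counter stands in for whatever metadata a poller would compare. With a blocking query the server would hold the request open and answer as soon as the value changes; without one, a change is only noticed on the next poll, so the worst-case delay is roughly one full interval.

```python
import time
from typing import Callable, Optional


def poll_for_change(
    fetch_version: Callable[[], int],
    on_change: Callable[[int], None],
    interval_seconds: float,
    max_polls: int,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[int]:
    """Poll until the fetched version differs from the first one seen.

    Returns the number of polls made after the initial read when the change
    was detected, or None if nothing changed within max_polls.
    All names here are illustrative assumptions, not a real Vault or Nomad API.
    """
    last_seen = fetch_version()  # baseline read
    for attempt in range(1, max_polls + 1):
        # The change cannot be observed while we are sleeping; this wait
        # is the source of the delay described in the issue.
        sleep(interval_seconds)
        current = fetch_version()
        if current != last_seen:
            on_change(current)  # e.g. re-render a template
            return attempt
    return None
```

In this sketch, a secret updated just after a poll stays unnoticed for almost the whole `interval_seconds`, which matches the "up to 5 minutes" wait described above when the interval is 300 seconds.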