Open nhuray opened 7 years ago
Hi @nhuray,
Any update on this? I'm eagerly looking out for rolling restarts.
Hi @ritxos,
I haven't started working on this; I only discussed it a long time ago: https://github.com/hashicorp/consul-template/issues/340
As a workaround, we are increasing the `splay` setting in Consul Template to make sure the processes do not restart at the same time. Admittedly this is not really deterministic, and a locking strategy for rolling restarts would be better. I might have time to work on it after Xmas.
Context
We are using Consul Template to dynamically manage the configuration of our apps.
Depending on the application, we might have 3 behaviours:
`nginx -s reload`)
For the last 2 cases, we want to reload / restart services one by one in order to prevent service interruption.
See this issue: https://github.com/hashicorp/consul-template/issues/340
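The one-by-one requirement can be pictured with a small local simulation. This is only a sketch under stated assumptions: `threading.Semaphore` stands in for whatever cross-host coordination ends up being used, and `rolling_restart`, the node names, and the short sleep that replaces the real reload / restart are all hypothetical.

```python
import threading
import time


def rolling_restart(node_names, limit=1):
    """Restart every node, allowing at most `limit` restarts at once.

    Returns the highest number of restarts that were in progress together.
    """
    slots = threading.Semaphore(limit)  # local stand-in for a distributed lock
    counter_guard = threading.Lock()    # protects the two counters below
    in_progress = 0
    peak = 0

    def restart(name):
        nonlocal in_progress, peak
        with slots:  # blocks until one of the `limit` slots is free
            with counter_guard:
                in_progress += 1
                peak = max(peak, in_progress)
            time.sleep(0.01)  # placeholder for the real reload / restart
            with counter_guard:
                in_progress -= 1

    workers = [threading.Thread(target=restart, args=(name,)) for name in node_names]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return peak
```

With `limit=1` the returned peak is always 1, which is the behaviour we are after: no two services are down at the same moment.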
Implementation
Consul already provides a lock mechanism that creates a distributed semaphore based on a Consul key: https://www.consul.io/docs/commands/lock.html
We want to implement this mechanism by wrapping the complexity of configuring the `consul-template` and `consul lock` commands in a script, `ct-supervise`.
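As a rough illustration of what such a wrapper could assemble, here is a sketch that builds a `consul lock` invocation around a reload command. It is not the actual `ct-supervise` script: the helper name `build_lock_command` and the KV prefix `locks/nginx-reload` are hypothetical, and it assumes `consul lock` runs the child command through a shell, which is its documented default.

```python
import shlex


def build_lock_command(prefix, child, holders=1, try_timeout=None, verbose=False):
    """Return the argv for `consul lock [options] <prefix> <child>`."""
    argv = ["consul", "lock", f"-n={holders}"]
    if try_timeout is not None:
        argv.append(f"-try={try_timeout}")  # e.g. "500ms"
    if verbose:
        argv.append("-verbose")
    argv.append(prefix)  # KV prefix the lock lives under (hypothetical name)
    argv.append(child)   # kept as one argument; assumes consul runs it via a shell
    return argv


if __name__ == "__main__":
    cmd = build_lock_command("locks/nginx-reload", "nginx -s reload")
    print(shlex.join(cmd))
    # prints: consul lock -n=1 locks/nginx-reload 'nginx -s reload'
```

The resulting string could then be used as the command Consul Template runs after rendering a template, so that each node waits for the lock before reloading.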
Options
These options might be passed to the `ct-supervise` script:
- `-monitor-retry`: Retry up to this number of times if Consul returns a 500 error while monitoring the lock. This allows riding out brief periods of unavailability without causing leader elections, but increases the amount of time required to detect a lost lock in some cases. Defaults to 3, with a 1s wait between retries. Set to 0 to disable.
- `-n`: Optional, limit of lock holders. Defaults to 1. The underlying implementation switches from a lock to a semaphore when increased past one. All locks on the same prefix must use the same value.
- `-name`: Optional name to associate with the underlying session. If not provided, one is generated based on the child command.
- `-pass-stdin`: Pass stdin to child process.
- `-try`: Attempt to acquire the lock up to the given timeout. The timeout is a positive decimal number, with unit suffix, such as "500ms". Valid time units are "ns", "us" (or "µs"), "ms", "s", "m", "h".
- `-verbose`: Enables verbose output.

The only required argument is the command to run: