hashicorp / consul

Consul is a distributed, highly available, and data center aware solution to connect and configure applications across dynamic, distributed infrastructure.
https://www.consul.io

Add support to consul lock to register/unregister services #1637

Open discordianfish opened 8 years ago

discordianfish commented 8 years ago

Hi,

I think it would be super useful if consul lock could register a new service when it acquires a lock and unregister it when the lock is lost. The use cases are the usual ones where you would use locks for leader election.

slackpad commented 8 years ago

Hi @discordianfish, this is an interesting idea. One nuance is that the agents are actually the source of truth for service registration, so if the servers modified the central catalog of services then we'd have to add some special-case logic to keep the agent from canceling the configuration. We do have support for doing this via EnableTagOverride, but only for service tags. One feature I could see that might be simpler is to be able to say "tag service X for the lock holder with tag Y". This would let you register the service in the usual way, but add a tag like "master" to whoever has the lock.

discordianfish commented 8 years ago

@slackpad I'm not sure I understand the issue with canceling configuration. I would keep things simple: add the service to the local agent if the lock is acquired and remove it if the lock is lost. No assumptions about how many other services might get registered on other servers. It's also not just for figuring out the leader within a number of services but also for 'singletons': I want to run only one instance of a service within my cluster and make sure that if it goes down, another takes over. From a logical perspective this single instance of my service 'moves' to another host, so it doesn't make sense to have a tag on the active one, given that all the others simply don't exist (until they acquire the lock). Another thing to consider is the health checks in this case: they will only work once the service is started, which requires the lock. If I register the service and health check before I hold the lock, I have a service marked as failed in Consul.
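
For illustration, the "register on acquire, remove on loss" behaviour described above could be sketched against the local agent's HTTP API as follows. This is a minimal sketch and not part of consul lock: the function names, the `on_lock_change` hook, and the default agent address `http://127.0.0.1:8500` are assumptions made for the example. Only the two agent endpoints (`/v1/agent/service/register` and `/v1/agent/service/deregister/<id>`) are existing Consul API paths.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the local agent listens on Consul's default HTTP address.
AGENT = "http://127.0.0.1:8500"


def register_request(service, agent=AGENT):
    """Build the PUT request that registers `service` with the local agent."""
    return urllib.request.Request(
        f"{agent}/v1/agent/service/register",
        data=json.dumps(service).encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def deregister_request(service_id, agent=AGENT):
    """Build the PUT request that removes a service from the local agent."""
    quoted = urllib.parse.quote(service_id, safe="")
    return urllib.request.Request(
        f"{agent}/v1/agent/service/deregister/{quoted}",
        method="PUT",
    )


def on_lock_change(held, service, send=urllib.request.urlopen):
    """Hypothetical hook: call with held=True on acquire, held=False on loss.

    `send` is injectable so the logic can be exercised without an agent.
    """
    if held:
        request = register_request(service)
    else:
        # Consul uses the service name as the ID when no ID is given.
        request = deregister_request(service.get("ID") or service["Name"])
    return send(request)
```

A wrapper would call `on_lock_change(True, service)` right after the lock is acquired and `on_lock_change(False, service)` when it is lost or the child process exits. Because the service only exists on the lock holder, no failing health check shows up on the standby hosts.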

slackpad commented 8 years ago

@discordianfish ok this makes a lot more sense and would be simpler than I thought. You'd probably want to point the lock command at some JSON to register when it gets the lock and to deregister when it starts up / attempts to get the lock.

discordianfish commented 8 years ago

@slackpad Yes, exactly. I'm using this script as a wrapper for now, but it's fragile and something built into Consul would be awesome: https://github.com/Jodel/infra-scripts#consul-register

discordianfish commented 8 years ago

This is what I mean by fragile: https://github.com/Jodel/infra-scripts/issues/2 :) Maybe someone here has an idea, even though it's only related indirectly to this issue.

slackpad commented 8 years ago

Was thinking about this more. Here's the full feature set:

  1. You can provide an optional standby.json to consul lock and this will be registered when attempting to acquire the lock. A TTL health check will also be registered and maintained by consul lock. This lets you easily configure a standby service definition to alert on / etc. to know your standby servers are alive and attempting to get the lock. This could also be the same service definition as active.json, just with different tags.
  2. You can provide an optional active.json that is registered once the lock is acquired.
  3. consul lock will take care of registering / deregistering the standby and active services, depending on what it's doing.

We should check the JSON files at startup so we are less likely to fail much later on when we acquire the lock.
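
A rough sketch of how the proposal above might hang together, written as plain Python rather than as actual consul lock code. Several things here are assumptions and not part of the proposal: the definitions are assumed to use the agent HTTP API field names (`Name`, `ID`, `Tags`), the state names `waiting`, `held`, and `stopped` are invented for the example, and the standby service is assumed to be deregistered once the active one is registered. The TTL check is left out.

```python
import json


def parse_definition(text, source="<input>"):
    """Validate a service definition up front, before any lock is attempted."""
    try:
        definition = json.loads(text)
    except ValueError as err:
        raise ValueError(f"{source}: invalid JSON: {err}") from None
    if not isinstance(definition, dict) or not definition.get("Name"):
        raise ValueError(f"{source}: expected an object with a non-empty 'Name'")
    return definition


def service_id(definition):
    """Consul falls back to the service name when no ID is given."""
    return definition.get("ID") or definition["Name"]


def plan(state, standby=None, active=None):
    """Return (definition to register or None, sorted IDs to deregister).

    Hypothetical states: 'waiting' (trying to acquire the lock),
    'held' (lock acquired), 'stopped' (shutting down).
    """
    wanted = {"waiting": standby, "held": active, "stopped": None}[state]
    keep = service_id(wanted) if wanted else None
    # Never deregister the ID we are about to register; this covers the case
    # where standby and active are the same service with different tags.
    stale = {service_id(d) for d in (standby, active) if d} - {keep}
    return wanted, sorted(stale)
```

With both files given, `plan("waiting", standby, active)` registers the standby definition, `plan("held", ...)` swaps in the active one, and `plan("stopped", ...)` removes whatever is left. Calling `parse_definition` on both files at startup surfaces a malformed file immediately instead of at lock acquisition time.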

discordianfish commented 8 years ago

That sounds perfect! Would be ideal for my requirements and I believe other users will love it as well :)