autopilotpattern / mysql

Implementation of the autopilot pattern for MySQL
Mozilla Public License 2.0

Re-configure after a consul system failure. #69

Closed mbbender closed 7 years ago

mbbender commented 7 years ago

I was playing around with this and the Consul autopilot pattern. I removed Consul from this docker-compose file, thinking it would use the Consul cluster I had set up. I then removed the Consul cluster and re-created it. After the re-creation, MySQL never reported back to Consul (which was available at the same CNS address) to set up its primary/slaves/keys etc. again. Maybe this is unreasonable; I'm honestly not sure how it all works together yet.

tgross commented 7 years ago

@mbbender when you removed the Consul cluster, all the data that the management hooks use to coordinate was destroyed. All the nodes had already been initialized at that point, so they won't try to re-configure themselves as primaries and there won't be any onChange events for them to watch.

I'm fairly certain this is the behavior we want: it'd be very easy to create a split-brain situation if we allowed the instances to reconfigure against a fresh Consul instance. If you want to tear down Consul, you'll want to restore from a snapshot of the Consul cluster so that the data is preserved (I'll be honest and say I've never done this myself with Consul 😊 so I'm not sure how feasible that is).
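
For reference, the snapshot approach mentioned above could be sketched roughly as follows, using Consul's `/v1/snapshot` HTTP endpoint (`GET` to save, `PUT` to restore). This is a minimal, untested-against-this-repo sketch: the address `http://consul:8500`, the file name, and the helper function names are assumptions for illustration, and ACL tokens and error handling are omitted.

```python
import urllib.request


def snapshot_url(consul_addr: str) -> str:
    """Build the snapshot endpoint URL for a Consul agent address."""
    return consul_addr.rstrip("/") + "/v1/snapshot"


def build_save_request(consul_addr: str) -> urllib.request.Request:
    """GET /v1/snapshot returns the cluster state as a gzipped tar archive."""
    return urllib.request.Request(snapshot_url(consul_addr), method="GET")


def build_restore_request(consul_addr: str, snapshot: bytes) -> urllib.request.Request:
    """PUT /v1/snapshot restores cluster state from a saved archive."""
    return urllib.request.Request(
        snapshot_url(consul_addr), data=snapshot, method="PUT"
    )


def save_snapshot(consul_addr: str, path: str) -> None:
    """Download a snapshot from the running cluster and write it to disk."""
    with urllib.request.urlopen(build_save_request(consul_addr)) as resp:
        with open(path, "wb") as f:
            f.write(resp.read())


def restore_snapshot(consul_addr: str, path: str) -> None:
    """Upload a previously saved snapshot to the re-created cluster."""
    with open(path, "rb") as f:
        req = build_restore_request(consul_addr, f.read())
    with urllib.request.urlopen(req) as resp:
        resp.read()  # drain the response; urlopen raises on HTTP errors


# Hypothetical usage (address and file name are assumptions):
#   save_snapshot("http://consul:8500", "consul-backup.snap")   # before teardown
#   restore_snapshot("http://consul:8500", "consul-backup.snap")  # after re-creation
```

The same thing can be done from the CLI with `consul snapshot save` and `consul snapshot restore`. Whether the MySQL instances pick up cleanly after such a restore is not something confirmed in this thread.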