ClusterLabs / resource-agents

Combined repository of OCF agents from the RHCS and Linux-HA projects
GNU General Public License v2.0

Galera: Allow master promotion with lower number of clones #1009

Open W1zzardTPU opened 7 years ago

W1zzardTPU commented 7 years ago

When a node is down for maintenance and the galera resource agent is checking last-commits, it will wait for ALL nodes to report status, which will never happen unless that node is taken out of maintenance.

I propose adding an optional meta variable that sets the minimum number of instances that must be up before master selection starts, or just a flag to start master election "now", using the last-commits that have been recorded so far.

dciabrin commented 7 years ago

Unless I misread the proposal, having a flag to "start bootstrap when at least N nodes have reported their state" will most probably cause data loss, and I don't think it's a wise idea to have such an automatic flag.
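To make the data-loss concern concrete, here is a minimal sketch of the election rule being discussed. It is not the resource agent's code (the RA is a shell script); the function name `pick_bootstrap_node`, the `min_reported` parameter, and the node names and sequence numbers are all hypothetical, and the sketch assumes the bootstrap node is simply the one with the highest reported last-commit.

```python
from typing import Dict, Optional


def pick_bootstrap_node(
    last_commits: Dict[str, Optional[int]],
    min_reported: Optional[int] = None,
) -> Optional[str]:
    """Return the node to bootstrap from, or None if the election must wait.

    last_commits maps each node name to its reported last-commit seqno,
    or None if the node has not reported yet (e.g. it is in maintenance).
    min_reported is the hypothetical "minimum N nodes" threshold; None
    means "wait for ALL nodes".
    """
    reported = {node: seqno for node, seqno in last_commits.items()
                if seqno is not None}
    required = len(last_commits) if min_reported is None else min_reported
    if not reported or len(reported) < required:
        return None  # keep waiting for more nodes to report
    # Assumption: the most advanced node is the one with the highest seqno.
    return max(reported, key=lambda node: reported[node])


# node3 is in maintenance and has not reported; suppose it holds seqno 105.
partial = {"node1": 100, "node2": 98, "node3": None}

waits = pick_bootstrap_node(partial)                   # waits for all nodes
early = pick_bootstrap_node(partial, min_reported=2)   # elects from a subset

# Once node3 reports, it turns out to be the most advanced node.
full = {"node1": 100, "node2": 98, "node3": 105}
best = pick_bootstrap_node(full)
```

With the threshold set to 2, the sketch elects `node1` at seqno 100 even though `node3` holds 105, so the transactions between 100 and 105 would be lost when `node3` later rejoins and resyncs from the bootstrapped cluster.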

However, as for a flag that could be set manually to forcefully start the bootstrap "when the user knows what he's doing", I think we can already achieve that by following something similar to http://damien.ciabrini.name/posts/2015/10/galera-boot-process-in-open-stack-ha-and-manual-override.html . That post shows various examples of manual bootstrap overrides; maybe this is okay for your needs?

W1zzardTPU commented 7 years ago

Agreed that data loss could occur when using the minimum N-node approach, I didn't think it all the way through initially :)

The proposed flag should only be temporary, just to start the election process (the RA could remove the flag once the election has started).

That should be more approachable than your method, especially in an emergency when people are panicking :) Great article btw, wish I had found it a year or two ago.