lewismarshall opened this issue 6 years ago
The UUID represents the cluster UUID at the time of the crash, so it must be the same on every node in the cluster. You would need to manually edit the `grastate.dat`, like you would for any other Galera Cluster.
One issue I see is that our wrapper script only allows bootstrapping from the first node (`mysql-0`), so if the other nodes accepted write queries before the crash you would be losing that data. You would need to add `wsrep_recover=1` (explained here) to the configuration to check the GTID on every node.
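As a sketch of what that looks like (not taken from this chart's config), `wsrep_recover` turns the next start into a one-shot recovery pass: `mysqld` writes the recovered position to the error log and exits instead of serving queries:

```ini
[mysqld]
# One-shot recovery mode: log the position as
#   WSREP: Recovered position: <cluster-uuid>:<seqno>
# and shut down, so the GTID can be compared across nodes.
wsrep_recover=1
```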
To get the cluster up and running, set `safe_to_bootstrap: 1` in the `grastate.dat` in the PV of `mysql-0`, and the cluster should come up again.
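For reference, a minimal sketch of what that edited file looks like (the version and UUID values here are placeholders):

```
# GALERA saved state
version: 2.1
uuid:    <cluster-uuid>
seqno:   -1
safe_to_bootstrap: 1
```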
@tongpu thanks for the UUID note. I've recovered the cluster manually (too many times in a test cluster) as you described.
This issue is about the operationally preferable option for a self-healing cluster. It seems to be something Galera supports via the `grastate.dat`, but the file always seems to contain -1 for the `seqno`. I'm adding some debugging and doing some more tests to see if this is always the case; the nodes do always seem to shut down cleanly when pods are deleted (or when physical nodes are shut down).
Looking at the various Galera docs here, it seems as though setting `wsrep_provider_options="gcache.recover=yes"` may be required, and the additional option `pc.recovery=TRUE` as described here may also be needed (although it does look like a default). I'll do some testing...
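If both turn out to be needed, a minimal sketch of the config change (provider options are semicolon-separated; `pc.recovery` is documented as defaulting to true, so listing it is belt and braces):

```ini
[mysqld]
# Recover the write-set cache and the primary-component state after a
# full-cluster outage (pc.recovery=TRUE is already the default).
wsrep_provider_options="gcache.recover=yes;pc.recovery=TRUE"
```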
After debugging, the `grastate.dat` typically has 0 for the sequence number, but if the pods are deleted, startup fails to move past the first node. Maybe the health checks should be sensitive to recovery operations so all nodes can come up (a sketch below). Will have a think over the weekend.
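One hypothetical shape for such a recovery-aware check (the script, and which `wsrep_local_state` values to accept, are assumptions rather than anything in the current wrapper): accept joining/donor states during recovery instead of only Synced, so peers aren't killed before the cluster re-forms:

```sh
#!/bin/sh
# Hypothetical recovery-aware readiness check: treat Donor/Desynced (2),
# Joined (3) and Synced (4) as healthy so nodes survive probes while the
# cluster is still re-forming, instead of requiring Synced (4) only.
state=$(mysql -N -B -e "SHOW STATUS LIKE 'wsrep_local_state'" | awk '{print $2}')
case "$state" in
  2|3|4) exit 0 ;;
  *)     exit 1 ;;
esac
```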
Normally a Galera Cluster automatically recovers from a power-loss scenario, but only if all the nodes come back up at nearly the same time; otherwise you have to find the most current one using `wsrep_recover=1`. Because of the way the StatefulSet API starts pods, this will probably never happen in this situation, because we're always going to try to bootstrap from the first node.
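To illustrate that manual step (a sketch; where the log line lands depends on how logging is configured):

```sh
# On each node in turn, run a one-shot recovery pass and note the
# reported position; with no error-log file configured it goes to stderr.
mysqld --wsrep-recover 2>&1 | grep "WSREP: Recovered position"
#   WSREP: Recovered position: <cluster-uuid>:1234
# Then bootstrap from the node reporting the highest seqno, e.g. by
# setting safe_to_bootstrap: 1 in its grastate.dat.
```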
If all nodes are ever stopped, e.g. by a power failure in a DC (an operationally tested scenario), the cluster will fail to start.
It would seem as though the `/var/lib/mysql/grastate.dat` file never has suitable information for an automated recovery (as documented on galeracluster.com). This results in the following symptoms:
```
2018-05-25 13:29:33 140151997229312 [Warning] WSREP: no nodes coming from prim view, prim not possible
```
The nodes then fail to connect (the same on all nodes):
All the `grastate.dat` files seem equivalent, so any node should/could potentially be startable:
A bit worried the UUID seems to be the same for all nodes...
See the complete logs here: mysql-2.log mysql-1.log mysql-0.log