@v-boiko It should only do that check if there are new migrations. This is necessary since schema updates in Cassandra generally require all nodes to be healthy to prevent data loss (we've had issues in the past with an old node disagreeing on schema). There has already been some discussion around this; see https://github.com/sky-uk/cqlmigrate/pull/35 and https://github.com/sky-uk/cqlmigrate/pull/37.
I'm okay with adding an option to ignore health on schema migrate, but it is very much a do-at-your-own-risk kind of thing, since it can lead to a few catastrophic situations. It's better to have the whole cluster up when doing schema changes if possible, although I realise that is probably not viable for large clusters (>100 nodes).
https://github.com/sky-uk/cqlmigrate/issues/42 is also related - cqlmigrate will erroneously treat dead nodes as unhealthy.
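For context, the kind of pre-migration check being discussed above can be sketched with the DataStax Java driver 3.x API. This is an illustration only, not cqlmigrate's actual internals:

```java
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Host;
import com.datastax.driver.core.Metadata;

// Illustrative sketch only -- not cqlmigrate's implementation.
public final class PreMigrationChecks {

    // Fail fast if any node the driver knows about is down, since a schema
    // change applied while a node is unreachable can lead to schema
    // disagreement when that node comes back.
    static void assertAllHostsUp(Cluster cluster) {
        Metadata metadata = cluster.getMetadata();
        for (Host host : metadata.getAllHosts()) {
            if (!host.isUp()) {
                throw new IllegalStateException("Host is down: " + host);
            }
        }
    }

    // After applying migrations, verify that all reachable nodes agree on
    // the schema version before proceeding.
    static void assertSchemaAgreement(Cluster cluster) {
        if (!cluster.getMetadata().checkSchemaAgreement()) {
            throw new IllegalStateException("Cluster has not reached schema agreement");
        }
    }
}
```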
The approach it is taking right now is probably valid. We checked, and we didn't have the required migrations in the schema_updates table, so the library actually was trying to do something. The best thing is probably just to sort out the cluster problems first.
Do you think we still need to acquire the lock then if the cluster is unhealthy?
It isn't necessary to acquire it if the cluster is unhealthy, but the same thing is accomplished by ensuring we unlock if an exception is thrown.
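To illustrate the unlock-on-exception point, here is a minimal sketch. `acquireLock`, `runMigrations`, and `releaseLock` are hypothetical stand-ins for cqlmigrate's internals, stubbed out so the shape compiles:

```java
// A minimal sketch of the unlock-on-exception pattern described above.
public final class MigrationLockExample {

    void migrateWithLock() {
        acquireLock();          // may fail if another client already holds the lock
        try {
            runMigrations();    // an exception thrown here still reaches the finally block
        } finally {
            releaseLock();      // the lock is released even on failure, never left dangling
        }
    }

    private void acquireLock() { /* hypothetical: insert a lock row with LWT */ }
    private void runMigrations() { /* hypothetical: apply pending cql files */ }
    private void releaseLock() { /* hypothetical: delete the lock row */ }
}
```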
I believe this is fixed in https://github.com/sky-uk/cqlmigrate/pull/53.
The ClusterHealth check considers the cluster unhealthy if any nodes are unreachable, even when the configured consistencyLevel could still be satisfied.
Our app is healthy and able to serve clients, but cqlmigrate prevents it from starting when the cluster is considered unhealthy.
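One way a check like this could work instead, sketched against the DataStax Java driver 3.x. This is a hypothetical alternative, not cqlmigrate's actual ClusterHealth logic, and it ignores token and rack placement for simplicity:

```java
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Host;

// Hypothetical alternative health check: treat the cluster as healthy when
// enough nodes are up to satisfy QUORUM for the keyspace's replication
// factor, instead of requiring every node to be reachable.
public final class ConsistencyAwareHealthCheck {

    static boolean quorumReachable(Cluster cluster, int replicationFactor) {
        long upHosts = cluster.getMetadata().getAllHosts().stream()
                .filter(Host::isUp)
                .count();
        int quorum = replicationFactor / 2 + 1;   // e.g. RF=3 needs 2 replicas
        return upHosts >= quorum;
    }
}
```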