jetstack / navigator

Managed Database-as-a-Service (DBaaS) on Kubernetes
Apache License 2.0

Cassandra ScaleOut should only change the statefulset when all pods are ready #316

Closed wallrj closed 6 years ago

wallrj commented 6 years ago

DataStax documentation for Cassandra 3 implies that nodes should only be added when the cluster is in a healthy state:

Use nodetool status to verify that the node is fully bootstrapped and all other nodes are up (UN) and not in any other state.

Documentation for older versions of Cassandra is more explicit:

Warning: Simultaneously bootstrapping more than one new node from the same rack, violates LOCAL_QUORUM constraints. Data may stream from any replica in order to put data onto the new nodes, including other new nodes.

The ScaleOut code introduced in #256 was intended to only ever increase the replica count by 1. But as it stands, it will be run repeatedly, incrementing the StatefulSet replicas value regardless of whether all the current pods are ready.

In practice this probably doesn't matter, since the StatefulSet controller will only add new pods when all current pods are ready:

The StatefulSet controller starts Pods one at a time, in order by their ordinal index. It waits until each Pod reports being Ready before starting the next one.

But in future we want to perform additional steps in the ScaleOut action, such as creating new Pilot resources and running nodetool cleanup on all the other nodes. So we don't want the StatefulSet controller to be adding nodes in the background while we're trying to perform these additional steps.
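
A minimal sketch of the kind of guard this implies, not the actual navigator code. The `statefulSetState` type and `nextReplicas` function are hypothetical names used only for illustration; the three fields mirror `spec.replicas`, `status.replicas` and `status.readyReplicas` of a StatefulSet. The idea is that ScaleOut leaves the replica count alone until every pod from the previous increment exists and is ready, and then raises it by exactly 1.

```go
package main

import "fmt"

// statefulSetState is a hypothetical, simplified stand-in for the
// StatefulSet fields that matter here.
type statefulSetState struct {
	SpecReplicas   int32 // spec.replicas: the count we last asked for
	StatusReplicas int32 // status.replicas: pods created so far
	ReadyReplicas  int32 // status.readyReplicas: pods reporting Ready
}

// nextReplicas returns the replica count ScaleOut should write and
// whether that is a change. It only ever adds one node, and only when
// all pods requested so far exist and are ready.
func nextReplicas(s statefulSetState, desired int32) (int32, bool) {
	if desired <= s.SpecReplicas {
		// Nothing to scale out.
		return s.SpecReplicas, false
	}
	if s.StatusReplicas != s.SpecReplicas || s.ReadyReplicas != s.SpecReplicas {
		// A previous increment is still in progress, or a pod is unhealthy.
		return s.SpecReplicas, false
	}
	return s.SpecReplicas + 1, true
}

func main() {
	// Third pod was requested but is not ready yet: do not touch replicas.
	busy := statefulSetState{SpecReplicas: 3, StatusReplicas: 3, ReadyReplicas: 2}
	fmt.Println(nextReplicas(busy, 5))

	// All three pods are ready: request exactly one more.
	healthy := statefulSetState{SpecReplicas: 3, StatusReplicas: 3, ReadyReplicas: 3}
	fmt.Println(nextReplicas(healthy, 5))
}
```

With a check like this, repeated runs of the action are harmless: the count only moves once the previous node has finished joining, which also leaves room for the extra steps between increments.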

/kind bug