yugabyte / yugabyte-operator

Kubernetes Operator for YugabyteDB (legacy)
65 stars 29 forks source link

[BUG] Yugabyte operator fails to achieve the correct replica number if misses an intermediate state #39

Open srteam2020 opened 3 years ago

srteam2020 commented 3 years ago

Describe the bug

We find that sometimes the operator fails to achieve the desired final replica number of TServer when some particular event is missed, which is caused by the constraint that Spec.TServer.Replicas has to be no smaller than Spec.ReplicationFactor.

Consider the following case, the cluster originally has

// state 1
Spec.ReplicationFactor = 3
Spec.TServer.Replicas = 3

And the operator will create 3 TServer replicas for yugabyte.

Later the user wants to scale up TServer and also raise the ReplicationFactor:

// state 2
Spec.ReplicationFactor = 4
Spec.TServer.Replicas = 4

However, the operator has not seen this change yet (as it is busy on the previous reconcile).

Finally, the user changes the CR spec again:

// state 3
Spec.ReplicationFactor = 4
Spec.TServer.Replicas = 3

The operator finishes the previous reconcile and sees the final state. Since the Spec.Tserver.Replicas is the same as the currently running replicas, the operator will not do any further scaling, and the cluster ends up having 3 TServer replicas.

However, if the operator does not miss anything, the cluster should end up having 4 TServer replicas. The operator will scale to 4 replicas when seeing state 2 and refuses to scale down to 3 replicas when seeing state 3 because the operator will refuse the scaling if the TServer replica count is less than cluster replication factor. yugabyte-operator has the logic in its code which prevents TServer's replica number dropping below the replication factor:

// Don't requeue if TServer replica count is less than cluster replication factor  
if cluster.Spec.Tserver.Replicas < cluster.Spec.ReplicationFactor {
    logger.Errorf("TServer replica count cannot be less than the replication factor of the cluster: '%d' < '%d'.", cluster.Spec.Tserver.Replicas, cluster.Spec.ReplicationFactor)
    return result, nil
}

Fix

We are willing to send a PR to help fix this issue. A better solution to enforce the constraint that Spec.TServer.Replicas has to be no smaller than Spec.ReplicationFactor might be to use a webhook for CR validation and admission control. If we find cluster.Spec.Tserver.Replicas < cluster.Spec.ReplicationFactor, we should block the event at the apiserver side rather than let it confuse the controller.