FoundationDB / fdb-kubernetes-operator

A Kubernetes operator for FoundationDB
Apache License 2.0
240 stars, 83 forks

Ensure failure domains are taken into account when removing Pods #646

Closed johscheuer closed 2 years ago

johscheuer commented 3 years ago

When removing Pods as part of an upgrade, replacement, or shrink, we should verify before removing a Pod that the cluster can still meet its required fault domains. The idea is to prevent the operator from removing too many Pods at once. Additionally, we might want to check that the cluster is available and perhaps that it is fully replicated.
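To illustrate the kind of check meant here, a minimal sketch in Go follows. It is not the operator's actual code: the `ProcessGroup` type, the `removableProcessGroups` function, and the `minZones` parameter are hypothetical names invented for this example, and it assumes a fault domain is lost only when its last process group is removed.

```go
package main

import "sort"

// ProcessGroup is a hypothetical, simplified stand-in for a process group.
// Zone is the fault domain the Pod belongs to (for example a node or a zone).
type ProcessGroup struct {
	ID   string
	Zone string
}

// removableProcessGroups returns the IDs of the candidates that can be removed
// in a single pass while at least minZones distinct fault domains still hold
// one or more process groups. Candidates are considered in ID order so the
// result is deterministic.
func removableProcessGroups(all []ProcessGroup, candidates map[string]bool, minZones int) []string {
	// Count how many process groups currently live in each fault domain.
	perZone := map[string]int{}
	for _, pg := range all {
		perZone[pg.Zone]++
	}
	zones := len(perZone)

	// Collect the candidates and sort them by ID.
	ordered := make([]ProcessGroup, 0, len(candidates))
	for _, pg := range all {
		if candidates[pg.ID] {
			ordered = append(ordered, pg)
		}
	}
	sort.Slice(ordered, func(i, j int) bool { return ordered[i].ID < ordered[j].ID })

	var removable []string
	for _, pg := range ordered {
		// Removing the last process group of a zone loses that fault domain;
		// skip the removal if that would drop below the required minimum.
		if perZone[pg.Zone] == 1 && zones-1 < minZones {
			continue
		}
		perZone[pg.Zone]--
		if perZone[pg.Zone] == 0 {
			zones--
		}
		removable = append(removable, pg.ID)
	}
	return removable
}
```

With four process groups spread over three zones and `minZones` set to 3, only a process group from the zone that holds two of them would be returned; the others would be deferred to a later pass. A real check would likely also consult the cluster status for availability and replication health, as suggested above.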

johscheuer commented 3 years ago

For replaced Pods there shouldn't be an issue, but for Pods that are removed without exclusion we might want an additional layer of safety to avoid accidentally shooting ourselves in the foot.

johscheuer commented 2 years ago

This was implemented in https://github.com/FoundationDB/fdb-kubernetes-operator/blob/master/controllers/remove_process_groups.go#L67