FoundationDB / fdb-kubernetes-operator

A Kubernetes operator for FoundationDB
Apache License 2.0

Allow to remove process groups without an address #791

johscheuer closed 2 years ago

johscheuer commented 3 years ago

We should add a setting that allows the controller to remove process groups without an address. In the current design of the controller, a process group without an address means that the underlying Pod was never scheduled (so it should never have received any data), which means it should be safe to remove the process group without exclusion. There is a potential race condition: we decide to remove a process group without an address, and during that time the underlying Pod gets scheduled. To reduce this risk, we could wait for a safety period before assuming the Pod will never be scheduled. The duration should be relatively short (a few seconds), so that only a little data could have been replicated to the new process group. As a further safety check, we could remove process groups without an address only if the cluster is in a fully replicated state, to ensure we don't remove any data. That is only an additional layer of safety, since an unscheduled process group without an address has never been running.

johscheuer commented 2 years ago

Fixed in https://github.com/FoundationDB/fdb-kubernetes-operator/pull/947