FoundationDB / fdb-kubernetes-operator

A kubernetes operator for FoundationDB
Apache License 2.0

Recreate old processes as the first step #1238

Closed johscheuer closed 2 years ago

johscheuer commented 2 years ago

What would you like to be added/changed?

A process may not be bounced during an upgrade, e.g. when the reboot RPC between the operator and the fdbserver process fails. To mitigate this, we could try to recreate all process groups that are still running the old version after the initial bounce.

johscheuer commented 2 years ago

Another approach could be to check in this loop: https://github.com/FoundationDB/fdb-kubernetes-operator/blob/main/controllers/bounce_processes.go#L60-L66 whether we have processes that are not version compatible, and add them directly to the addresses slice. As a safeguard, we could also check that the cluster is not in an upgrade mode anymore (expected version != running version).

johscheuer commented 2 years ago

We solved this in another issue.