Open rzvoncek opened 5 months ago
Issues
1 New issue
0 Accepted issues
Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code
Fixes https://github.com/k8ssandra/k8ssandra-operator/issues/1312.
This PR alters how the BackupMan reports the status of a backup. Previously, it would only look at the storage and report the backup as IN_PROGRESS if the backup had started (that is, had written something into the storage). It did not check whether the backup was actually still ongoing.
With this change, it also checks whether there is an active future waiting for the backup to finish. The future's presence indicates a healthy state, which in turn means the backup has a high chance of completing.
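To illustrate the idea, here is a minimal sketch of the combined check. It is not the code from this PR: the registry `_backup_futures`, the functions `register_backup` and `report_status`, and every status name other than IN_PROGRESS are hypothetical, and the choice to report a started backup without a live future as FAILED is an assumption made for the example.

```python
from concurrent.futures import Future
from typing import Dict

# Hypothetical registry mapping a backup name to the future awaiting it.
_backup_futures: Dict[str, Future] = {}


def register_backup(name: str, future: Future) -> None:
    """Remember the future that waits for the named backup to finish."""
    _backup_futures[name] = future


def report_status(name: str, started_in_storage: bool, finished_in_storage: bool) -> str:
    """Combine what the storage shows with whether a live future exists."""
    if finished_in_storage:
        return "SUCCESS"  # assumed status name
    if not started_in_storage:
        return "UNKNOWN"  # assumed status name
    future = _backup_futures.get(name)
    if future is not None and not future.done():
        # Storage shows a started backup and something is still waiting on it.
        return "IN_PROGRESS"
    # Started in storage, but nothing is waiting for it any more,
    # for example after a restart dropped the in-memory registry.
    return "FAILED"  # assumed outcome for this sketch
```

With the old behaviour, the last branch would not exist and a started backup would stay IN_PROGRESS indefinitely.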
However, the backup does not need the future to complete and run its callback in order to finish. When I tried to cover this in the integration steps, I was able to restart the gRPC server, but I was unable to kill the actual process that does the backup. So the new server (with a new BackupMan) came back and saw the backup as complete.
The benefit of this change is more visible in the world of k8ssandra-operator, where a pod might restart mid-backup. Killing a pod during a backup would lead to the following situation:
The backup-3 would never finish; it'd continue to be reported as IN_PROGRESS. This is because the MedusaBackupJob would always feature:
Deploying a Medusa container built off this branch actually heals the situation:
Because: