Closed adejanovski closed 2 years ago
➤ Jeff DiNoto commented:
Open question to be looked at: Can we restore from a backup that gets into this state?
@jdonenine Yes, we can restore from a backup that gets into this state. The restore controller should do up-front validation, but it currently performs no such checks, for example verifying that the backup completed successfully.
Issue moved to k8ssandra/k8ssandra-operator #633 via ZenHub
Some backups never get marked as finished for unclear reasons. Looking at the code, it appears that the `doBackup()` gRPC call is a blocking one running in a goroutine. Some backups can last for many hours, which makes blocking HTTP calls unreliable. Even running the backup and checking its status in the storage bucket (using `medusa status`, for example) would not be reliable, as it would detect successful backups but not failed ones (which look like running ones to Medusa). Instead, we'd need to make the `doBackup()` call a short operation that starts a thread running the actual backup. Another gRPC operation should be created to check the state of that thread, so the backup can be monitored asynchronously. The Medusa parts of this are captured in this issue.
┆Issue is synchronized with this Jira Bug by Unito ┆Affected Versions: k8ssandra-1.2.0,k8ssandra-1.3.0 ┆Epic: Remote Cluster Restore ┆Issue Number: K8SSAND-624 ┆Priority: Medium
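To make the proposed split concrete, here is a minimal Python sketch of the "start quickly, poll later" pattern described above. It is not Medusa's actual gRPC service: `BackupRegistry`, `BackupStatus`, `start_backup`, `status`, and `wait` are hypothetical names chosen for illustration, and the backup itself is represented by an arbitrary callable. The point is only that a failed backup becomes distinguishable from a running one because the worker thread records its own outcome.

```python
import threading
from enum import Enum


class BackupStatus(Enum):
    """Hypothetical states; not taken from Medusa's API."""
    UNKNOWN = "UNKNOWN"
    IN_PROGRESS = "IN_PROGRESS"
    SUCCESS = "SUCCESS"
    FAILED = "FAILED"


class BackupRegistry:
    """Tracks backups that run in background threads (illustrative only)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._status = {}   # backup name -> BackupStatus
        self._threads = {}  # backup name -> worker thread

    def start_backup(self, name, run_backup):
        """Start run_backup() in a thread and return immediately.

        Returns False if a backup with this name is already running.
        """
        with self._lock:
            if self._status.get(name) == BackupStatus.IN_PROGRESS:
                return False
            self._status[name] = BackupStatus.IN_PROGRESS
            thread = threading.Thread(
                target=self._run, args=(name, run_backup), daemon=True
            )
            self._threads[name] = thread
        thread.start()
        return True

    def _run(self, name, run_backup):
        # Record the outcome explicitly so a failure is never mistaken
        # for a backup that is still running.
        try:
            run_backup()
            result = BackupStatus.SUCCESS
        except Exception:
            result = BackupStatus.FAILED
        with self._lock:
            self._status[name] = result

    def status(self, name):
        """Cheap, non-blocking status lookup (what a poller would call)."""
        with self._lock:
            return self._status.get(name, BackupStatus.UNKNOWN)

    def wait(self, name, timeout=None):
        """Join the worker thread (up to timeout) and return the status."""
        with self._lock:
            thread = self._threads.get(name)
        if thread is not None:
            thread.join(timeout)
        return self.status(name)
```

In a design along these lines, the short `doBackup()`-style call would map to `start_backup`, and the new status operation would map to `status`, which the operator could poll at whatever interval suits it instead of holding one call open for hours. Note that this in-memory state is lost if the process restarts; whether and how to persist it is outside the scope of this sketch.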