thelastpickle / cassandra-medusa

Apache Cassandra Backup and Restore Tool
Apache License 2.0
discrepancy in backup status #741

Status: Closed. zencircle closed this issue 5 months ago.

zencircle commented 7 months ago

kubectl -n k8ssandra-operator get medusabackups.medusa.k8ssandra.io | grep -e NAME -e  IN_PROGRESS
NAME                            STARTED   FINISHED   NODES   COMPLETED   STATUS
cluster1-main-1712584800        31h       31h        6       6           IN_PROGRESS
cluster1-main-1712588400        30h       30h        6       6           IN_PROGRESS

The status in the MedusaBackup CRDs shows the backups as still in progress, even though all 6 nodes have finished:

for mjob in $(kubectl -n k8ssandra-operator get medusabackups.medusa.k8ssandra.io | grep -e  IN_PROGRESS | grep -v full | awk '{print $1}'); do echo "---- $mjob -----" && kubectl  -n k8ssandra-operator get medusabackups.medusa.k8ssandra.io $mjob -oyaml | grep -e finishedNodes -e finishTime -e status -e start -e  startTime ; done
---- cluster1-main-1712584800 -----
status:
  finishTime: "2024-04-08T14:04:21Z"
  finishedNodes: 6
  startTime: "2024-04-08T14:00:15Z"
  status: IN_PROGRESS
---- cluster1-main-1712588400 -----
status:
  finishTime: "2024-04-08T15:04:26Z"
  finishedNodes: 6
  startTime: "2024-04-08T15:00:15Z"
  status: IN_PROGRESS
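
The mismatch in the `kubectl get` table above (NODES equals COMPLETED, yet STATUS is still IN_PROGRESS) can be picked out mechanically. The following is a minimal sketch, assuming the six-column layout shown in that table; `stuck_backups` is a hypothetical helper name and is not part of Medusa or k8ssandra-operator.

```shell
# Hypothetical helper (not part of Medusa or k8ssandra-operator).
# Reads the table printed by `kubectl get medusabackups...` on stdin and
# prints the names of backups where every node has completed
# (NODES == COMPLETED) but STATUS is still IN_PROGRESS.
# Columns are addressed from the right-hand side so that a row with an
# empty FINISHED column is still parsed correctly.
stuck_backups() {
  awk '$1 != "NAME" && $NF == "IN_PROGRESS" && $(NF-2) == $(NF-1) { print $1 }'
}

# Example usage against a live cluster (namespace taken from this report):
#   kubectl -n k8ssandra-operator get medusabackups.medusa.k8ssandra.io | stuck_backups
```

Indexing from the last column is a deliberate choice here: a backup that is genuinely still running has no FINISHED value, which would shift fixed column positions.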

The status reported by the Medusa standalone deployment pod shows the same backups as complete:

for mjob in $(kubectl -n k8ssandra-operator get medusabackups.medusa.k8ssandra.io | grep -e  IN_PROGRESS | grep -v full | awk '{print $1}'); do echo "---- $mjob -----" && kubectl -n k8ssandra-operator exec -it  $(kubectl -n k8ssandra-operator get pods -l "app=cluster1-main-medusa-standalone" -o jsonpath='{.items[*].metadata.name}') -- medusa status --backup-name=$mjob 2> /dev/null ; done
---- cluster1-main-1712584800 -----
cluster1-main-1712584800
- Started: 2024-04-08 14:00:16, Finished: 2024-04-08 14:01:02
- 6 nodes completed, 0 nodes incomplete, 0 nodes missing
- 17456 files, 680.02 GB
---- cluster1-main-1712588400 -----
cluster1-main-1712588400
- Started: 2024-04-08 15:00:16, Finished: 2024-04-08 15:01:06
- 6 nodes completed, 0 nodes incomplete, 0 nodes missing
- 17264 files, 680.02 GB

helm -n k8ssandra-operator list
NAME                NAMESPACE           REVISION    UPDATED                                 STATUS      CHART                       APP VERSION
k8ssandra-operator  k8ssandra-operator  6           2023-11-21 21:16:34.937132 -0500 EST    deployed    k8ssandra-operator-1.10.2   1.10.2

kubectl -n k8ssandra-operator exec -it  $(kubectl -n k8ssandra-operator get pods -l "app=cluster1-main-medusa-standalone" -o jsonpath='{.items[*].metadata.name}') -- medusa --version
0.16.3

rzvoncek commented 6 months ago

Hi @zencircle! Thanks for the report! I particularly appreciate that you included the commands you used to get the information.

I'll go for the low-hanging fruit here and ask whether you could try this with the recent Medusa version (0.21.0), and perhaps even a newer k8ssandra-operator. We've made some changes to the backup status reporting because it was indeed buggy.

I'm sorry for pinging it back to you this way, but it might, in fact, be the correct way forward.

zencircle commented 6 months ago

I reduced the frequency of the backup jobs and upgraded to 0.20.1, and I am no longer seeing the issue.

rzvoncek commented 5 months ago

Excellent to hear. I'll go ahead and close the issue. Please feel free to open new ones if you need more help.