RamenDR / ramen


Disabling DR during relocate loses application data #1473

Open nirs opened 1 week ago

nirs commented 1 week ago

If we disable DR during relocate, after ramen has set an empty placementdecision, the application is not running once DR is disabled, and the application data is gone.

How to reproduce

  1. Start the deployment-rbd (from ocm-ramen-samples) on cluster dr2
  2. Enable DR
  3. Wait for the first replication to be reported (non empty lastGroupSyncTime)
  4. Start watching placementdecisions:
    $ watch -n 1 -x kubectl get placementdecisions placement-decision-1 -o jsonpath='{.status.decisions}{"\n"}' -n deployment-rbd --context hub
  5. Start watching drpc status:
    $ kubectl get drpc -A -o wide -w --context hub
  6. Start relocate
  7. When the placement decision becomes empty, delete the drpc:
    $ kubectl delete drpc deployment-rbd-drpc -n deployment-rbd --context hub
    drplacementcontrol.ramendr.openshift.io "deployment-rbd-drpc" deleted
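
Steps 4 and 7 are timing sensitive, so a small script may make the reproduction more reliable. The following is only a sketch: it reuses the resource names and contexts from the commands above, while the helper function names, the polling interval, and the assumption that an empty decisions list is printed as either nothing or `[]` are mine and not taken from ramen.

```shell
#!/bin/sh
# Sketch: automate steps 4 and 7 of the reproduction above.
# Resource names and contexts come from the commands in this report;
# the helper names and the empty-value check are assumptions.

# Print the current placement decisions (same query as step 4).
get_decisions() {
    kubectl get placementdecisions placement-decision-1 \
        -n deployment-rbd --context hub \
        -o jsonpath='{.status.decisions}'
}

# Succeed once the decisions list is empty.
# Assumption: an empty list is printed as nothing or as "[]".
decisions_empty() {
    decisions=$(get_decisions) || return 1
    [ -z "$decisions" ] || [ "$decisions" = "[]" ]
}

# Poll until the decisions are empty, then delete the drpc (step 7).
# $1 is the polling interval in seconds (default 1).
delete_drpc_when_empty() {
    interval=${1:-1}
    until decisions_empty; do
        sleep "$interval"
    done
    kubectl delete drpc deployment-rbd-drpc -n deployment-rbd --context hub
}
```

After starting the relocate (step 6), source the script and run `delete_drpc_when_empty`. The one second default matches the `watch -n 1` interval used in step 4.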

Actual result

drpc status:

$ kubectl get drpc -A -o wide -w --context hub
NAMESPACE        NAME                  AGE   PREFERREDCLUSTER   FAILOVERCLUSTER   DESIREDSTATE   CURRENTSTATE   PROGRESSION   START TIME             DURATION       PEER READY
deployment-rbd   deployment-rbd-drpc   33s   dr2                                                 Deployed       Completed     2024-06-25T11:51:04Z   4.036442809s   True
deployment-rbd   deployment-rbd-drpc   91s   dr2                                                 Deployed       Completed     2024-06-25T11:51:04Z   4.036442809s   True
deployment-rbd   deployment-rbd-drpc   2m31s   dr2                                                 Deployed       Completed     2024-06-25T11:51:04Z   4.036442809s   True
deployment-rbd   deployment-rbd-drpc   2m38s   dr1                                  Relocate       Deployed       Completed     2024-06-25T11:51:04Z   4.036442809s   True
deployment-rbd   deployment-rbd-drpc   2m38s   dr1                                  Relocate       Initiating     PreparingFinalSync   2024-06-25T11:53:15Z                  True
deployment-rbd   deployment-rbd-drpc   3m1s    dr1                                  Relocate       Relocating     RunningFinalSync     2024-06-25T11:53:15Z                  True
deployment-rbd   deployment-rbd-drpc   3m4s    dr1                                  Relocate       Relocating     RunningFinalSync     2024-06-25T11:53:15Z                  True
deployment-rbd   deployment-rbd-drpc   3m4s    dr1                                  Relocate       Deleting       Deleting             2024-06-25T11:53:15Z                  True
deployment-rbd   deployment-rbd-drpc   3m31s   dr1                                  Relocate       Deleting       Deleting             2024-06-25T11:53:15Z                  True

PVs and PVCs:

$ kubectl get pv,pvc -n deployment-rbd --context dr1
NAME                                                        CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                     STORAGECLASS   VOLUMEATTRIBUTESCLASS   REASON   AGE
persistentvolume/pvc-63fe160e-79b1-41aa-8d6b-efa9c6fc2cb0   10Gi       RWO            Delete           Bound    minio/minio-storage-pvc   standard       <unset>                          19h

$ kubectl get pv,pvc -n deployment-rbd --context dr2
NAME                                                        CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                     STORAGECLASS   VOLUMEATTRIBUTESCLASS   REASON   AGE
persistentvolume/pvc-665c8170-6b56-45a3-a8fc-1b50170f0d6d   10Gi       RWO            Delete           Bound    minio/minio-storage-pvc   standard       <unset>                          19h

Expected result

Logs

disbale-dr-during-relocate.tar.gz