RamenDR / ramen

Failover of cephfs discovered app workload fails #1429

Closed: ELENAGER closed this issue 3 months ago

ELENAGER commented 4 months ago

The problem occurs only with discovered apps. After deploying a discovered app workload, we started a failover and it failed. The VRG reports the following error:

failed to add owner/label to snapshot busybox-pvc-16-20240528120125 (cross-namespace owner references are disallowed, owner's namespace openshift-dr-ops, obj's namespace app-busybox-cephfs-1)

The solution is to add the ownerRef only when the VRG is not in the admin namespace. In our case the VRG is in the openshift-dr-ops namespace, which is the admin namespace, so setting the owner must be skipped.

We still add the do-not-delete label to the snapshot, though.
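
A minimal sketch of the rule described above, not the actual Ramen code. The `Snapshot` and `VRG` types, the `markSnapshot` helper, the VRG name `busybox-vrg`, and the exact label key and value are hypothetical simplifications for illustration. The real code works with Kubernetes API objects.

```go
package main

import "fmt"

// Snapshot is a hypothetical, simplified stand-in for VolumeSnapshot metadata.
type Snapshot struct {
	Name      string
	Namespace string
	Labels    map[string]string
	OwnerRefs []string // owner names only; real owner references carry more fields
}

// VRG is a hypothetical, simplified stand-in for a VolumeReplicationGroup.
type VRG struct {
	Name      string
	Namespace string
}

// doNotDeleteLabel is an assumed label key; the key used by Ramen may differ.
const doNotDeleteLabel = "do-not-delete"

// markSnapshot always labels the snapshot, but records the VRG as owner
// only when the VRG is outside the admin namespace.
func markSnapshot(vrg VRG, snap *Snapshot, adminNamespace string) {
	if snap.Labels == nil {
		snap.Labels = map[string]string{}
	}
	snap.Labels[doNotDeleteLabel] = "true"

	// A VRG in the admin namespace protects snapshots in other namespaces,
	// and cross-namespace owner references are disallowed, so skip the owner.
	if vrg.Namespace == adminNamespace {
		return
	}

	snap.OwnerRefs = append(snap.OwnerRefs, vrg.Name)
}

func main() {
	vrg := VRG{Name: "busybox-vrg", Namespace: "openshift-dr-ops"}
	snap := &Snapshot{
		Name:      "busybox-pvc-16-20240528120125",
		Namespace: "app-busybox-cephfs-1",
	}

	markSnapshot(vrg, snap, "openshift-dr-ops")

	fmt.Println("owners:", len(snap.OwnerRefs), "label:", snap.Labels[doNotDeleteLabel])
}
```

With the namespaces from the error message, the snapshot ends up labeled but without an owner, which is the behavior the fix aims for.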

Fixed: https://bugzilla.redhat.com/show_bug.cgi?id=2283633

nirs commented 4 months ago

Thanks! This makes the problem clear. We cannot use ownerRef when the VRG and the owned resources are not in the same namespace.

But this must be in the commit message - not in the PR message which is not part of git history.

The context must be accessible via git log. With the current code, the only info we have is:

commit 918f13e55c6a18ffa8029d7b1a7f700381cc3fee (HEAD -> bug_2283633)
Author: Elena Gershkovich <elenage@il.ibm.com>
Date:   Wed Jun 5 13:43:12 2024 +0300

    Clean up all RD, RS and snapshots, owned by VRG upon deletion

    Signed-off-by: Elena Gershkovich <elenage@il.ibm.com>

commit 6ae3e6ed4dae27e23a1258466553f1bbd7cfd4ca
Author: Elena Gershkovich <elenage@il.ibm.com>
Date:   Thu May 30 09:34:45 2024 +0300

    Failover of cephfs discovered apps workload fails

    Signed-off-by: Elena Gershkovich <elenage@il.ibm.com>

This is not helpful for maintaining this code in the future.

ELENAGER commented 4 months ago

@nirs I changed the commit message to: "Skip setting OwnerRef to VolumeSnapshot when VRG and VolumeSnapshot are not in the same namespace"