RamenDR / ramen

Apache License 2.0

Return information only if primary VRG is found during initial deploy… #1457

Closed BenamarMk closed 3 weeks ago

BenamarMk commented 3 weeks ago

Discussed this PR with @ShyamsundarR and concluded that it is more complex and poses a risk of introducing hidden bugs. Therefore, we will leave it open for future versions. For version 4.16, we intend to implement a straightforward fix that addresses the specific instance of the issue we have identified. A comprehensive, global solution will be considered for later updates.

When the primary cluster is down and a workload targeted for VolSync is still in its initial deployment, the DRPC might mistake the VRG on the secondary cluster for the primary one. This code hasn't changed since VolSync was introduced. The original code expected only one primary VRG between the two clusters. The fix is straightforward: for the initial deployment, only return the cluster if the primary VRG is found; otherwise, return not found.
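
The rule described above can be sketched roughly as follows. This is a minimal illustration and not Ramen's actual code: the `VRG` type, `ReplicationState`, `findPrimaryCluster`, and `ErrPrimaryNotFound` are hypothetical names chosen for the example, and the real DRPC logic works against richer VRG objects fetched from the managed clusters.

```go
package main

import (
	"errors"
	"fmt"
)

// ReplicationState is a simplified, hypothetical stand-in for a VRG's state.
type ReplicationState string

const (
	Primary   ReplicationState = "primary"
	Secondary ReplicationState = "secondary"
)

// VRG is a hypothetical minimal view of a VolumeReplicationGroup.
type VRG struct {
	State ReplicationState
}

// ErrPrimaryNotFound is returned when no visible VRG is primary.
var ErrPrimaryNotFound = errors.New("primary VRG not found")

// findPrimaryCluster returns the name of the cluster whose VRG is primary.
// It never falls back to a secondary VRG: if no primary is visible (for
// example, because the primary cluster is down), it reports not found.
// The sketch assumes at most one primary VRG exists across the clusters.
func findPrimaryCluster(vrgs map[string]*VRG) (string, error) {
	for cluster, vrg := range vrgs {
		if vrg != nil && vrg.State == Primary {
			return cluster, nil
		}
	}

	return "", ErrPrimaryNotFound
}

func main() {
	// The primary cluster is down, so only the secondary's VRG is visible.
	vrgs := map[string]*VRG{"cluster2": {State: Secondary}}

	if _, err := findPrimaryCluster(vrgs); err != nil {
		fmt.Println("initial deployment:", err)
	}
}
```

With this shape, a caller handling the initial deployment can treat the not-found error as "wait and retry" instead of acting on the secondary's VRG.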

ShyamsundarR commented 3 weeks ago

Tested the older behavior in drenv with CephFS/VolSync-backed volumes: post deploy, if the preferred cluster is down and MCV reports errors, the rdspec is cleared out in the Secondary. (I had to manually edit the MCV status to report false, as the upstream MCV status does not change if the cluster is down.)

Patched and repeated the tests as above; the rdspec is retained.