RamenDR / ramen

Apache License 2.0

Upgrade olm to latest working version #1411

Closed nirs closed 1 month ago

nirs commented 1 month ago

We upgraded olm to 0.27 in #1260, but this broke ramen, so we reverted to 0.22 in #1409.

We want to find the most recent olm version that works with ramen, so we don't keep using an ancient version. This will also help us find out why ramen does not work with 0.27, by looking at the changes between the last working version and the first broken version.

Test with olm 0.23, 0.24, 0.25, 0.26:

Alternative: write a self test that fails more quickly, without deploying ramen and running the full dr flow. See #1410

abhijeet219 commented 1 month ago

Hi @nirs, I tried basic-test/run on all of these olm versions: 0.23, 0.24, 0.25, 0.26, 0.27. It passes for all of them.

Logs for the test run of olm v0.27.0:

$ kubectl get -n olm csvs
NAME            DISPLAY          VERSION   REPLACES   PHASE
packageserver   Package Server   v0.27.0              Succeeded
(ramen) [abshakya@abshakya ramen]$ test/basic-test/run test/envs/regional-dr.yaml
2024-06-03 04:15:22,715 INFO    [deploy] Deploying application
2024-06-03 04:15:22,716 INFO    [deploy] Deploying application 'deployment-rbd'
2024-06-03 04:15:24,989 INFO    [deploy] Waiting for 'placement.cluster.open-cluster-management.io/placement' decisions
2024-06-03 04:15:25,308 INFO    [deploy] Application running on cluster 'dr1'
2024-06-03 04:15:25,467 INFO    [enable-dr] Enable DR
2024-06-03 04:15:25,555 INFO    [enable-dr] Disabling OCM scheduling for 'placement.cluster.open-cluster-management.io/placement'
2024-06-03 04:15:25,765 INFO    [enable-dr] Waiting for 'placement.cluster.open-cluster-management.io/placement' decisions
2024-06-03 04:15:26,187 INFO    [enable-dr] waiting for namespace deployment-rbd
2024-06-03 04:15:26,348 INFO    [enable-dr] Waiting until 'drplacementcontrol.ramendr.openshift.io/deployment-rbd-drpc' reports status
2024-06-03 04:15:26,752 INFO    [enable-dr] Waiting for 'drplacementcontrol.ramendr.openshift.io/deployment-rbd-drpc' Available condition
2024-06-03 04:15:26,970 INFO    [enable-dr] Waiting for 'drplacementcontrol.ramendr.openshift.io/deployment-rbd-drpc' PeerReady condition
2024-06-03 04:15:27,183 INFO    [enable-dr] Waiting for 'drplacementcontrol.ramendr.openshift.io/deployment-rbd-drpc' first replication
2024-06-03 04:16:56,253 INFO    [enable-dr] DR enabled
2024-06-03 04:16:56,435 INFO    [failover] Fail over application
2024-06-03 04:16:56,545 INFO    [failover] Waiting for 'drplacementcontrol.ramendr.openshift.io/deployment-rbd-drpc' Available condition
2024-06-03 04:16:56,761 INFO    [failover] Waiting for 'drplacementcontrol.ramendr.openshift.io/deployment-rbd-drpc' PeerReady condition
2024-06-03 04:16:56,979 INFO    [failover] Waiting for 'drplacementcontrol.ramendr.openshift.io/deployment-rbd-drpc' first replication
2024-06-03 04:16:57,184 INFO    [failover] Waiting for 'placement.cluster.open-cluster-management.io/placement' decisions
2024-06-03 04:16:57,554 INFO    [failover] Starting failover for 'drplacementcontrol.ramendr.openshift.io/deployment-rbd-drpc' to cluster 'dr2'
2024-06-03 04:16:57,782 INFO    [failover] Waiting for 'drplacementcontrol.ramendr.openshift.io/deployment-rbd-drpc' Available condition
2024-06-03 04:17:26,429 INFO    [failover] Waiting for 'drplacementcontrol.ramendr.openshift.io/deployment-rbd-drpc' PeerReady condition
2024-06-03 04:20:26,280 INFO    [failover] Waiting for 'drplacementcontrol.ramendr.openshift.io/deployment-rbd-drpc' first replication
2024-06-03 04:20:26,418 INFO    [failover] Application was failed over
2024-06-03 04:20:26,587 INFO    [relocate] Relocate application
2024-06-03 04:20:26,692 INFO    [relocate] Waiting for 'drplacementcontrol.ramendr.openshift.io/deployment-rbd-drpc' Available condition
2024-06-03 04:20:26,919 INFO    [relocate] Waiting for 'drplacementcontrol.ramendr.openshift.io/deployment-rbd-drpc' PeerReady condition
2024-06-03 04:20:27,124 INFO    [relocate] Waiting for 'drplacementcontrol.ramendr.openshift.io/deployment-rbd-drpc' first replication
2024-06-03 04:20:27,326 INFO    [relocate] Waiting for 'placement.cluster.open-cluster-management.io/placement' decisions
2024-06-03 04:20:27,707 INFO    [relocate] Starting relocate for 'drplacementcontrol.ramendr.openshift.io/deployment-rbd-drpc' to cluster 'dr1'
2024-06-03 04:20:27,937 INFO    [relocate] Waiting for 'drplacementcontrol.ramendr.openshift.io/deployment-rbd-drpc' phase 'Relocated'
2024-06-03 04:22:56,351 INFO    [relocate] Waiting for 'drplacementcontrol.ramendr.openshift.io/deployment-rbd-drpc' Available condition
2024-06-03 04:22:56,565 INFO    [relocate] Waiting for 'drplacementcontrol.ramendr.openshift.io/deployment-rbd-drpc' PeerReady condition
2024-06-03 04:23:26,426 INFO    [relocate] Waiting for 'drplacementcontrol.ramendr.openshift.io/deployment-rbd-drpc' first replication
2024-06-03 04:23:26,547 INFO    [relocate] Application was relocated
2024-06-03 04:23:26,707 INFO    [disable-dr] Disable DR
2024-06-03 04:23:26,800 INFO    [disable-dr] Deleting 'drplacementcontrol.ramendr.openshift.io/deployment-rbd-drpc'
2024-06-03 04:23:56,240 INFO    [disable-dr] Enabling OCM scheduling for 'placement.cluster.open-cluster-management.io/placement'
2024-06-03 04:23:56,360 INFO    [disable-dr] DR was disabled
2024-06-03 04:23:56,521 INFO    [undeploy] Deleting application
2024-06-03 04:23:56,521 INFO    [undeploy] Undeploying application 'deployment-rbd'
2024-06-03 04:24:03,885 INFO    [undeploy] Application was deleted
(ramen) [abshakya@abshakya ramen]$

So, should we update the olm version to 0.27 now?

nirs commented 1 month ago

@abhijeet219 We know that 0.27 does not work: it failed in the CI when we tried it, so this test result is likely wrong.

Did you follow all the steps? Did you unconfig and undeploy ramen before testing each new olm version?

If you did, maybe we must recreate the entire environment for each olm version. The easiest way to test this is to submit a pr like #1436 that upgrades olm and uses the versions cache file.

We can bump the version until the CI breaks.
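
A minimal sketch of this bump-until-it-breaks search, in case it helps. It assumes a hypothetical `works(version)` callable that recreates the environment and runs the test for one olm version. Neither `last_working` nor `works` exists in ramen; they are names made up for illustration, and the `__main__` part uses a stand-in instead of a real CI run.

```python
from typing import Callable, Optional, Sequence


def last_working(
    versions: Sequence[str], works: Callable[[str], bool]
) -> Optional[str]:
    """Return the newest version that passes before the first failure.

    versions must be ordered oldest to newest. works is a hypothetical
    callable that sets up a fresh environment and returns True if the
    test passes with that olm version.
    """
    good = None
    for version in versions:
        if not works(version):
            # First broken version found; stop bumping.
            break
        good = version
    return good


if __name__ == "__main__":
    # Stand-in for a real CI run: pretend only 0.27 fails.
    pretend_broken = {"0.27"}
    candidates = ["0.22", "0.23", "0.24", "0.25", "0.26", "0.27"]
    print(last_working(candidates, lambda v: v not in pretend_broken))
```

The search is linear on purpose: it stops at the first broken version, so the pair (last working, first broken) gives the range of olm changes to inspect.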

abhijeet219 commented 1 month ago

Yes, I followed the steps for unconfig and undeploy before testing each new olm version.

Ok. Setting up a new environment each time makes sense, and raising a pr seems to be a good way of doing it. Let's see.