harvester / tests

Harvester test cases
Apache License 2.0

[TEST] v1.3.2 Release Testing: Two nodes upgrade #1502

Closed albinsun closed 1 month ago

albinsun commented 1 month ago

What's the test to develop? Please describe

Three node upgrade w/ 2 MGMT/Default Nodes & 1 Witness Node

Prerequisite and dependency of test

  1. Set up a three-node cluster w/ witness node
  2. VLAN 1 network on mgmt and 1 network on other NICs
  3. 2 Virtual machines with data and md5sum computed: 1 running, 1 stopped
  4. 2 VM backups and snapshots: 1 backup when the VM is running and 1 backup when the VM is stopped
  5. Create a new storage class apart from default one. Use the new storage class for some basic operations.
  6. Import to Rancher and create an RKE2 guest cluster provisioned on Harvester VM before the upgrade.
  7. Deploy Harvester cloud provider to RKE1 Cluster (prior to latest version)
  8. Verify DHCP load balancer service
  9. Install Harvester CSI Driver (prior to latest version)
  10. Create a new Harvester PVC for nginx deployment
  11. Upgrade Rancher
  12. Upgrade Harvester
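
The md5sum recorded in step 3 is what later lets the restored or upgraded VM data be compared against its pre-upgrade state. A minimal sketch of that comparison, assuming the data file is reachable from wherever the check runs; `md5_of_file` and `data_unchanged` are hypothetical helper names, not part of the Harvester test suite:

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large test files do not fill memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def data_unchanged(path, recorded_md5):
    """Compare the current digest with the one recorded before the upgrade."""
    return md5_of_file(path) == recorded_md5.strip().lower()
```

The recorded value can be the first field of the `md5sum` output captured inside the VM before the upgrade.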

Describe the items of the test development (DoD, definition of done) you'd like

  1. Dependencies Check
  2. Virtual machines are in same state as before and accessible.
  3. Restore the backups, check the data
  4. Image and volume status
  5. Monitoring chart status
  6. VM operations are highlighted and working fine.
  7. Import Harvester into a Rancher cluster
  8. Add a node after the upgrade
  9. Upgrade the RKE2 guest cluster to 1.27 or above
  10. Upgrade cloud provider and CSI driver
  11. Verify DHCP load balancer service and create a new Harvester PVC
  12. Shutting off VM and then restarting VM

Additional context

albinsun commented 1 month ago

Prerequisite and dependency of test

  1. :green_circle: Setup 2 nodes v1.3.1
    * hosts
      ![image](https://github.com/user-attachments/assets/dc106462-8d54-4071-9f53-95c02090b8a7)
    * node-0
      ![image](https://github.com/user-attachments/assets/31bff21a-689b-48c6-8f21-0e05c811868e)
    * node-1
      ![image](https://github.com/user-attachments/assets/b1d0aa36-1310-4e21-8fb1-8c8205cac6c2)
  2. :green_circle: Create new storage class w/ replica 2 and set as default
    ![image](https://github.com/user-attachments/assets/8bf4e067-6979-442c-8bf6-bc3f0a92750d)
  3. :green_circle: VLAN 1 network on mgmt and 1 network on other NICs
    * Cluster Network
      ![image](https://github.com/user-attachments/assets/25ace574-207f-4098-b21a-96a63a2fb940)
    * VM Network
      ![image](https://github.com/user-attachments/assets/a9159d56-ccc3-440c-88c0-976844a63f33)
  4. :yellow_circle: 2 Virtual machines with data and md5sum computed: 2 running, 2 stopped
    * Images
      ![image](https://github.com/user-attachments/assets/5ece0b57-44c0-4ec0-a5f6-54bb58b93e1f)
    * VMs
      ![image](https://github.com/user-attachments/assets/18a445f8-e881-4d7c-8772-5786ae9d3963)
      * RHEL 9.4 (cloud-user@ipv4): Running, ens5-vlan1, extra disk
        ![image](https://github.com/user-attachments/assets/8fde4041-7d05-493d-a160-476661546000)
      * SLE Micro 6 [sles]: Stopped, mgmt-vlan1
        ![image](https://github.com/user-attachments/assets/cce50022-e7ca-48ad-ac0d-4dbf00be5933)
      * Windows 11 enterprise: Running, mgmt-vlan1 (TPM, EFI and Secure Boot) 2C / 4G / 64G
        ![image](https://github.com/user-attachments/assets/d9944050-4ac4-4d8f-8c95-cfcaeb7f2b09)
      * Windows 2022 (no rke2 testing needed): Stopped, ens5-vlan1 (TPM, EFI and Secure Boot)
        * Insufficient Storage
          ![image](https://github.com/user-attachments/assets/d9d9a9d1-1482-4d2d-9e5d-fe0f7860c908)

    :warning: Temporarily skipping Windows VMs due to insufficient local storage space. Will test separately on the lab env.

  5. :yellow_circle: 2 VM backups and snapshots: 1 when the VM is running and 1 when the VM is stopped
    * Snapshot
      ![image](https://github.com/user-attachments/assets/95549d8b-d167-4a18-bc23-d872c4b6cd97)
    * Backup
      ![image](https://github.com/user-attachments/assets/d685388f-0ff7-4915-adf7-799a78048765)

    harvester/harvester/issues/6497

  6. :green_circle: Import to Rancher and create an RKE2 (v1.27) guest cluster provisioned on Harvester VM before the upgrade.
    * Import Harvester to Rancher
      ![image](https://github.com/user-attachments/assets/fe78cfb1-a473-4544-be17-6e21e01a2270)
    * Provision RKE2 guest cluster
      ![image](https://github.com/user-attachments/assets/a664c84d-3270-4a5a-ae0a-0345e9c7c3a1)
  7. :green_circle: Deploy Harvester cloud provider to RKE2 Cluster (prior to latest version)
    * rke2-rhel9
      ![image](https://github.com/user-attachments/assets/235caddb-8faf-4e8f-b4c5-cb75399f5b6f)
  8. :green_circle: Install Harvester CSI Driver (prior to latest version)
    * rke2-rhel9
      ![image](https://github.com/user-attachments/assets/0c2eb377-3328-4477-81cb-f571de0e0b6d)
    * rke2-slm6
      ![image](https://github.com/user-attachments/assets/bbd4aff7-6514-46ea-9306-e274c5240af2)
  9. :green_circle: Create a new Harvester PVC for nginx deployment
    * Default storage class
      * rke2-rhel9
        ![image](https://github.com/user-attachments/assets/390e09cf-24cb-4154-b6ed-8cb830bfd2a6)
      * rke2-slm6
        ![image](https://github.com/user-attachments/assets/cc9b6eec-cee5-423f-a9b2-f064235eed56)
    * Create PVC
      * rke2-rhel9
        ![image](https://github.com/user-attachments/assets/3d4272df-2441-4d39-8d4b-03bfdeaf0318)
      * rke2-slm6
        ![image](https://github.com/user-attachments/assets/447870fd-e210-444e-97f3-6905724be57d)
  10. :green_circle: Verify DHCP load balancer service
    * rke2-rhel9
      ![image](https://github.com/user-attachments/assets/c6c6d59c-de42-4116-b6f6-d155798156e6)
      ![image](https://github.com/user-attachments/assets/da9c68be-50c5-455e-a834-5becdd867cb1)
    * rke2-slm6
      ![image](https://github.com/user-attachments/assets/27d5b89a-6cb6-4457-82cf-704c965d7c3e)
      ![image](https://github.com/user-attachments/assets/5bb0d287-7a0a-4cba-b476-315dbeff831e)
  11. :green_circle: Upgrade to Rancher v2.8.6
    * rke2-rhel9
      * RKE2 cluster still running
        ![image](https://github.com/user-attachments/assets/268a38d8-11f5-4c29-9dc8-505a9531f002)
      * nginx still works
        ![image](https://github.com/user-attachments/assets/99384d96-d407-4288-8c33-c187ed3a49ad)
    * rke2-slm6
      * upgrade rancher
        ![image](https://github.com/user-attachments/assets/0ecda5a7-28b5-45a9-bb45-813e1420e4c3)
      * RKE2 cluster still running
        ![image](https://github.com/user-attachments/assets/882d369c-bd37-4be5-be41-15a0c8d29a74)
      * nginx still works
        ![image](https://github.com/user-attachments/assets/f8d79424-81a0-4cac-adb3-6ae5e616175d)

    Ref. https://ranchermanager.docs.rancher.com/getting-started/installation-and-upgrade/install-upgrade-on-a-kubernetes-cluster/upgrades

  12. :green_circle: Upgrade to RKE2 v1.28
    * rke2-rhel9
      * upgrade RKE2 version
        ![image](https://github.com/user-attachments/assets/569bc392-9119-4496-89dd-c81855e1057a)
      * RKE2 cluster still running
        ![image](https://github.com/user-attachments/assets/bc399c29-cb05-4508-8769-832d5ddbacea)
      * nginx still works
        ![image](https://github.com/user-attachments/assets/99846ca1-4de8-48c4-9c88-041c5d8406f1)
        ![image](https://github.com/user-attachments/assets/06222836-8f2b-48c7-afb4-cfe554e47ea0)
    * rke2-slm6
      * upgrade RKE2 version
        ![image](https://github.com/user-attachments/assets/ab8d5dba-0a33-4e83-a48e-96e7f3e55060)
      * RKE2 cluster still running
        ![image](https://github.com/user-attachments/assets/5a6eef0b-540c-4735-9bc9-1e487ca6abc4)
      * nginx still works
        ![image](https://github.com/user-attachments/assets/59f67c70-e484-45ad-92cb-b8438f584e86)

albinsun commented 1 month ago

Upgrade from harvester-v1.3.1 to harvester-v1.3.2-rc2

  1. :green_circle: Prepare an airgapped upgrade ![image](https://github.com/user-attachments/assets/3e2721a3-f55f-4320-8a52-d04fa7b97fdd)

    Ref. https://docs.harvesterhci.io/v1.3/upgrade/index/#prepare-an-air-gapped-upgrade

  2. :green_circle: Trigger Upgrade ![image](https://github.com/user-attachments/assets/39b03cf2-ea78-4f24-b09d-2c2d89d0a835)
  3. :green_circle: Upgrade succeeded ![image](https://github.com/user-attachments/assets/30d98128-1a70-486e-9e4b-a71bc784bbcf)

    :warning: May hit harvester/harvester/issues/6432

albinsun commented 1 month ago

Post-upgrade Checks

  1. :green_circle: Dependencies Check

    ```
    $ python3 check_chart_version.py -cac -ip 192.168.0.30
    ...
    For /home/rancher/harvester_yamls/crd-kubevirt.yaml - Observation: v1 v1alpha3 Expectation: v1 v1alpha3 True
    For /home/rancher/harvester_yamls/harvesterhci.io_addons.yaml - Observation: v1beta1 Expectation: v1beta1 True
    For /home/rancher/harvester_yamls/harvesterhci.io_keypairs.yaml - Observation: v1beta1 Expectation: v1beta1 True
    For /home/rancher/harvester_yamls/harvesterhci.io_preferences.yaml - Observation: v1beta1 Expectation: v1beta1 True
    For /home/rancher/harvester_yamls/harvesterhci.io_settings.yaml - Observation: v1beta1 Expectation: v1beta1 True
    For /home/rancher/harvester_yamls/harvesterhci.io_supportbundles.yaml - Observation: v1beta1 Expectation: v1beta1 True
    For /home/rancher/harvester_yamls/harvesterhci.io_upgradelogs.yaml - Observation: v1beta1 Expectation: v1beta1 True
    For /home/rancher/harvester_yamls/harvesterhci.io_upgrades.yaml - Observation: v1beta1 Expectation: v1beta1 True
    For /home/rancher/harvester_yamls/harvesterhci.io_versions.yaml - Observation: v1beta1 Expectation: v1beta1 True
    For /home/rancher/harvester_yamls/harvesterhci.io_virtualmachinebackups.yaml - Observation: v1beta1 Expectation: v1beta1 True
    For /home/rancher/harvester_yamls/harvesterhci.io_virtualmachineimages.yaml - Observation: v1beta1 Expectation: v1beta1 True
    For /home/rancher/harvester_yamls/harvesterhci.io_virtualmachinerestores.yaml - Observation: v1beta1 Expectation: v1beta1 True
    For /home/rancher/harvester_yamls/harvesterhci.io_virtualmachinetemplates.yaml - Observation: v1beta1 Expectation: v1beta1 True
    For /home/rancher/harvester_yamls/harvesterhci.io_virtualmachinetemplateversions.yaml - Observation: v1beta1 Expectation: v1beta1 True
    For /home/rancher/harvester_yamls/volumesnapshotclasses.yaml - Observation: v1 v1beta1 Expectation: v1 v1beta1 True
    For /home/rancher/harvester_yamls/volumesnapshotcontents.yaml - Observation: v1 v1beta1 Expectation: v1 v1beta1 True
    For /home/rancher/harvester_yamls/volumesnapshots.yaml - Observation: v1 v1beta1 Expectation: v1 v1beta1 True
    For /home/rancher/harvester_yamls/whereabouts.cni.cncf.io_ippools.yaml - Observation: v1alpha1 Expectation: v1alpha1 True
    For /home/rancher/harvester_yamls/whereabouts.cni.cncf.io_overlappingrangeipreservations.yaml - Observation: v1alpha1 Expectation: v1alpha1 True
    ```
  2. :green_circle: Virtual machines are in same state as before and accessible. ![image](https://github.com/user-attachments/assets/ac0f5fdc-bdd1-4e2e-8820-a4cf2ee5a117)
  3. :green_circle: Restore the backups, check the data
    * Restore New
      ![image](https://github.com/user-attachments/assets/9c8ddcc8-d558-4f3e-8725-7f9dcd4d7f57)
    * Restore Replace
      ![image](https://github.com/user-attachments/assets/73745722-9e0b-4ff4-a3bd-efce4089a54f)
      ![image](https://github.com/user-attachments/assets/dcd1e992-4d9f-43c5-a6f4-66e2641a5608)
  4. :green_circle: Image and volume status
    * Image
      ![image](https://github.com/user-attachments/assets/2ccd3f2b-d1a8-453b-a41f-4b479070946b)
    * Volume
      ![image](https://github.com/user-attachments/assets/c74a4e80-09d7-43e1-a1ba-71be6c554c0f)
  5. :green_circle: VM operations are highlighted and working fine.
    - [x] Restart
    - [x] Soft Reboot
    - [x] Migrate to default node
    - [x] Add Volume
  6. :green_circle: RKE2 guest cluster still works
    * cluster
      ![image](https://github.com/user-attachments/assets/52bc8312-3366-417a-8a99-ccd31016f152)
    * LB
      ![image](https://github.com/user-attachments/assets/9fd237cf-575f-4ce5-a88f-297ce5fb3ff8)
  7. :green_circle: Upgrade cloud provider and CSI driver
    ![image](https://github.com/user-attachments/assets/0eb2f8ee-c752-42ea-8fe0-59f3dfe8f596)
    * cloud-provider
      ![image](https://github.com/user-attachments/assets/afc2cfcc-9e2a-4198-b6df-6dab32bd8bbf)
    * csi-driver
      ![image](https://github.com/user-attachments/assets/834bccaf-4598-47f0-9cc9-8eb4407d3b4f)
  8. :green_circle: Verify DHCP load balancer service and create a new Harvester PVC
    * LB
      ![image](https://github.com/user-attachments/assets/9fd237cf-575f-4ce5-a88f-297ce5fb3ff8)
    * New PVC
      ![image](https://github.com/user-attachments/assets/25acd526-e228-4b32-8370-df31fb7e0da2)
  9. :green_circle: Shutting off VM and then restarting VM
    - [x] Stop
    - [x] Start
  10. :green_circle: Add a (Default) node after the upgrade ![image](https://github.com/user-attachments/assets/efb049c0-e817-48a5-bd1b-11242d2eb4c4)
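
The Dependencies Check output in item 1 is long enough that a mismatch could be overlooked by eye. A small sketch that scans saved output for failing entries, assuming each entry follows the `For <path> - Observation: ... Expectation: ... True|False` shape seen in the pasted log; the actual output format of `check_chart_version.py` is not documented here, and the helper names are hypothetical:

```python
import re

# Pattern inferred from the pasted log, not from the script itself.
ENTRY = re.compile(
    r"For\s+(?P<path>\S+)\s+-\s+Observation:\s+(?P<observed>.*?)"
    r"\s+Expectation:\s+(?P<expected>.*?)\s+(?P<result>True|False)\b",
    re.DOTALL,
)


def parse_results(text):
    """Turn the check output into a list of per-CRD result dictionaries."""
    return [
        {
            "path": match.group("path"),
            "observed": match.group("observed").split(),
            "expected": match.group("expected").split(),
            "passed": match.group("result") == "True",
        }
        for match in ENTRY.finditer(text)
    ]


def mismatches(text):
    """Return the paths whose result is False or whose versions differ."""
    return [
        entry["path"]
        for entry in parse_results(text)
        if not entry["passed"] or entry["observed"] != entry["expected"]
    ]
```

Calling `mismatches()` on the saved output returns an empty list when every entry passes, which matches the all-`True` log above.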
albinsun commented 1 month ago

Windows VMs will be tested separately.

albinsun commented 1 month ago

FYI, tried to reproduce harvester/harvester#6432 3 times but did not hit it.

Note that we can refer to the workaround in harvester/docs/pull/635 if we hit it again.


albinsun commented 3 weeks ago

Update RHEL9 result on harvester-v1.3.2 + rancher-v2.7.15 (v1.27) -> rancher-v2.8.6 (v1.28)

  1. :green_circle: Deploy Harvester cloud provider to RKE2 Cluster (prior to latest version)
    * rke2-rhel9
      ![image](https://github.com/user-attachments/assets/235caddb-8faf-4e8f-b4c5-cb75399f5b6f)
  2. :green_circle: Install Harvester CSI Driver (prior to latest version)
    * rke2-rhel9
      ![image](https://github.com/user-attachments/assets/0c2eb377-3328-4477-81cb-f571de0e0b6d)
    * rke2-slm6
      ![image](https://github.com/user-attachments/assets/bbd4aff7-6514-46ea-9306-e274c5240af2)
  3. :green_circle: Create a new Harvester PVC for nginx deployment
    * Default storage class
      * rke2-rhel9
        ![image](https://github.com/user-attachments/assets/390e09cf-24cb-4154-b6ed-8cb830bfd2a6)
      * rke2-slm6
        ![image](https://github.com/user-attachments/assets/cc9b6eec-cee5-423f-a9b2-f064235eed56)
    * Create PVC
      * rke2-rhel9
        ![image](https://github.com/user-attachments/assets/3d4272df-2441-4d39-8d4b-03bfdeaf0318)
      * rke2-slm6
        ![image](https://github.com/user-attachments/assets/447870fd-e210-444e-97f3-6905724be57d)
  4. :green_circle: Verify DHCP load balancer service
    * rke2-rhel9
      ![image](https://github.com/user-attachments/assets/c6c6d59c-de42-4116-b6f6-d155798156e6)
      ![image](https://github.com/user-attachments/assets/da9c68be-50c5-455e-a834-5becdd867cb1)
    * rke2-slm6
      ![image](https://github.com/user-attachments/assets/27d5b89a-6cb6-4457-82cf-704c965d7c3e)
      ![image](https://github.com/user-attachments/assets/5bb0d287-7a0a-4cba-b476-315dbeff831e)
  5. :green_circle: Upgrade to Rancher v2.8.6
    * rke2-rhel9
      * RKE2 cluster still running
        ![image](https://github.com/user-attachments/assets/268a38d8-11f5-4c29-9dc8-505a9531f002)
      * nginx still works
        ![image](https://github.com/user-attachments/assets/99384d96-d407-4288-8c33-c187ed3a49ad)
    * rke2-slm6
      * upgrade rancher
        ![image](https://github.com/user-attachments/assets/0ecda5a7-28b5-45a9-bb45-813e1420e4c3)
      * RKE2 cluster still running
        ![image](https://github.com/user-attachments/assets/882d369c-bd37-4be5-be41-15a0c8d29a74)
      * nginx still works
        ![image](https://github.com/user-attachments/assets/f8d79424-81a0-4cac-adb3-6ae5e616175d)

    Ref. https://ranchermanager.docs.rancher.com/getting-started/installation-and-upgrade/install-upgrade-on-a-kubernetes-cluster/upgrades

  6. :green_circle: Upgrade to RKE2 v1.28
    * rke2-rhel9
      * upgrade RKE2 version
        ![image](https://github.com/user-attachments/assets/569bc392-9119-4496-89dd-c81855e1057a)
      * RKE2 cluster still running
        ![image](https://github.com/user-attachments/assets/bc399c29-cb05-4508-8769-832d5ddbacea)
      * nginx still works
        ![image](https://github.com/user-attachments/assets/99846ca1-4de8-48c4-9c88-041c5d8406f1)
        ![image](https://github.com/user-attachments/assets/06222836-8f2b-48c7-afb4-cfe554e47ea0)
    * rke2-slm6
      * upgrade RKE2 version
        ![image](https://github.com/user-attachments/assets/ab8d5dba-0a33-4e83-a48e-96e7f3e55060)
      * RKE2 cluster still running
        ![image](https://github.com/user-attachments/assets/5a6eef0b-540c-4735-9bc9-1e487ca6abc4)
      * nginx still works
        ![image](https://github.com/user-attachments/assets/59f67c70-e484-45ad-92cb-b8438f584e86)