harvester / tests

Harvester test cases

v1.1.3 Release Testing - Three Node Upgrade (1.1.1 → 1.1.3) #998

Closed · albinsun closed 11 months ago

albinsun commented 11 months ago

Ref. https://confluence.suse.com/display/HARV/v1.1.3+Release+Testing+Plan

Prerequisites:

  1. VLAN 1 network on mgmt and 1 network on the other NIC
  2. 2 virtual machines with data and md5sum computed: 1 running, 1 stopped (a checksum sketch follows this list)
  3. 2 VM backups, snapshots: 1 backup when the VM is running and 1 backup when the VM is stopped
  4. Create a new storage class apart from the default one. Use the new storage class for some basic operations.
  5. Import Harvester in Rancher 2.6.13
  6. Have an RKE2 guest cluster (version 1.23) provisioned on Harvester VMs before the upgrade.
  7. Deploy Harvester cloud provider (a version prior to the latest)
    • Verify DHCP load balancer service
  8. Install Harvester CSI Driver (a version prior to the latest)
    • Create a new Harvester PVC for nginx deployment
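
For prerequisite 2, a minimal sketch of the data/checksum step (run inside each guest VM; the file path and size are illustrative):

```
# Before the upgrade: write some test data and record its checksum
dd if=/dev/urandom of=~/test.bin bs=1M count=100
md5sum ~/test.bin | tee ~/test.bin.md5

# After the upgrade (and after restoring a backup): verify the data is intact
md5sum -c ~/test.bin.md5
```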

Post upgrade checks:

  1. Dependencies Check
  2. Virtual machines are in the same state as before and accessible.
  3. Restore the backups, check the data
  4. Image and volume status
  5. Monitoring chart status
  6. VM operations work as expected.
  7. Add/join a node after the upgrade
  8. Upgrade the RKE2 guest cluster to K8s 1.24
  9. Upgrade cloud provider and CSI driver
  10. Verify DHCP load balancer service and create a new Harvester PVC

2nd round: with attached disk

albinsun commented 11 months ago

Prerequisites

  1. Set up a 3-node Harvester v1.1.1 cluster. :ballot_box_with_check:

(screenshot)

  2. VLAN 1 network on mgmt and 1 network on the other NIC. :ballot_box_with_check:

(screenshot)

  3. 2 virtual machines with data and md5sum computed: 1 running, 1 stopped. :ballot_box_with_check:

(screenshot)

  4. 2 VM backups, snapshots: 1 backup when the VM is running and 1 backup when the VM is stopped. :ballot_box_with_check:

Backups (screenshot); Snapshots (screenshot)

  5. Create a new storage class apart from the default one. Use the new storage class for some basic operations. :ballot_box_with_check: (a StorageClass sketch follows this list)

     | tag  | node0 | node1 | node2 |
     |------|-------|-------|-------|
     | host | main  | main  | -     |
     | disk | mysc  | mysc  | mysc  |

     (screenshots)

  6. Import Harvester in Rancher 2.6.13. :ballot_box_with_check:

(screenshot)

  7. Have an RKE2 guest cluster (version 1.23) provisioned on Harvester VMs before the upgrade. :ballot_box_with_check:

(screenshots)

  8. Install Harvester CSI Driver (a version prior to the latest). :ballot_box_with_check:

    • Create a new Harvester PVC for nginx deployment

(screenshots)

  9. Deploy Harvester cloud provider (a version prior to the latest). :ballot_box_with_check:

    • Verify DHCP load balancer service

(screenshot)
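
The storage class in prerequisite 5 can be sketched roughly as below, assuming Longhorn's `nodeSelector`/`diskSelector` StorageClass parameters and the `main`/`mysc` tags from the table above (the name and replica count are illustrative):

```yaml
# mysc.yaml: a StorageClass that places replicas only on tagged nodes/disks
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: mysc
provisioner: driver.longhorn.io
parameters:
  numberOfReplicas: "2"      # node2 has no "main" host tag, so only 2 nodes qualify
  staleReplicaTimeout: "30"
  nodeSelector: "main"       # Longhorn node tag from the table above
  diskSelector: "mysc"       # Longhorn disk tag from the table above
```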

albinsun commented 11 months ago

Upgrade

Known issue

  1. :warning: https://github.com/harvester/harvester/issues/3216
    • At least in a qemu/kvm (ipxe-example) env., there is a high chance of hitting it (possibly more than once).
    • This is a problem because v1.1.1 users have to apply the workaround manually.

Result

Basically, the cluster can be upgraded to v1.1.3-rc2 with a manual workaround for #3216,

![image](https://github.com/harvester/tests/assets/2773781/c4cfd01d-29e1-43ee-b644-8019322e036c)

but Node 2's storage became Unschedulable after the upgrade; it is not clear whether this is a side effect of the manual workaround. It returned to Healthy after a node reboot, and we will keep observing it in further tests. supportbundle_2bdf7d98-62a9-44e7-a89d-1cfd763493e2_2023-12-12T16-12-08Z.zip

![image](https://github.com/harvester/tests/assets/2773781/e0eee58f-22d2-44f9-b555-3625f7708efb)
![image](https://github.com/harvester/tests/assets/2773781/ba6f7e9a-1f91-4c78-b8bf-e5f150832377)

After node reboot (Maintenance -> Reboot -> Unmaintenance):

![image](https://github.com/harvester/tests/assets/2773781/642c7d16-113a-4d17-8748-1848e57e6648)
![image](https://github.com/harvester/tests/assets/2773781/bc254c49-78df-4ef9-99a6-0a36c8b03ad4)

Note

  1. https://docs.harvesterhci.io/v1.1/upgrade/index#prepare-an-air-gapped-upgrade (a `Version` CR sketch follows this note)
  2. Extend the upgrade time limit (for virtual env.)

    ```
    $ cat > /tmp/fix.yaml <<EOF
    spec:
      values:
        systemUpgradeJobActiveDeadlineSeconds: "3600"
    EOF

    $ kubectl patch managedcharts.management.cattle.io local-managed-system-upgrade-controller --namespace fleet-local --patch-file=/tmp/fix.yaml --type merge && kubectl -n cattle-system rollout restart deploy/system-upgrade-controller
    ```
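
For note 1, preparing an air-gapped upgrade amounts to serving the ISO locally and creating a `Version` resource that points at it (values below are illustrative; see the linked doc for the exact fields):

```yaml
# version.yaml: make the target release visible to the cluster
apiVersion: harvesterhci.io/v1beta1
kind: Version
metadata:
  name: v1.1.3
  namespace: harvester-system
spec:
  isoURL: http://<local-http-server>/harvester-v1.1.3-amd64.iso
  # isoChecksum: '<SHA-512 checksum of the ISO>'  # optional but recommended
```

Apply it with `kubectl create -f version.yaml`; the release then appears as an upgrade candidate in the UI.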
albinsun commented 11 months ago

Post Upgrade Checks

Dependencies Check

Ref. https://github.com/harvester/harvester/issues/4670#issuecomment-1844628033

  1. longhorn 1.3.3 :ballot_box_with_check:
      * Code: `1.3.3` https://github.com/harvester/harvester/blob/v1.1.3-rc2/deploy/charts/harvester/Chart.yaml#L41
      * Env.: `1.3.3`
        ```
        # kubectl get deployments -A -o jsonpath='{range .items[*]}{@.metadata.name}{" "}{@.spec.template.spec.containers[*].image}{"\n"}{end}' | grep longhorn-
        longhorn-admission-webhook longhornio/longhorn-manager:v1.3.3
        longhorn-conversion-webhook longhornio/longhorn-manager:v1.3.3
        longhorn-driver-deployer longhornio/longhorn-manager:v1.3.3
        longhorn-ui longhornio/longhorn-ui:v1.3.3
        ```
  2. kubevirt 0.54.0-150400.3.23.1 :ballot_box_with_check:
      * Code: `0.54.0-150400.3.23.1` https://github.com/harvester/harvester/blob/v1.1.3-rc2/deploy/charts/harvester/values.yaml#L30
      * Env.: `0.54.0-150400.3.23.1`
        ```
        # kubectl get deployments -A -o jsonpath='{range .items[*]}{@.metadata.name}{" "}{@.spec.template.spec.containers[*].image}{"\n"}{end}' | grep virt
        virt-api registry.suse.com/suse/sles/15.4/virt-api:0.54.0-150400.3.23.1
        virt-controller registry.suse.com/suse/sles/15.4/virt-controller:0.54.0-150400.3.23.1
        virt-operator registry.suse.com/suse/sles/15.4/virt-operator:0.54.0-150400.3.23.1
        ```
  3. kvmdp 2.5.4.2 :x:
      * Code: `2.5.4.2` https://github.com/harvester/harvester/blob/v1.1.3-rc2/pkg/data/template.go#L509
      * Env.: `2.5.3`
        ```
        # kubectl describe VirtualMachineTemplateVersion/windows-iso-image-base-version -n harvester-public | grep Image
        Image: registry.suse.com/suse/vmdp/vmdp:2.5.3
        ```
      * Note
        1. Single-node fresh install
           ```
           harvester-node-0:~ # kubectl describe VirtualMachineTemplateVersion/windows-iso-image-base-version -n harvester-public | grep Image
           Image: registry.suse.com/suse/vmdp/vmdp:2.5.4.2
           ```
        2. Single-node upgrade
           Before (`v1.1.1`)
           ```
           harvester-node-0:~ # kubectl describe VirtualMachineTemplateVersion/windows-iso-image-base-version -n harvester-public | grep Image
           Image: registry.suse.com/suse/vmdp/vmdp:2.5.3
           ```
           After (`v1.1.3-rc1`)
           ```
           harvester-node-0:~ # kubectl describe VirtualMachineTemplateVersion/windows-iso-image-base-version -n harvester-public | grep Image
           Image: registry.suse.com/suse/vmdp/vmdp:2.5.3
           ```
  4. harvester-network-controller 0.3.5 :ballot_box_with_check:
      * Code: `0.3.5` https://github.com/harvester/harvester/blob/v1.1.3-rc2/deploy/charts/harvester/values.yaml#L368
      * Env.: `0.3.5`
        ```
        # kubectl get deployments -A -o jsonpath='{range .items[*]}{@.metadata.name}{" "}{@.spec.template.spec.containers[*].image}{"\n"}{end}' | grep network-
        harvester-network-controller-manager rancher/harvester-network-controller:v0.3.5
        harvester-network-webhook rancher/harvester-network-webhook:v0.3.5
        ```
  5. harvester-node-disk-manager 0.5.2 :ballot_box_with_check:
      * Code: `0.5.2` https://github.com/harvester/harvester/blob/v1.1.3-rc2/deploy/charts/harvester/values.yaml#L391
      * Env.: `0.5.2`
        ```
        harvester-node-0:~ # kubectl get pods --all-namespaces -o jsonpath='{range .items[*]}{"\n"}{.metadata.name}{":\t"}{range .spec.containers[*]}{.image}{", "}{end}{end}' | grep disk-manager
        harvester-node-disk-manager-4l7jp: rancher/harvester-node-disk-manager:v0.5.2,
        harvester-node-disk-manager-gbk5z: rancher/harvester-node-disk-manager:v0.5.2,
        harvester-node-disk-manager-tf5zs: rancher/harvester-node-disk-manager:v0.5.2,
        ```
  6. kube-vip 0.5.10 :ballot_box_with_check:
      * Code: `0.5.10` https://github.com/harvester/harvester/blob/v1.1.3-rc2/deploy/charts/harvester/values.yaml#L461
      * Env.: `0.5.10`
        ```
        # kubectl get pods --all-namespaces -o jsonpath='{range .items[*]}{"\n"}{.metadata.name}{":\t"}{range .spec.containers[*]}{.image}{", "}{end}{end}' | grep kube-vip:
        kube-vip-5nvhc: ghcr.io/kube-vip/kube-vip:v0.5.10,
        kube-vip-br5mr: ghcr.io/kube-vip/kube-vip:v0.5.10,
        kube-vip-dn5mw: ghcr.io/kube-vip/kube-vip:v0.5.10,
        ```
  7. kube-vip-cloud-provider 0.0.1 :ballot_box_with_check:
      * Code: `0.0.1` https://github.com/harvester/harvester/blob/v1.1.3-rc2/deploy/charts/harvester/values.yaml#L469
      * Env.: `0.0.1`
        ```
        # kubectl get pods --all-namespaces -o jsonpath='{range .items[*]}{"\n"}{.metadata.name}{":\t"}{range .spec.containers[*]}{.image}{", "}{end}{end}' | grep kube-vip-cloud
        kube-vip-cloud-provider-0: kubevip/kube-vip-cloud-provider:v0.0.1,
        ```
  8. harvester-load-balancer 0.1.6 :ballot_box_with_check:
      * Code: `0.1.6` https://github.com/harvester/harvester/blob/v1.1.3-rc2/deploy/charts/harvester/values.yaml#L455
      * Env.: `0.1.6`
        ```
        # kubectl get deployments -A -o jsonpath='{range .items[*]}{@.metadata.name}{" "}{@.spec.template.spec.containers[*].image}{"\n"}{end}' | grep balancer
        harvester-load-balancer rancher/harvester-load-balancer:v0.1.6
        ```
  9. whereabouts 0.6.2-amd64 :ballot_box_with_check:
      * Code: `0.6.2-amd64` https://github.com/harvester/harvester/blob/v1.1.3-rc2/deploy/charts/harvester/values.yaml#L488
      * Env.: `0.6.2-amd64`
        ```
        # kubectl get pods --all-namespaces -o jsonpath='{range .items[*]}{"\n"}{.metadata.name}{":\t"}{range .spec.containers[*]}{.image}{", "}{end}{end}' | grep whereabouts
        harvester-whereabouts-g2gk9: ghcr.io/k8snetworkplumbingwg/whereabouts:v0.6.2-amd64,
        harvester-whereabouts-mws67: ghcr.io/k8snetworkplumbingwg/whereabouts:v0.6.2-amd64,
        harvester-whereabouts-sz9p4: ghcr.io/k8snetworkplumbingwg/whereabouts:v0.6.2-amd64,
        ```
  10. harvester-node-manager 0.1.8 :ballot_box_with_check:
      * Code: `0.1.8` https://github.com/harvester/harvester/blob/v1.1.3-rc2/deploy/charts/harvester/values.yaml#L494
      * Env.: `0.1.8`
        ```
        # kubectl get pods --all-namespaces -o jsonpath='{range .items[*]}{"\n"}{.metadata.name}{":\t"}{range .spec.containers[*]}{.image}{", "}{end}{end}' | grep node-manager
        harvester-node-manager-f9klc: rancher/harvester-node-manager:v0.1.8,
        harvester-node-manager-ppsss: rancher/harvester-node-manager:v0.1.8,
        harvester-node-manager-zqw28: rancher/harvester-node-manager:v0.1.8,
        ```
  11. support-bundle-kit 0.0.33 :ballot_box_with_check:
      * Code: `0.0.33` https://github.com/harvester/harvester/blob/v1.1.3-rc2/deploy/charts/harvester/values.yaml#L475
      * Env.: `0.0.33`
        ```
        # kubectl get deployments -A -o jsonpath='{range .items[*]}{@.metadata.name}{" "}{@.spec.template.spec.containers[*].image}{"\n"}{end}' | grep support
        supportbundle-manager-bundle-0vw5f rancher/support-bundle-kit:v0.0.33
        ```
  12. pcidevice (Addon) 0.2.6 :ballot_box_with_check:
      * Code: `0.2.6` https://github.com/harvester/harvester/blob/v1.1.3-rc2/package/upgrade/addons/pcidevices-controller.yaml
      * Env.: `0.2.6`
        ```
        # kubectl get pods --all-namespaces -o jsonpath='{range .items[*]}{"\n"}{.metadata.name}{":\t"}{range .spec.containers[*]}{.image}{", "}{end}{end}' | grep harvester-pci
        harvester-pcidevices-controller-8bwfd: rancher/harvester-pcidevices:v0.2.6,
        harvester-pcidevices-controller-dwwcw: rancher/harvester-pcidevices:v0.2.6,
        harvester-pcidevices-controller-wtzpx: rancher/harvester-pcidevices:v0.2.6,
        ```
  13. vm-import-controller (Addon) 0.1.6 :ballot_box_with_check:
      * Code: `0.1.6` https://github.com/harvester/harvester/blob/v1.1.3-rc2/package/upgrade/addons/vm-import-controller.yaml
      * Env.: `0.1.6`
        ```
        # kubectl get deployments -A -o jsonpath='{range .items[*]}{@.metadata.name}{" "}{@.spec.template.spec.containers[*].image}{"\n"}{end}' | grep vm-import-controller
        harvester-vm-import-controller rancher/harvester-vm-import-controller:v0.1.6
        ```
  14. monitoring (Rancher Chart) 100.1.0+up19.0.3 :ballot_box_with_check:
      * Code: `100.1.0+up19.0.3` https://github.com/harvester/harvester-installer/blob/v1.1.3-rc2/scripts/version-monitoring
      * Env.: `100.1.0+up19.0.3`
        ![image](https://github.com/harvester/tests/assets/2773781/58ae6e37-c141-4069-9312-56a12c44ff97)
  15. logging (Rancher Chart) 100.1.3+up3.17.7 :ballot_box_with_check:
      * Code: `100.1.3+up3.17.7` https://github.com/harvester/harvester-installer/blob/v1.1.3-rc2/scripts/version-logging
      * Env.: `100.1.3+up3.17.7`
        ![image](https://github.com/harvester/tests/assets/2773781/bb87483b-f67a-4a7b-90c5-244a15c97947)
  16. Rancher 2.6.13 :ballot_box_with_check:
      * Code: `2.6.13` https://github.com/harvester/harvester-installer/blob/v1.1.3-rc2/scripts/version-rancher
      * Env.: `2.6.13`
        ```
        # kubectl get deployments -A -o jsonpath='{range .items[*]}{@.metadata.name}{" "}{@.spec.template.spec.containers[*].image}{"\n"}{end}' | grep rancher:
        rancher rancher/rancher:v2.6.13
        ```
  17. OS 1.1-20231206 :ballot_box_with_check:
      * Code: `1.1-20231206` https://github.com/harvester/harvester-installer/blob/v1.1.3-rc2/scripts/package-harvester-os#L20
      * Env.: `1.1-20231206`
        ```
        harvester-node-0:~ # cat /etc/*ease
        NAME="SLE Micro"
        VERSION="5.3"
        VERSION_ID="5.3"
        PRETTY_NAME="Harvester v1.1.3-rc2"
        ID="sle-micro-rancher"
        ID_LIKE="suse"
        ANSI_COLOR="0;32"
        CPE_NAME="cpe:/o:suse:sle-micro-rancher:5.3"
        VARIANT="Harvester"
        VARIANT_ID="Harvester-v1.1-20231206"
        GRUB_ENTRY_NAME="Harvester v1.1.3-rc2"
        ```
  18. Go 1.20.10-bookworm :ballot_box_with_check:
      * Code: `1.20.10-bookworm`
        * https://github.com/harvester/harvester/blob/v1.1.3-rc2/Dockerfile.dapper#L4
        * https://github.com/harvester/harvester/blob/ea43b43c360d4a38de823e9d691612967a3547cf/Dockerfile.dapper#L4
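
For reference, all of the environment checks above use one of the two jsonpath one-liners below (copied verbatim from the items):

```
# List every Deployment with its container image(s)
kubectl get deployments -A -o jsonpath='{range .items[*]}{@.metadata.name}{" "}{@.spec.template.spec.containers[*].image}{"\n"}{end}'

# List every Pod with its container image(s) (for DaemonSet-managed components)
kubectl get pods --all-namespaces -o jsonpath='{range .items[*]}{"\n"}{.metadata.name}{":\t"}{range .spec.containers[*]}{.image}{", "}{end}{end}'
```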

Operations

  1. Virtual machines are in the same state as before and accessible. :ballot_box_with_check:
      * State as before upgrade
        ![image](https://github.com/harvester/tests/assets/2773781/c614eab8-ddf9-4bee-835a-ffa288b2c98c)
      * VM accessible and data checked
        ![image](https://github.com/harvester/tests/assets/2773781/0b70c434-be2d-41d0-83c2-bbe7cec08c79)
        ![image](https://github.com/harvester/tests/assets/2773781/a7c04759-c906-4d08-8bc3-3c5da64678f4)
  2. Restore the backups, check the data :ballot_box_with_check:
      * Restore
        ![image](https://github.com/harvester/tests/assets/2773781/2db889cc-ae84-453f-b66b-98a680d1b6fa)
        ![image](https://github.com/harvester/tests/assets/2773781/67ccf67f-579a-464f-9611-dfbfeedef48d)
      * Check data
        ![image](https://github.com/harvester/tests/assets/2773781/b55ceec1-db88-49a2-b039-fb349b43589d)
  3. Image and volume status :ballot_box_with_check:
      * Volume
        ![image](https://github.com/harvester/tests/assets/2773781/74711e1e-d0e7-4cb9-90a6-702325fc5529)
      * Image
        ![image](https://github.com/harvester/tests/assets/2773781/5ee55bfc-136b-4f63-98b9-8e720342faa3)
      * Backup
        ![image](https://github.com/harvester/tests/assets/2773781/a2aa3bc4-d084-4b23-b4da-dbb8b467ae13)
      * Snapshot
        ![image](https://github.com/harvester/tests/assets/2773781/ee838633-a85d-47ee-ab48-a958e7b41d6e)
  4. Monitoring chart status :ballot_box_with_check:
      ![image](https://github.com/harvester/tests/assets/2773781/83a8dc90-f6f4-4528-bc13-bad6ff71921d)
  5. VM operations work as expected. :ballot_box_with_check:
      * Pause / Unpause
        ![image](https://github.com/harvester/tests/assets/2773781/56a20f34-8b02-4cab-b881-5db0ac2651fb)
        ![image](https://github.com/harvester/tests/assets/2773781/37aaebdc-4fe2-4316-894e-f190ab867467)
      * Migrate
        ![image](https://github.com/harvester/tests/assets/2773781/3c0e4bb3-1dde-4fce-8948-0df0a9ff37ae)
        ![image](https://github.com/harvester/tests/assets/2773781/01a63122-d18c-4536-ab1f-c51276761559)
      * Soft Reboot
        ![image](https://github.com/harvester/tests/assets/2773781/91ae71ac-d619-48ba-8c47-bf03389b3263)
        ![image](https://github.com/harvester/tests/assets/2773781/b3a5b333-82be-445a-a8bb-88898cb93c6c)
      * Web Console
        ![image](https://github.com/harvester/tests/assets/2773781/68655684-5e22-4625-92a9-b0aeafc7942a)
  6. Add/join a node after the upgrade :ballot_box_with_check:
      ![image](https://github.com/harvester/tests/assets/2773781/7dd54a93-90a1-4f3b-b2ae-3300baaa8dbc)
      ![image](https://github.com/harvester/tests/assets/2773781/b371ca6a-2d3a-4bbb-9985-9905b02bed7e)
  7. Upgrade the RKE2 guest cluster to K8s 1.24 :ballot_box_with_check:
      ![image](https://github.com/harvester/tests/assets/2773781/ff396c06-2369-4ffe-8986-632c29165233)
      ![image](https://github.com/harvester/tests/assets/2773781/8ed59a5d-7d60-4580-848f-a546b1b19ca4)
      ![image](https://github.com/harvester/tests/assets/2773781/ee4d53fc-b622-4dc0-9a85-d6084b146521)
      RKE2 successfully upgraded to `v1.24`
      ![image](https://github.com/harvester/tests/assets/2773781/c74aef69-cf7c-4dee-8d7f-d9e2e012c721)
      ![image](https://github.com/harvester/tests/assets/2773781/b6a22c0a-7014-4656-a920-d302b2089448)
      LB and Deployment work
      ![image](https://github.com/harvester/tests/assets/2773781/9390b7c6-43e8-41b4-813e-c63bcdbcb25e)
  8. Upgrade cloud provider and CSI driver :ballot_box_with_check:
      ![image](https://github.com/harvester/tests/assets/2773781/d38ce8bf-d332-4059-8782-e6b7bf29bff0)
      ![image](https://github.com/harvester/tests/assets/2773781/51ec4881-aed3-4cc4-9dcf-b8eaa8820bc6)
      ![image](https://github.com/harvester/tests/assets/2773781/8897f33a-bfda-4ff8-b76b-899481acfbc4)
  9. Verify DHCP load balancer service and create a new Harvester PVC :ballot_box_with_check:
      ![image](https://github.com/harvester/tests/assets/2773781/c24e0322-a8ee-416a-b310-b49fdd0a83e1)
      Add new PVC (Before)
      ![image](https://github.com/harvester/tests/assets/2773781/982177d5-a7ba-4da3-bce9-960d3c8d0717)
      ![image](https://github.com/harvester/tests/assets/2773781/8fbcb2d5-cd62-4dc1-b47b-de7e80ba755e)
      Add new PVC (After)
      ![image](https://github.com/harvester/tests/assets/2773781/a90acadd-0772-46a8-a3e3-6f48d1fc5b42)
      ![image](https://github.com/harvester/tests/assets/2773781/44ff9b97-31c6-4dea-9feb-b781124a740c)
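
A minimal sketch of the two objects verified in item 9, created in the RKE2 guest cluster (names and the `app: nginx` selector are illustrative; the annotation assumes the Harvester cloud provider's DHCP IPAM mode and the CSI driver's default `harvester` storage class):

```yaml
# A LoadBalancer Service that should receive a DHCP address from the Harvester cloud provider
apiVersion: v1
kind: Service
metadata:
  name: nginx-lb
  annotations:
    cloudprovider.harvesterhci.io/ipam: dhcp
spec:
  type: LoadBalancer
  selector:
    app: nginx
  ports:
    - port: 80
      targetPort: 80
---
# A PVC backed by the Harvester CSI driver, to be mounted by the nginx deployment
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nginx-pvc
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: harvester
  resources:
    requests:
      storage: 1Gi
```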