kubermatic / kubermatic

Kubermatic Kubernetes Platform - the Central Kubernetes Management Platform For Any Infrastructure
https://www.kubermatic.com

Azure cluster provisioning and upgrading - Test Release 2.25 #13033

Closed: embik closed this issue 7 months ago

embik commented 9 months ago

## General Instructions

Run the script at kubermatic/infra/blob/main/clusters/kkp-qa-env/conformance-tester.sh, for example as `PROVIDERS=azure conformance-tester.sh`.

## Checklist

### Supported Operating Systems
- [x] Ubuntu
- [x] CentOS 7
- [x] Flatcar
- [x] RHEL
- [x] Rocky Linux
### Supported Kubernetes Versions (use latest patch releases available)
- [x] v1.27
- [x] v1.28
- [x] v1.29
### Upgrades
- [ ] Upgrade from v1.27.x -> v1.28.x
- [ ] Upgrade from v1.28.x -> v1.29.x

csengerszabo commented 8 months ago

/assign @ronissac88

ronissac88 commented 8 months ago

=============================================================
Test results for: azure-ubuntu-1.28.7
----------------------------
Passed: 19
Failed: 0
=============================================================
=============================================================
Test results for: azure-flatcar-1.29.2
----------------------------
Passed: 19
Failed: 0
=============================================================
=============================================================
Test results for: azure-ubuntu-1.29.2
----------------------------
Passed: 19
Failed: 0
=============================================================
=============================================================
Test results for: azure-flatcar-1.27.11
----------------------------
Passed: 19
Failed: 0
=============================================================
=============================================================
Test results for: azure-centos-1.29.2
----------------------------
Passed: 19
Failed: 0
=============================================================
=============================================================
Test results for: azure-rockylinux-1.29.2
----------------------------
Passed: 19
Failed: 0
=============================================================
=============================================================
Test results for: azure-rhel-1.27.11
----------------------------
Passed: 19
Failed: 0
=============================================================
=============================================================
Test results for: azure-rhel-1.28.7
----------------------------
Passed: 19
Failed: 0
=============================================================
=============================================================
Test results for: azure-rhel-1.29.2
----------------------------
Passed: 19
Failed: 0
=============================================================
=============================================================
Test results for: azure-rockylinux-1.28.7
----------------------------
Passed: 19
Failed: 0
=============================================================
=============================================================
Test results for: azure-centos-1.28.7
----------------------------
Passed: 19
Failed: 0
=============================================================
=============================================================
Test results for: azure-centos-1.27.11
----------------------------
Passed: 19
Failed: 0
=============================================================
=============================================================
Test results for: azure-ubuntu-1.27.11
----------------------------
Passed: 19
Failed: 0
=============================================================
=============================================================
Test results for: azure-rockylinux-1.27.11
----------------------------
Passed: 19
Failed: 0
=============================================================
=============================================================
Test results for: azure-flatcar-1.28.7
----------------------------
Passed: 19
Failed: 0
=============================================================

========================== RESULT ===========================
Parameters:
  KKP Version............: v2.25.0-beta.1 (EE)
  Name Prefix............: "ronny"
  OSM Enabled............: true
  Dualstack Enabled......: false
  Konnectivity Enabled...: true
  Cluster Updates Enabled: false
  Enabled Tests..........: [gcr-images loadbalancer metrics securitycontext storage usercluster-gcr-images usercluster-rbac usercluster-seccomp]
  Scenario Options.......: []

Test results:
[ OK ] - azure-centos-1.27.11 (26m36s)
[ OK ] - azure-centos-1.28.7 (27m14s)
[ OK ] - azure-centos-1.29.2 (28m14s)
[ OK ] - azure-flatcar-1.27.11 (29m19s)
[ OK ] - azure-flatcar-1.28.7 (21m16s)
[ OK ] - azure-flatcar-1.29.2 (28m18s)
[ OK ] - azure-rhel-1.27.11 (31m31s)
[ OK ] - azure-rhel-1.28.7 (31m10s)
[ OK ] - azure-rhel-1.29.2 (28m29s)
[ OK ] - azure-rockylinux-1.27.11 (30m24s)
[ OK ] - azure-rockylinux-1.28.7 (29m52s)
[ OK ] - azure-rockylinux-1.29.2 (29m22s)
[ OK ] - azure-ubuntu-1.27.11 (27m17s)
[ OK ] - azure-ubuntu-1.28.7 (25m47s)
[ OK ] - azure-ubuntu-1.29.2 (24m50s)
2024-03-01T17:52:48.742+0530    info    conformance-tester/main.go:190  Test suite has completed successfully   {"runtime": "3h34m37.601276417s"}
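
The per-scenario durations in the summary can be added up and compared with the overall `runtime` value from the final log line. The following is a minimal sketch of that arithmetic, not part of the conformance tester: `parse_duration` and `total_scenario_seconds` are hypothetical helper names, and the parser only handles the `h`/`m`/`s` units that appear in this log.

```python
import re

# Only the h/m/s units seen in the log above are supported (assumption);
# other Go duration units such as "ms" or "us" are rejected.
_DURATION_RE = re.compile(r"(?:(\d+)h)?(?:(\d+)m)?(?:(\d+(?:\.\d+)?)s)?")

# Matches passing summary lines such as "[ OK ] - azure-centos-1.27.11 (26m36s)".
_OK_LINE_RE = re.compile(r"\[ OK \] - (\S+) \(([^)]+)\)")


def parse_duration(text):
    """Convert a duration like '1h8m41s' or '3h34m37.6s' to seconds."""
    match = _DURATION_RE.fullmatch(text)
    if match is None or not text:
        raise ValueError(f"unsupported duration: {text!r}")
    hours, minutes, seconds = match.groups()
    return int(hours or 0) * 3600 + int(minutes or 0) * 60 + float(seconds or 0)


def total_scenario_seconds(lines):
    """Sum the durations of all '[ OK ]' summary lines."""
    total = 0.0
    for line in lines:
        match = _OK_LINE_RE.search(line)
        if match:
            total += parse_duration(match.group(2))
    return total


# Three summary lines and the runtime value copied from the run above.
SAMPLE_LINES = [
    "[ OK ] - azure-centos-1.27.11 (26m36s)",
    "[ OK ] - azure-centos-1.28.7 (27m14s)",
    "[ OK ] - azure-centos-1.29.2 (28m14s)",
]
RUNTIME = "3h34m37.601276417s"

if __name__ == "__main__":
    print("summed scenario time (s):", total_scenario_seconds(SAMPLE_LINES))
    print("suite runtime (s):", parse_duration(RUNTIME))
```

Feeding all fifteen summary lines into `total_scenario_seconds` gives the total scenario time for the run, which can then be set against the wall-clock runtime.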

xrstf commented 8 months ago

@ronissac88 please run the script again, this time with `UPDATE=true RELEASES=1.27,1.28 …`, so that it tests cluster upgrades.

ronissac88 commented 8 months ago

=============================================================
Test results for: azure-centos-1.27.11
[FAIL] - [KKP] Wait for control plane
         Get "https://hzqfp9kg6v.captain.captain.k8c.io:31386/apis/kubermatic.k8c.io/v1/clusters/ronny-xkf4h": http2: client connection lost
----------------------------
Passed: 19
Failed: 1
=============================================================
=============================================================
Test results for: azure-centos-1.28.7
[FAIL] - [KKP] Wait for Pods inside usercluster to be ready
         context deadline exceeded; last error was: not all Pods are ready: [data-writer-0]
----------------------------
Passed: 20
Failed: 1
=============================================================
=============================================================
Test results for: azure-rockylinux-1.28.7
----------------------------
Passed: 20
Failed: 0
=============================================================
=============================================================
Test results for: azure-rhel-1.28.7
----------------------------
Passed: 20
Failed: 0
=============================================================
=============================================================
Test results for: azure-rhel-1.27.11
----------------------------
Passed: 20
Failed: 0
=============================================================
=============================================================
Test results for: azure-flatcar-1.27.11
----------------------------
Passed: 20
Failed: 0
=============================================================
=============================================================
Test results for: azure-flatcar-1.28.7
----------------------------
Passed: 20
Failed: 0
=============================================================
=============================================================
Test results for: azure-rockylinux-1.27.11
----------------------------
Passed: 20
Failed: 0
=============================================================
=============================================================
Test results for: azure-ubuntu-1.27.11
----------------------------
Passed: 20
Failed: 0
=============================================================
=============================================================
Test results for: azure-ubuntu-1.28.7
----------------------------
Passed: 20
Failed: 0
=============================================================

========================== RESULT ===========================
Parameters:
  KKP Version............: v2.25.0-beta.2 (EE)
  Name Prefix............: "ronny"
  OSM Enabled............: true
  Dualstack Enabled......: false
  Konnectivity Enabled...: true
  Cluster Updates Enabled: true
  Enabled Tests..........: [gcr-images loadbalancer metrics securitycontext storage usercluster-gcr-images usercluster-rbac usercluster-seccomp]
  Scenario Options.......: []

Test results:
[FAIL] - azure-centos-1.27.11 (6m46s): failed to test cluster: failed waiting for control plane to become ready: Get "https://hzqfp9kg6v.captain.captain.k8c.io:31386/apis/kubermatic.k8c.io/v1/clusters/ronny-xkf4h": http2: client connection lost
[FAIL] - azure-centos-1.28.7 (42m52s): failed to test cluster: failed to wait for all pods to get ready: context deadline exceeded; last error was: not all Pods are ready: [data-writer-0]
[FAIL] - azure-flatcar-1.27.11 (1h8m41s): failed to test cluster: failed to wait for all nodes to be rotated: context deadline exceeded; last error was: failed to list nodes: client rate limiter Wait returned an error: context deadline exceeded
[FAIL] - azure-flatcar-1.28.7 (1h9m47s): failed to test cluster: failed to wait for all nodes to be rotated: context deadline exceeded; last error was: failed to list nodes: client rate limiter Wait returned an error: context deadline exceeded
[FAIL] - azure-rhel-1.27.11 (56m20s): failed to test cluster: failed to wait for all nodes to be rotated: context deadline exceeded; last error was: failed to list nodes: client rate limiter Wait returned an error: context deadline exceeded
[FAIL] - azure-rhel-1.28.7 (45m36s): failed to test cluster: failed to wait for all nodes to be rotated: context deadline exceeded; last error was: not all nodes rotated: [] are unready, [e2e-bd7mp-8488fdfdd4-rz9nr e2e-bd7mp-8488fdfdd4-scl6p e2e-bd7mp-8488fdfdd4-xp6rn] are outdated
[FAIL] - azure-rockylinux-1.27.11 (1h17m37s): failed to test cluster: failed to wait for all nodes to be rotated: context deadline exceeded; last error was: not all nodes rotated: [] are unready, [e2e-zpncj-64648c7548-n6lpp e2e-zpncj-64648c7548-s52c6 e2e-zpncj-64648c7548-wmdnk] are outdated
[FAIL] - azure-rockylinux-1.28.7 (48m25s): failed to test cluster: failed to wait for all nodes to be rotated: context deadline exceeded; last error was: not all nodes rotated: [e2e-d2wzs-7966c8d8fd-9w6ps] are unready, [] are outdated
[FAIL] - azure-ubuntu-1.27.11 (1h7m57s): failed to test cluster: failed to wait for all nodes to be rotated: context deadline exceeded; last error was: failed to list nodes: client rate limiter Wait returned an error: context deadline exceeded
[FAIL] - azure-ubuntu-1.28.7 (1h7m39s): failed to test cluster: failed to wait for all nodes to be rotated: context deadline exceeded; last error was: failed to list nodes: client rate limiter Wait returned an error: context deadline exceeded
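
To see which step the failing scenarios have in common, the `[FAIL]` summary lines can be grouped by the stage named right after `failed to test cluster:`. The sketch below is illustrative only: `group_failures` is a hypothetical helper, and the line format it expects is inferred from the output above rather than taken from the conformance tester's source.

```python
import re
from collections import defaultdict

# Format inferred from the summary above (assumption):
#   [FAIL] - <scenario> (<duration>): <message>
#   [ OK ] - <scenario> (<duration>)
LINE_RE = re.compile(
    r"^\[(?P<status>FAIL| OK )\] - (?P<scenario>\S+) "
    r"\((?P<duration>[^)]+)\)(?:: (?P<message>.*))?$"
)


def group_failures(lines):
    """Map each failing stage to the list of scenarios that failed in it."""
    groups = defaultdict(list)
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None or match.group("status") != "FAIL":
            continue  # skip passing scenarios and unrelated log lines
        # The message is a chain of wrapped errors joined by ": ";
        # the second element names the stage that failed.
        parts = (match.group("message") or "").split(": ")
        stage = parts[1] if len(parts) > 1 else parts[0]
        groups[stage].append(match.group("scenario"))
    return dict(groups)


# Four failing lines copied from the summary above, plus one passing line
# from the earlier run to show that it is ignored.
SAMPLE_LINES = [
    '[FAIL] - azure-centos-1.27.11 (6m46s): failed to test cluster: failed waiting for control plane to become ready: Get "https://hzqfp9kg6v.captain.captain.k8c.io:31386/apis/kubermatic.k8c.io/v1/clusters/ronny-xkf4h": http2: client connection lost',
    '[FAIL] - azure-centos-1.28.7 (42m52s): failed to test cluster: failed to wait for all pods to get ready: context deadline exceeded; last error was: not all Pods are ready: [data-writer-0]',
    '[FAIL] - azure-flatcar-1.27.11 (1h8m41s): failed to test cluster: failed to wait for all nodes to be rotated: context deadline exceeded; last error was: failed to list nodes: client rate limiter Wait returned an error: context deadline exceeded',
    '[FAIL] - azure-rhel-1.28.7 (45m36s): failed to test cluster: failed to wait for all nodes to be rotated: context deadline exceeded; last error was: not all nodes rotated: [] are unready, [e2e-bd7mp-8488fdfdd4-rz9nr e2e-bd7mp-8488fdfdd4-scl6p e2e-bd7mp-8488fdfdd4-xp6rn] are outdated',
    '[ OK ] - azure-ubuntu-1.29.2 (24m50s)',
]

if __name__ == "__main__":
    for stage, scenarios in group_failures(SAMPLE_LINES).items():
        print(f"{len(scenarios)}  {stage}: {', '.join(scenarios)}")
```

Grouping by stage rather than by the innermost error keeps the rate-limiter timeouts and the "not all nodes rotated" errors in the same bucket, since both occur while waiting for node rotation.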

ronissac88 commented 8 months ago

I ran the Azure conformance test with UPDATE=true multiple times and hit the same failures each time. Without UPDATE=true, all tests pass.

========================== RESULT ===========================
Parameters:
  KKP Version............: v2.25.0-beta.3 (EE)
  Name Prefix............: "ronny"
  OSM Enabled............: true
  Dualstack Enabled......: false
  Konnectivity Enabled...: true
  Cluster Updates Enabled: false
  Enabled Tests..........: [gcr-images loadbalancer metrics securitycontext storage usercluster-gcr-images usercluster-rbac usercluster-seccomp]
  Scenario Options.......: []

Test results:
[ OK ] - azure-centos-1.27.11 (26m21s)
[ OK ] - azure-centos-1.28.7 (25m30s)
[ OK ] - azure-centos-1.29.2 (27m15s)
[ OK ] - azure-flatcar-1.27.11 (28m7s)
[ OK ] - azure-flatcar-1.28.7 (28m33s)
[ OK ] - azure-flatcar-1.29.2 (29m11s)
[ OK ] - azure-rhel-1.27.11 (30m59s)
[ OK ] - azure-rhel-1.28.7 (28m47s)
[ OK ] - azure-rhel-1.29.2 (29m49s)
[ OK ] - azure-rockylinux-1.27.11 (30m56s)
[ OK ] - azure-rockylinux-1.28.7 (29m33s)
[ OK ] - azure-rockylinux-1.29.2 (27m27s)
[ OK ] - azure-ubuntu-1.27.11 (25m2s)
[ OK ] - azure-ubuntu-1.28.7 (26m13s)
[ OK ] - azure-ubuntu-1.29.2 (25m1s)
2024-03-08T22:23:59.251+0530    info    conformance-tester/main.go:190  Test suite has completed successfully   {"runtime": "3h35m28.358511666s"}

embik commented 7 months ago

The upgrade is most likely failing because of #12419. While this has to be fixed, it is not a new issue in 2.25, so I will go ahead and close this issue. We still need to follow up on it, though. cc @csengerszabo