sowmyav27 opened 1 year ago
The AKS certification fails occasionally with this error: AssertionError: Timed out waiting for pods in workload default-71758. Expected 1. Got 3
It's flaky and should be looked at more closely.
The AZ cert tests occasionally fail with this error:
```
p_client = <rancher.Client object at 0x7fef6976d0d0>
workload = {'actions': {'redeploy': 'https://<rancher-server>/v3/project/c-pp5zw:p-km2vv/workloads/daemonset:test-1...ls': {'cattle.io/creator': 'norman', 'workload.user.cattle.io/workloadselector': 'daemonSet-test-13740-default-50938'}}
timeout = 600

    def get_endpoint_url_for_workload(p_client, workload, timeout=600):
        fqdn_available = False
        url = ""
        start = time.time()
        while not fqdn_available:
            if time.time() - start > timeout:
                raise AssertionError(
>                   "Timed out waiting for endpoint to be available")
E               AssertionError: Timed out waiting for endpoint to be available

tests/v3_api/common.py:800: AssertionError
```
We may need to increase the timeout, but I am not sure. Just noting it here so we can keep tracking our flaky test actions.
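If we do go the timeout route, one low-risk option is to make the wait configurable rather than hard-coding 600s. A minimal sketch, assuming a generic polling helper (the `RANCHER_ENDPOINT_TIMEOUT` variable name and the `wait_for` helper are my inventions for illustration, not names from the repo):

```python
import os
import time

# Sketch only: read the wait timeout from the environment so flaky
# environments can raise it without a code change. The env var name
# is an assumption, not something the test repo defines today.
DEFAULT_TIMEOUT = int(os.environ.get("RANCHER_ENDPOINT_TIMEOUT", "600"))


def wait_for(condition, timeout=DEFAULT_TIMEOUT, interval=5,
             message="Timed out waiting for condition"):
    """Poll `condition` until it returns a truthy value or `timeout` expires."""
    start = time.time()
    while time.time() - start < timeout:
        result = condition()
        if result:
            return result
        time.sleep(interval)
    raise AssertionError(message)
```

The endpoint check in `get_endpoint_url_for_workload` could then call `wait_for(...)` instead of carrying its own loop, and one env var would tune every wait in the suite.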
The on-tag import k3s certification is flaky in the ingress tests; it usually needs to be run twice to pass.
I don't see this written down anywhere, but the idea was to have this job run the v2 EKS tests rather than v1, so that is another update that is needed.
EC2 tests are failing because they attempt to provision with an in-tree cloud provider on a 1.27 cluster.
Latest failures on Rancher 2.7.x:
AZ - fails once, but passes when re-run. (There should be no failures; review why this one occurs.)
Custom RKE1 cluster - tests.v3_api.test_rke_cluster_provisioning.test_rke_custom_host_2

```
rancher.ApiError: (ApiError(...), 'ServerError : Get "https://<>:6443/api/v1/namespaces/kube-public/replicationcontrollers?timeout=45s": tunnel disconnect\n\t{\'baseType\': \'error\', \'code\': \'ServerError\', \'message\': \'Get "https://<>:6443/api/v1/namespaces/kube-public/replicationcontrollers?timeout=45s": tunnel disconnect\', \'status\': 500, \'type\': \'error\'}')
```
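A "tunnel disconnect" from the Rancher agent tunnel is usually momentary, so rather than failing the whole provisioning test on one blip, the client call could be wrapped in a bounded retry. A sketch of what that could look like (the `retry_on_transient` helper is hypothetical, and the `ApiError` class below is a stand-in for `rancher.ApiError` just to keep the example self-contained):

```python
import time


class ApiError(Exception):
    """Stand-in for rancher.ApiError; the real suite would catch the
    client library's exception type instead."""


def retry_on_transient(fn, attempts=3, delay=2,
                       transient_markers=("tunnel disconnect",)):
    """Retry fn() when it fails with an error that looks transient.

    Non-transient errors are re-raised immediately; transient ones are
    retried up to `attempts` times with `delay` seconds between tries.
    """
    last_error = None
    for _ in range(attempts):
        try:
            return fn()
        except ApiError as err:
            if not any(marker in str(err) for marker in transient_markers):
                raise  # not transient, surface immediately
            last_error = err
            time.sleep(delay)
    raise last_error
```

This keeps a single dropped tunnel from counting as a product failure while still surfacing a persistent disconnect after the retry budget is spent.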
rancher_ontag_go_certification -->
Test Result (6 failures / +6)
```
github.com/rancher/rancher/tests/v2/validation/provisioning/k3s.TestK3SProvisioningTestSuite/TestProvisioningK3SCluster/1_Node_all_roles_Admin_User_Node_Provider:_azure_Kubernetes_version:_v1.26.11+k3s2_cni:_calico
github.com/rancher/rancher/tests/v2/validation/provisioning/k3s.TestK3SProvisioningTestSuite/TestProvisioningK3SCluster
github.com/rancher/rancher/tests/v2/validation/provisioning/k3s.TestK3SProvisioningTestSuite
github.com/rancher/rancher/tests/v2/validation/provisioning/rke2.TestRKE2ProvisioningTestSuite/TestProvisioningRKE2Cluster/1_Node_all_roles_Admin_User_Node_Provider:_azure_Kubernetes_version:_v1.26.11+rke2r1_cni:_calico
github.com/rancher/rancher/tests/v2/validation/provisioning/rke2.TestRKE2ProvisioningTestSuite/TestProvisioningRKE2Cluster
github.com/rancher/rancher/tests/v2/validation/provisioning/rke2.TestRKE2ProvisioningTestSuite
```
EKS --> Cleaned up VPCs and re-ran the job
```
Exception: Timeout waiting for cluster to satisfy condition: lambda x: x.state == "active",
E       State is: provisioning
```
Note: the EKS on-tag job is using the v1 version. This job needs to be skipped.
EKS clusters should have ingress tests skipped.
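For skipping the ingress tests on EKS, a conditional skip keeps the tests in the suite while recording them as skipped rather than failed. A minimal sketch using the standard library's `unittest.skipIf` (the `CLUSTER_PROVIDER` constant and the test name are assumptions for illustration; in the real suite the provider would come from the test config, and a `pytest.mark.skipif` would work the same way):

```python
import unittest

# Assumption: in the real suite this would be read from the cluster
# config rather than hard-coded.
CLUSTER_PROVIDER = "eks"


class IngressTests(unittest.TestCase):
    @unittest.skipIf(CLUSTER_PROVIDER == "eks",
                     "ingress tests are skipped on EKS clusters")
    def test_wl_ingress(self):
        # Placeholder body; the real test would exercise an ingress here.
        pass
```

Run under either runner, the test shows up as skipped with the reason string, so the certification report still documents why it did not execute.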
Import RKE1 cluster -

```
Exception: Timeout waiting for cluster to satisfy condition: lambda x: x.state == "active",
E       State is: pending
```
Import k3s - @anupama2501 to work on fixing this
```
tests.v3_api.test_workload.test_wl_with_nodePort
tests.v3_api.test_workload.test_wl_with_nodePort_scale_and_upgrade
```
The following jobs passed successfully.
The following job fails with:
```
Error syncing load balancer: failed to ensure load balancer: error authorizing security group ingress: "RulesPerSecurityGroupLimitExceeded: The maximum number of rules per security group has been reached.\n\tstatus code: 400, request id: b4013396-a8ca-4df3-9118-0ae1c2b43422"
```
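The `RulesPerSecurityGroupLimitExceeded` failure suggests leaked rules accumulating across runs, so a cleanup job could flag security groups before they hit the quota. A sketch that counts ingress rules from a security-group description (the dict shape mirrors what boto3's `describe_security_groups` returns; the helper names are my own, and 60 is the AWS default quota per group, which an account may have raised):

```python
# AWS default quota for rules per security group; an assumption that
# the account has not requested a higher limit.
DEFAULT_RULES_PER_SG = 60


def count_ingress_rules(security_group):
    """Count ingress rules: each IpPermission contributes one rule per
    CIDR, IPv6 range, referenced security group, or prefix list."""
    total = 0
    for perm in security_group.get("IpPermissions", []):
        total += len(perm.get("IpRanges", []))
        total += len(perm.get("Ipv6Ranges", []))
        total += len(perm.get("UserIdGroupPairs", []))
        total += len(perm.get("PrefixListIds", []))
    return total


def near_rule_limit(security_group, headroom=5, limit=DEFAULT_RULES_PER_SG):
    """True when the group is within `headroom` rules of the quota."""
    return count_ingress_rules(security_group) > limit - headroom
```

Running this check against the shared security groups before each job (or as part of the VPC cleanup that already exists for EKS) would turn the hard 400 error into an early warning.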