Open pacoxu opened 8 months ago
I looked into this today for sig-windows. It appears the test fails when it cannot reach the API server. I can only find one instance of it failing in the sig-windows main job; the hyper-v jobs have known networking issues, which explains their failures.
It may be a timing issue, since we are hitting this block: https://github.com/kubernetes/kubernetes/blob/634fc1b4836b3a500e0d715d71633ff67690526a/test/e2e/apimachinery/crd_conversion_webhook.go#L499-L502
Looking at the other, non-Windows failures, this mostly seems to occur in runs with many test failures where the API server is not reachable. As an example:
https://prow.k8s.io/view/gs/kubernetes-jenkins/logs/ci-kubernetes-local-e2e/1764995137261277184
The call to Create got stuck and made only a single attempt, instead of many, over the 30-second timeout. See the logs:
I0310 22:08:01.317904 2463 util.go:506] >>> kubeConfig: /home/prow/go/src/sigs.k8s.io/windows-testing/capz/capz-conf-y1unem.kubeconfig
I0310 22:08:31.630253 2463 crd_conversion_webhook.go:501] error waiting for conversion to succeed during setup: Post "https://capz-conf-y1unem-4d42d29d.uksouth.cloudapp.azure.com:6443/apis/stable.example.com/v2/namespaces/crd-webhook-5821/e2e-test-crd-webhook-7762-crds": context deadline exceeded
I0310 22:08:31.630417 2463 crd_conversion_webhook.go:486] Unexpected error:
<context.deadlineExceededError>:
context deadline exceeded
{}
https://github.com/kubernetes/kubernetes/blob/634fc1b4836b3a500e0d715d71633ff67690526a/test/e2e/apimachinery/crd_conversion_webhook.go#L497 and the whole block timed out
/assign @jsturtevant
Could you continue working on this issue? Thank you.
/triage accepted
Which jobs are flaking?
https://storage.googleapis.com/k8s-triage/index.html?test=should%20be%20able%20to%20convert%20a%20non%20homogeneous%20list%20of%20CRs&xjob=calico
Which tests are flaking?
should be able to convert a non homogeneous list of CRs
Since when has it been flaking?
The k8s-triage dashboard on storage.googleapis.com shows it has been flaking for a long time.
Testgrid link
https://testgrid.k8s.io/sig-release-master-informing#capz-windows-master
Reason for failure (if possible)
https://prow.k8s.io/view/gs/kubernetes-jenkins/logs/ci-kubernetes-e2e-capz-master-windows/1766944844321656832
Anything else we need to know?
An issue that may be related: https://github.com/kubernetes/kubernetes/issues/93705
Relevant SIG(s)
/sig api-machinery
/sig windows
I saw this on a Windows CI board, but it may not be Windows-related. Adding the SIG for triage.