Closed. mmiranda96 closed this issue 2 months ago.
/kind failing-test /remove-kind flake
It keeps failing. The last success, and the only success I can see now, is from 11-07.
I found that this CI job is on https://testgrid.k8s.io/sig-node-release-blocking#node-kubelet-serial-containerd, which is a sig-node release-blocking board. If it is release blocking, we should fix it ASAP. If not, we may move this job to another board such as https://testgrid.k8s.io/sig-node-containerd. /cc @SergeyKanzhelev @mrunalp
Linking the Slack thread here: https://kubernetes.slack.com/archives/C0BP8PW9G/p1700553934108539
/cc
Device manager tests are failing because of a socket reconnection error. This is not a regression.
E2eNode Suite.[It] [sig-node] POD Resources [Serial] [Feature:PodResources][NodeFeature:PodResources] with the builtin rate limit values should hit throttling when calling podresources List in a tight loop
Another known issue, not a regression. The test flakes by its nature, due to how it validates the throttling logic.
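For context on why this kind of test is timing-sensitive: the podresources List endpoint is throttled by a built-in rate limit, and the test drives it in a tight loop and expects some calls to be rejected. Below is a minimal token-bucket sketch, with assumed names and made-up QPS/burst numbers, not the kubelet's actual limiter or its real values:

```go
package main

import (
	"fmt"
	"time"
)

// bucket is a minimal token bucket: up to `burst` tokens, refilled at
// `qps` tokens per second. A sketch only, not the real kubelet code.
type bucket struct {
	tokens   float64
	burst    float64
	qps      float64
	lastFill time.Time
}

// allow consumes one token if available; otherwise the call is throttled.
func (b *bucket) allow(now time.Time) bool {
	b.tokens += now.Sub(b.lastFill).Seconds() * b.qps
	if b.tokens > b.burst {
		b.tokens = b.burst
	}
	b.lastFill = now
	if b.tokens < 1 {
		return false
	}
	b.tokens--
	return true
}

// runTightLoop issues n calls with no delay between them and returns
// how many were throttled.
func runTightLoop(n int) int {
	now := time.Now()
	b := &bucket{tokens: 10, burst: 10, qps: 5, lastFill: now}
	throttled := 0
	for i := 0; i < n; i++ {
		if !b.allow(now) { // same timestamp: no refill between calls
			throttled++
		}
	}
	return throttled
}

func main() {
	// Only the initial burst of 10 succeeds; the other 40 are throttled.
	fmt.Println("throttled:", runTightLoop(50))
}
```

In a real test run, wall-clock time advances between iterations, so a few extra calls get refilled tokens and are admitted; exactly how many depends on scheduling, which is the kind of timing sensitivity that produces flakes.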
E2eNode Suite.[It] [sig-node] Density [Serial] [Slow] create a batch of pods latency/resource should be within limit when create 10 pods with 0s interval
Also not a regression; we need to take a look after the release.
this PR wants to reduce/remove flakes: https://github.com/kubernetes/kubernetes/pull/122024
The latest test run failed only the density tests, and only on 2 of the 3 nodes:
What's interesting is that the node that succeeded is configured similarly to the failed ones, but its runtime metrics are much better:
The only difference I can see is that one configuration requests 2 nvidia-tesla-k80 accelerators. I'm not sure whether that is related to the density test failures, though.
The Kubernetes project currently lacks enough contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After a period of inactivity, `lifecycle/stale` is applied
- After a further period of inactivity once `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After a further period of inactivity once `lifecycle/rotten` was applied, the issue is closed

You can:
- Mark this issue as fresh with `/remove-lifecycle stale`
- Close this issue with `/close`
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
/remove-lifecycle stale
This is still a good umbrella issue for https://testgrid.k8s.io/sig-node-release-blocking#node-kubelet-serial-containerd
/retitle [Flaking Test] [sig-node] ☂️ node-kubelet-serial-containerd job multiple flakes🌂
/triage accepted
sig-node CI meeting:
All child bugs for flaky tests are closed.
/close
@AnishShah: Closing this issue.
Which jobs are flaking?
node-kubelet-serial-containerd
Which tests are flaking?
There are multiple tests:
Since when has it been flaking?
Flakes have been present for a while.
Testgrid link
https://testgrid.k8s.io/sig-node-release-blocking#node-kubelet-serial-containerd
Reason for failure (if possible)
No response
Anything else we need to know?
We run each test 3 times, and in most cases only one of the runs fails. This might not be a critical issue, but ideally we want a green testgrid.
Relevant SIG(s)
/sig node