kubernetes / kubernetes

Production-Grade Container Scheduling and Management
https://kubernetes.io
Apache License 2.0

apf: test flake without detailed logs #97511

Closed BenTheElder closed 2 years ago

BenTheElder commented 3 years ago

What happened:

https://prow.k8s.io/view/gs/kubernetes-jenkins/pr-logs/pull/97447/pull-kubernetes-e2e-kind/1342023270328700928

Kubernetes e2e suite: [sig-api-machinery] API priority and fairness should ensure that requests can be classified by testing flow-schemas/priority-levels (2s)

test/e2e/apimachinery/flowcontrol.go:48
Dec 24 08:41:58.409: matching user doesnt received UID for the testing priority-level and flow-schema
vendor/github.com/onsi/ginkgo/internal/leafnodes/runner.go:113

What you expected to happen:

This test should not flake and the logs should show me what did happen.

https://github.com/kubernetes/kubernetes/blob/88a05df5ff1b311e8e92f64b4f3a2c7d4329d14e/test/e2e/apimachinery/flowcontrol.go#L63
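For context, a minimal sketch of the kind of failure output that would make this debuggable, assuming the test decides pass/fail by inspecting the APF response headers on a request made as the matching user (the helper and its wiring below are illustrative, not the actual test code; only the header names are the real ones the apiserver sets when APIPriorityAndFairness is enabled):

```go
package apimachinery

import (
	"fmt"
	"net/http"

	"k8s.io/kubernetes/test/e2e/framework"
)

// expectClassified is a hypothetical helper: it logs the raw APF response
// headers before deciding pass/fail, so a flake leaves enough context behind
// to tell whether the headers were missing, empty, or carried different UIDs.
func expectClassified(resp *http.Response, wantPLUID, wantFSUID string) error {
	gotPL := resp.Header.Get("X-Kubernetes-PF-PriorityLevel-UID")
	gotFS := resp.Header.Get("X-Kubernetes-PF-FlowSchema-UID")
	framework.Logf("APF headers: priority-level UID %q (want %q), flow-schema UID %q (want %q)",
		gotPL, wantPLUID, gotFS, wantFSUID)
	if gotPL != wantPLUID || gotFS != wantFSUID {
		return fmt.Errorf("request not classified by the expected priority-level/flow-schema (got PL UID %q, FS UID %q)", gotPL, gotFS)
	}
	return nil
}
```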

How to reproduce it (as minimally and precisely as possible):

Send a PR to Kubernetes. Run tests repeatedly. Enjoy flakes.

Anything else we need to know?:

Environment:

aojea commented 3 years ago

This looks like a duplicate of https://github.com/kubernetes/kubernetes/issues/96803

bl-ue commented 3 years ago

/sig api-machinery (based on the SIG of the issue referenced above)

fedebongio commented 3 years ago

/assign @lavalamp
/triage accepted

BenTheElder commented 3 years ago

https://github.com/kubernetes/kubernetes/issues/96803 had some different failures I think, commented there as well

lavalamp commented 3 years ago

My guess is we somehow have a race with the apiserver loading the new PL / FS.
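If that's the race, the check would need to tolerate the propagation delay instead of asserting on the first response. A minimal sketch of that idea, assuming a caller-supplied request helper and the real APF response header names (everything else here is illustrative, not the actual test code):

```go
package apimachinery

import (
	"net/http"
	"time"

	"k8s.io/apimachinery/pkg/util/wait"
	"k8s.io/kubernetes/test/e2e/framework"
)

// waitForClassification polls until the apiserver classifies a request by the
// FlowSchema and PriorityLevelConfiguration the test created, rather than
// asserting on the first response. makeRequest is a hypothetical helper that
// issues a request as the test user and returns the raw response.
func waitForClassification(makeRequest func() (*http.Response, error), wantFSUID, wantPLUID string) {
	err := wait.PollImmediate(time.Second, 30*time.Second, func() (bool, error) {
		resp, err := makeRequest()
		if err != nil {
			return false, err
		}
		gotFS := resp.Header.Get("X-Kubernetes-PF-FlowSchema-UID")
		gotPL := resp.Header.Get("X-Kubernetes-PF-PriorityLevel-UID")
		// Keep retrying: the APF config controller may not have observed the
		// newly created objects yet.
		return gotFS == wantFSUID && gotPL == wantPLUID, nil
	})
	framework.ExpectNoError(err, "request was never classified by the expected flow-schema/priority-level")
}
```

Polling like this would also leave the last observed header values in the logs before any timeout, which speaks to the "without detailed logs" part of this issue.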

BenTheElder commented 3 years ago

spotted again in 1.20

https://prow.k8s.io/view/gs/kubernetes-jenkins/pr-logs/pull/kubernetes-sigs_kind/2089/pull-kind-e2e-kubernetes-1-20/1364730432314150912

seems to be sporadic in HEAD and 1.20 at least: https://storage.googleapis.com/k8s-gubernator/triage/index.html?pr=1&text=matching%20user%20doesnt%20received

fejta-bot commented 3 years ago

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

BenTheElder commented 3 years ago

/remove-lifecycle stale

These are still occurring. We have multiple instances today like:

/home/prow/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/test/e2e/apimachinery/flowcontrol.go:48
May 21 13:27:41.970: matching user doesnt received UID for the testing priority-level and flow-schema
/home/prow/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/github.com/onsi/ginkgo/internal/leafnodes/runner.go:113

https://storage.googleapis.com/k8s-gubernator/triage/index.html?pr=1&text=matching%20user%20doesnt%20received

lavalamp commented 3 years ago

The failures appear to be only from 1.20. The code for the test is definitely very different now. Probably https://github.com/kubernetes/kubernetes/pull/96984 fixed the problem. I can't think of a reason not to backport it.

BenTheElder commented 3 years ago

Good catch re: 1.20

Filed https://github.com/kubernetes/kubernetes/pull/102709; there were conflicts and I'm not certain yet that it's correct.

k8s-triage-robot commented 3 years ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:

- Mark this issue or PR as fresh with `/remove-lifecycle stale`
- Close this issue or PR with `/close`
- Offer to help out with [Issue Triage](https://www.kubernetes.dev/docs/guide/issue-triage/)

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot commented 2 years ago

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:

- Mark this issue or PR as fresh with `/remove-lifecycle rotten`
- Close this issue or PR with `/close`
- Offer to help out with [Issue Triage](https://www.kubernetes.dev/docs/guide/issue-triage/)

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot commented 2 years ago

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:

- Reopen this issue or PR with `/reopen`
- Mark this issue or PR as fresh with `/remove-lifecycle rotten`
- Offer to help out with [Issue Triage](https://www.kubernetes.dev/docs/guide/issue-triage/)

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

k8s-ci-robot commented 2 years ago

@k8s-triage-robot: Closing this issue.

In response to [this](https://github.com/kubernetes/kubernetes/issues/97511#issuecomment-962271321):

> The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
>
> This bot triages issues and PRs according to the following rules:
> - After 90d of inactivity, `lifecycle/stale` is applied
> - After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
> - After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed
>
> You can:
> - Reopen this issue or PR with `/reopen`
> - Mark this issue or PR as fresh with `/remove-lifecycle rotten`
> - Offer to help out with [Issue Triage][1]
>
> Please send feedback to sig-contributor-experience at [kubernetes/community](https://github.com/kubernetes/community).
>
> /close
>
> [1]: https://www.kubernetes.dev/docs/guide/issue-triage/

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.