kubernetes-sigs / kueue

Kubernetes-native Job Queueing
https://kueue.sigs.k8s.io
Apache License 2.0

[Flaky E2E] Deployment should admit workloads after change queue-name if AvailableReplicas = 0 #3626

Closed. mbobrovskyi closed this 4 hours ago

mbobrovskyi commented 10 hours ago

What happened: End To End Suite: kindest/node:v1.28.9: [It] Deployment should admit workloads after change queue-name if AvailableReplicas = 0

{Expected success, but got an error:
    <*errors.StatusError | 0xc000996960>: 
    workloads.kueue.x-k8s.io "pod-deployment-6d47d84db6-89fv2-c7e95" not found
    {
        ErrStatus: {
            TypeMeta: {Kind: "", APIVersion: ""},
            ListMeta: {
                SelfLink: "",
                ResourceVersion: "",
                Continue: "",
                RemainingItemCount: nil,
            },
            Status: "Failure",
            Message: "workloads.kueue.x-k8s.io \"pod-deployment-6d47d84db6-89fv2-c7e95\" not found",
            Reason: "NotFound",
            Details: {
                Name: "pod-deployment-6d47d84db6-89fv2-c7e95",
                Group: "kueue.x-k8s.io",
                Kind: "workloads",
                UID: "",
                Causes: nil,
                RetryAfterSeconds: 0,
            },
            Code: 404,
        },
    }
In [It] at: /home/prow/go/src/sigs.k8s.io/kueue/test/e2e/singlecluster/deployment_test.go:202 @ 11/25/24 10:17:18.844
}

What you expected to happen: No errors.

How to reproduce it (as minimally and precisely as possible): https://prow.k8s.io/view/gs/kubernetes-ci-logs/pr-logs/pull/kubernetes-sigs_kueue/3615/pull-kueue-test-e2e-main-1-28/1860988860012433408

mimowo commented 9 hours ago

/assign @mbobrovskyi PTAL

mimowo commented 9 hours ago

@mbobrovskyi the test fails pretty often. If the fix is not simple, I propose rolling back the previous PR and working on it more.

mimowo commented 8 hours ago

Another failure: https://prow.k8s.io/view/gs/kubernetes-ci-logs/pr-logs/pull/kubernetes-sigs_kueue/3630/pull-kueue-test-e2e-main-1-29/1861011169054035968