karmada-io / karmada

Open, Multi-Cloud, Multi-Cluster Kubernetes Orchestration
https://karmada.io
Apache License 2.0

flake test: the resource selector scope of some Policies in E2E use cases is so large that it affects other use cases. #5256

Closed chaosi-zju closed 1 month ago

chaosi-zju commented 1 month ago

What happened:

The resource selector scope of some Policies in the E2E use cases is so large that it affects other use cases.

Take the case below as an example:

https://github.com/karmada-io/karmada/blob/1430ecbd2eb7e8321724ed9b6b08436b9b0b051f/test/e2e/propagationpolicy_test.go#L558-L568

This policy matches all Deployments. Because the e2e use cases run in parallel, it may cause other cases to fail.
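To illustrate why a match-all selector leaks across parallel cases, here is a minimal sketch. The `ResourceSelector` and `Resource` types below are simplified, hypothetical stand-ins written for this example, not Karmada's actual API types, and the assumption that an empty field acts as a wildcard is part of the sketch. The namespaces and names are taken from the failure logs in the comments.

```go
package main

import "fmt"

// ResourceSelector is a hypothetical, simplified selector.
// Assumption for this sketch: an empty Namespace, Name, or Labels
// field acts as a wildcard.
type ResourceSelector struct {
	APIVersion string
	Kind       string
	Namespace  string
	Name       string
	Labels     map[string]string
}

// Resource is a hypothetical minimal description of a workload.
type Resource struct {
	APIVersion string
	Kind       string
	Namespace  string
	Name       string
	Labels     map[string]string
}

// Matches reports whether the selector selects the given resource.
func (s ResourceSelector) Matches(r Resource) bool {
	if s.APIVersion != r.APIVersion || s.Kind != r.Kind {
		return false
	}
	if s.Namespace != "" && s.Namespace != r.Namespace {
		return false
	}
	if s.Name != "" && s.Name != r.Name {
		return false
	}
	// Every label required by the selector must be present with the same value.
	for k, v := range s.Labels {
		if got, ok := r.Labels[k]; !ok || got != v {
			return false
		}
	}
	return true
}

func main() {
	// Deployment created by the case that owns the policy.
	own := Resource{APIVersion: "apps/v1", Kind: "Deployment",
		Namespace: "karmadatest-lnslk", Name: "deploy-tr9tj"}
	// Deployment created by an unrelated case running in parallel.
	other := Resource{APIVersion: "apps/v1", Kind: "Deployment",
		Namespace: "karmadatest-pk2zr", Name: "deploy-46phc"}

	// Only APIVersion and Kind are set, so every Deployment is selected.
	matchAll := ResourceSelector{APIVersion: "apps/v1", Kind: "Deployment"}
	// Scoped to the owning case's namespace and Deployment name.
	scoped := ResourceSelector{APIVersion: "apps/v1", Kind: "Deployment",
		Namespace: own.Namespace, Name: own.Name}

	fmt.Println("match-all selects own:  ", matchAll.Matches(own))
	fmt.Println("match-all selects other:", matchAll.Matches(other))
	fmt.Println("scoped selects own:     ", scoped.Matches(own))
	fmt.Println("scoped selects other:   ", scoped.Matches(other))
}
```

In this sketch the match-all selector also selects the Deployment from the unrelated namespace, while the scoped selector does not. Narrowing by namespace, name, or a per-case label is one possible way to keep cases isolated; which of these fits a given e2e case is a judgment call.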

What you expected to happen:

E2E use cases should not affect each other.

How to reproduce it (as minimally and precisely as possible):

I will list relevant failures in the comments below

Anything else we need to know?:

Environment:

chaosi-zju commented 1 month ago

https://github.com/karmada-io/karmada/actions/runs/10105133344/job/27945301974?pr=5247

Propagate the deployment with the high-priority PropagationPolicy and then reduce it's priority to be preempted by the low-priority PropagationPolicy

• [FAILED] [420.115 seconds]
[Preemption] propagation policy preemption testing when [PropagationPolicy Preemption] PropagationPolicy preempts another (Cluster)PropagationPolicy High-priority PropagationPolicy reduces priority to be preempted by low-priority PropagationPolicy [It] Propagate the deployment with the high-priority PropagationPolicy and then reduce it's priority to be preempted by the low-priority PropagationPolicy
/home/runner/work/karmada/karmada/test/e2e/preemption_test.go:202

  Captured StdOut/StdErr Output >>
  I0726 03:36:21.917913   48027 deployment.go:75] Waiting for deployment(karmadatest-pk2zr/deploy-46phc) synced on cluster(member1)
  << Captured StdOut/StdErr Output

  Timeline >>
  STEP: Creating Deployment(karmadatest-pk2zr/deploy-46phc) @ 07/26/24 03:36:21.866
  STEP: Creating PropagationPolicy(karmadatest-pk2zr/deploy-46phchigh-pp) @ 07/26/24 03:36:21.875
  STEP: Creating PropagationPolicy(karmadatest-pk2zr/deploy-46phclow-pp) @ 07/26/24 03:36:21.89
  STEP: Wait for propagating deployment by the high-priority PropagationPolicy @ 07/26/24 03:36:21.917
  [FAILED] in [It] - /home/runner/work/karmada/karmada/test/e2e/framework/deployment.go:82 @ 07/26/24 03:43:21.919
  STEP: Removing Deployment(karmadatest-pk2zr/deploy-46phc) @ 07/26/24 03:43:21.919
  STEP: Removing PropagationPolicy(karmadatest-pk2zr/deploy-46phclow-pp) @ 07/26/24 03:43:21.964
  STEP: Removing PropagationPolicy(karmadatest-pk2zr/deploy-46phchigh-pp) @ 07/26/24 03:43:21.972
  << Timeline

  [FAILED] Timed out after 420.000s.
  Expected
      <bool>: false
  to equal
      <bool>: true
  In [It] at: /home/runner/work/karmada/karmada/test/e2e/framework/deployment.go:82 @ 07/26/24 03:43:21.919

  Full Stack Trace
    github.com/karmada-io/karmada/test/e2e/framework.WaitDeploymentPresentOnClusterFitWith({0xc0007d2959, 0x7}, {0xc000163f08, 0x11}, {0xc0003c72a4, 0xc}, 0x4336d68)
        /home/runner/work/karmada/karmada/test/e2e/framework/deployment.go:82 +0x614
    github.com/karmada-io/karmada/test/e2e.init.func34.2.3.3.1()
        /home/runner/work/karmada/karmada/test/e2e/preemption_test.go:204 +0xdc
    github.com/karmada-io/karmada/test/e2e.init.func34.2.3.3()
        /home/runner/work/karmada/karmada/test/e2e/preemption_test.go:203 +0xd9

Check the controller-manager log:

2024-07-26T03:36:21.883871788Z stderr F I0726 03:36:21.883785       1 detector.go:252] Reconciling object: apps/v1, kind=Deployment, karmadatest-pk2zr/deploy-46phc
2024-07-26T03:36:21.884180496Z stderr F I0726 03:36:21.884120       1 detector.go:532] Applying cluster policy(deploy-match-allb7rtn) for object: apps/v1, kind=Deployment, karmadatest-pk2zr/deploy-46phc

The deployment is expected to be matched by karmadatest-pk2zr/deploy-46phchigh-pp. However, it is actually matched by deploy-match-allb7rtn, which is created by another e2e case:

[ImplicitPriority] propagation testing priorityMatchName/priorityMatchLabel/priorityMatchAll propagation testing priorityMatchName/priorityMatchLabel/priorityMatchAll testing
/home/runner/work/karmada/karmada/test/e2e/clusterpropagationpolicy_test.go:619

 .....

  Timeline >>
  STEP: Creating ClusterPropagationPolicy(deploy-match-namev589z) @ 07/26/24 03:36:03.63
  STEP: Creating ClusterPropagationPolicy(deploy-match-labelselector54mp4) @ 07/26/24 03:36:03.794
  STEP: Creating ClusterPropagationPolicy(deploy-match-allb7rtn) @ 07/26/24 03:36:03.861
  STEP: Creating Deployment(karmadatest-lnslk/deploy-tr9tj) @ 07/26/24 03:36:07.043
  STEP: check whether the deployment uses the highest priority propagationPolicy(priorityMatchName) @ 07/26/24 
......

chaosi-zju commented 1 month ago

https://github.com/karmada-io/karmada/actions/runs/10074355448/job/27850551542

Simple Case 2 (Policy created after resource)

• [FAILED] [431.127 seconds]
Lazy activation policy testing 2. Policy created after resource [It] Simple Case 2 (Policy created after resource)
/home/runner/work/karmada/karmada/test/e2e/lazy_activation_policy_test.go:268

  Captured StdOut/StdErr Output >>
  I0724 10:05:24.191590   48162 deployment.go:123] Waiting for deployment(karmadatest-tnr4l/deploy-8fj5q) disappears on cluster(member1)
  I0724 10:12:24.238174   48162 deployment.go:123] Waiting for deployment(karmadatest-tnr4l/deploy-8fj5q) disappears on cluster(member1)
  << Captured StdOut/StdErr Output

  Timeline >>
  STEP: Creating Deployment(karmadatest-tnr4l/deploy-8fj5q) @ 07/24/24 10:05:18.121
  STEP: Creating PropagationPolicy(karmadatest-tnr4l/deploy-8fj5q) @ 07/24/24 10:05:21.177
  STEP: step1: deployment would not propagate when lazy policy created after deployment @ 07/24/24 10:05:21.191
  [FAILED] in [It] - /home/runner/work/karmada/karmada/test/e2e/framework/deployment.go:135 @ 07/24/24 10:12:24.193
  STEP: Removing PropagationPolicy(karmadatest-tnr4l/deploy-8fj5q) @ 07/24/24 10:12:24.193
  STEP: Removing Deployment(karmadatest-tnr4l/deploy-8fj5q) @ 07/24/24 10:12:24.234
  << Timeline

  [FAILED] Timed out after 420.001s.
  Expected
      <bool>: false
  to equal
      <bool>: true
  In [It] at: /home/runner/work/karmada/karmada/test/e2e/framework/deployment.go:135 @ 07/24/24 10:12:24.193

Check the controller-manager log:

2024-07-24T10:05:18.180864723Z stderr F I0724 10:05:18.180761       1 detector.go:252] Reconciling object: apps/v1, kind=Deployment, karmadatest-tnr4l/deploy-8fj5q
2024-07-24T10:05:18.188824163Z stderr F I0724 10:05:18.188710       1 detector.go:377] Attempts to match policy for resource(apps/v1, kind=Deployment, karmadatest-tnr4l/deploy-8fj5q)
2024-07-24T10:05:18.188948234Z stderr F I0724 10:05:18.188867       1 detector.go:384] No propagationpolicy find in namespace(karmadatest-tnr4l).
2024-07-24T10:05:18.189040006Z stderr F I0724 10:05:18.188971       1 detector.go:408] Attempts to match cluster policy for resource(apps/v1, kind=Deployment, karmadatest-tnr4l/deploy-8fj5q)
2024-07-24T10:05:18.189391041Z stderr F I0724 10:05:18.189339       1 compare.go:100] Matched cluster policy(deploy-match-all5zhz5) for resource(apps/v1, kind=Deployment, karmadatest-tnr4l/deploy-8fj5q)

The deployment is expected to be matched by policy karmadatest-tnr4l/deploy-8fj5q. However, it is actually matched by deploy-match-all5zhz5, which is created by another e2e case:
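The log above shows the lookup order: namespace-scoped policies are tried first, and cluster-scoped policies are tried only when none is found. The sketch below mimics that order to show how the match-all cluster policy wins when the namespaced policy does not exist yet at reconcile time. `Policy`, `selects`, and `MatchPolicy` are hypothetical names invented for this illustration; the real detector also handles priorities, claims, and preemption, which are omitted here.

```go
package main

import "fmt"

// Policy is a hypothetical, simplified policy.
// Namespace is empty for a cluster-scoped policy, and an empty
// TargetName means "match every Deployment" (assumption of this sketch).
type Policy struct {
	Name       string
	Namespace  string
	TargetName string
}

// selects reports whether the policy selects the Deployment namespace/name.
func (p Policy) selects(namespace, name string) bool {
	if p.Namespace != "" && p.Namespace != namespace {
		return false
	}
	return p.TargetName == "" || p.TargetName == name
}

// MatchPolicy tries namespace-scoped policies first and falls back to
// cluster-scoped policies only when no namespaced policy matches.
func MatchPolicy(nsPolicies, clusterPolicies []Policy, namespace, name string) (string, bool) {
	for _, p := range nsPolicies {
		if p.selects(namespace, name) {
			return p.Namespace + "/" + p.Name, true
		}
	}
	for _, p := range clusterPolicies {
		if p.selects(namespace, name) {
			return p.Name, true
		}
	}
	return "", false
}

func main() {
	ns, deploy := "karmadatest-tnr4l", "deploy-8fj5q"

	// The Deployment is reconciled before its own namespaced policy exists,
	// while another case's match-all cluster policy is already present.
	matchAll := []Policy{{Name: "deploy-match-all5zhz5"}}
	got, ok := MatchPolicy(nil, matchAll, ns, deploy)
	fmt.Println("with match-all cluster policy:", got, ok)

	// If the other case had scoped its cluster policy to its own Deployment,
	// nothing would match and the Deployment would stay unpropagated.
	narrowed := []Policy{{Name: "deploy-match-all5zhz5", TargetName: "deploy-dggtr"}}
	got, ok = MatchPolicy(nil, narrowed, ns, deploy)
	fmt.Println("with narrowed cluster policy: ", got, ok)
}
```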

[ImplicitPriority] propagation testing priorityMatchName/priorityMatchLabel/priorityMatchAll propagation testing priorityMatchName/priorityMatchLabel/priorityMatchAll testing
/home/runner/work/karmada/karmada/test/e2e/clusterpropagationpolicy_test.go:619

......

  Timeline >>
  STEP: Creating ClusterPropagationPolicy(deploy-match-namejtwbv) @ 07/24/24 10:05:08.431
  STEP: Creating ClusterPropagationPolicy(deploy-match-labelselectort5bqv) @ 07/24/24 10:05:08.617
  STEP: Creating ClusterPropagationPolicy(deploy-match-all5zhz5) @ 07/24/24 10:05:08.682
  STEP: Creating Deployment(karmadatest-cbqr5/deploy-dggtr) @ 07/24/24 10:05:11.72
  STEP: check whether the deployment uses the highest priority propagationPolicy(priorityMatchName) @ 07/24/24 10:05:11.743
......

XiShanYongYe-Chang commented 1 month ago

Thanks for your analysis! Should we treat this as a bug or a flaky test?

chaosi-zju commented 1 month ago

/remove-kind bug /flake

chaosi-zju commented 1 month ago

/kind flake