karmada-io / karmada

Open, Multi-Cloud, Multi-Cluster Kubernetes Orchestration
https://karmada.io
Apache License 2.0

CI Schedule Workflow Failed #3706

Open chaunceyjiang opened 1 year ago

chaunceyjiang commented 1 year ago

What happened: CI Schedule Workflow Failed.

https://github.com/karmada-io/karmada/actions/runs/5371093730

• [FAILED] [300.495 seconds]
[resource-status collection] resource status collection testing PodDisruptionBudget collection testing [It] pdb status collection testing
/home/runner/work/karmada/karmada/test/e2e/resource_test.go:586

What you expected to happen:

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

Environment:

LronDC commented 1 year ago

Can I be assigned to this if there is a bug that needs to be fixed? I may need a walkthrough, though.

RainbowMango commented 1 year ago

Sure, please help to figure out the root cause.

chaunceyjiang commented 1 year ago

/assign @LronDC

liangyuanpeng commented 9 months ago

https://github.com/karmada-io/karmada/actions/runs/6449153569/job/17507161218

[INTERRUPTED] in [It] - /home/runner/work/karmada/karmada/test/e2e/resource_test.go:425 @ 10/08/23 18:23:37.598
  STEP: Removing PropagationPolicy(karmadatest-m2h8w/daemonset-g5c7m) @ 10/08/23 18:23:37.61
  STEP: Removing DaemonSet(karmadatest-m2h8w/daemonset-g5c7m) @ 10/08/23 18:23:37.621
  << Timeline

  [INTERRUPTED] Interrupted by Other Ginkgo Process
  In [It] at: /home/runner/work/karmada/karmada/test/e2e/resource_test.go:425 @ 10/08/23 18:23:37.598

  Full Stack Trace
------------------------------
SSSSSSSSSSSSS
------------------------------
[SynchronizedAfterSuite] PASSED [0.065 seconds]
[SynchronizedAfterSuite] 
/home/runner/work/karmada/karmada/test/e2e/suite_test.go:140
------------------------------
SSSSSSSSSSSSSSSSSSSSSSSSSSSSSS
------------------------------
[SynchronizedAfterSuite] PASSED [0.513 seconds]
[SynchronizedAfterSuite] 
/home/runner/work/karmada/karmada/test/e2e/suite_test.go:140
------------------------------

Summarizing 2 Failures:
  [FAIL] [karmada-search] karmada search testing reconcile ResourceRegistry when clusters joined, updated when cluster joined [BeforeAll] [member clusters joined] could reconcile ResourceRegistry
  /home/runner/work/karmada/karmada/test/e2e/search_test.go:750
  [INTERRUPTED] [resource-status collection] resource status collection testing DaemonSetStatus collection testing [It] daemonSet status collection testing
  /home/runner/work/karmada/karmada/test/e2e/resource_test.go:425

Ran 59 of 150 Specs in 406.810 seconds
FAIL! - Interrupted by Other Ginkgo Process -- 57 Passed | 2 Failed | 0 Pending | 91 Skipped

Ginkgo ran 1 suite in 10m18.927049597s

Test Suite Failed
Collect logs to /home/runner/work/karmada/karmada/karmada-e2e-logs/v1.25.0/...
Collecting karmada-host logs...
Exporting logs for cluster "karmada-host" to:
/home/runner/work/karmada/karmada/karmada-e2e-logs/v1.25.0//karmada-host
Collecting member3 logs...
Exporting logs for cluster "member3" to:
/home/runner/work/karmada/karmada/karmada-e2e-logs/v1.25.0//member3
Collected logs at /home/runner/work/karmada/karmada/karmada-e2e-logs/v1.25.0/:
total 32
drwxr-xr-x 5 runner docker  4096 Oct  8 18:23 .
drwxr-xr-x 3 runner docker  4096 Oct  8 18:12 ..
drwxr-xr-x 2 runner docker  4096 Oct  8 18:23 KARMADA_PULL_CLUSTER_NAME
drwxr-xr-x 3 runner docker  4096 Oct  8 18:23 karmada-host
-rw------- 1 runner docker 10066 Oct  8 18:23 karmada.config
drwxr-xr-x 3 runner docker  4096 Oct  8 18:23 member3
deployment.apps "karmada-interpreter-webhook-example" deleted
service "karmada-interpreter-webhook-example" deleted
serviceaccount "karmada-interpreter-webhook-example" deleted
namespace "metallb-system" deleted
customresourcedefinition.apiextensions.k8s.io "addresspools.metallb.io" deleted
customresourcedefinition.apiextensions.k8s.io "bfdprofiles.metallb.io" deleted
customresourcedefinition.apiextensions.k8s.io "bgpadvertisements.metallb.io" deleted
customresourcedefinition.apiextensions.k8s.io "bgppeers.metallb.io" deleted
customresourcedefinition.apiextensions.k8s.io "communities.metallb.io" deleted
customresourcedefinition.apiextensions.k8s.io "ipaddresspools.metallb.io" deleted
customresourcedefinition.apiextensions.k8s.io "l2advertisements.metallb.io" deleted
serviceaccount "controller" deleted
serviceaccount "speaker" deleted
role.rbac.authorization.k8s.io "controller" deleted
role.rbac.authorization.k8s.io "pod-lister" deleted
clusterrole.rbac.authorization.k8s.io "metallb-system:controller" deleted
clusterrole.rbac.authorization.k8s.io "metallb-system:speaker" deleted
rolebinding.rbac.authorization.k8s.io "controller" deleted
rolebinding.rbac.authorization.k8s.io "pod-lister" deleted
clusterrolebinding.rbac.authorization.k8s.io "metallb-system:controller" deleted
clusterrolebinding.rbac.authorization.k8s.io "metallb-system:speaker" deleted
secret "webhook-server-cert" deleted
service "webhook-service" deleted
deployment.apps "controller" deleted
daemonset.apps "speaker" deleted
configmap/kube-proxy configured
resourceinterpreterwebhookconfiguration.config.karmada.io "examples" deleted
customresourcedefinition.apiextensions.k8s.io "workloads.workload.example.io" deleted
customresourcedefinition.apiextensions.k8s.io "workloads.workload.example.io" deleted
customresourcedefinition.apiextensions.k8s.io "workloads.workload.example.io" deleted
customresourcedefinition.apiextensions.k8s.io "workloads.workload.example.io" deleted
Error: Process completed with exit code 1.

Are you still here? @LronDC

RainbowMango commented 9 months ago

Does anyone know how we can get notifications for failing workflows?

LronDC commented 9 months ago

If I remember correctly, the failure didn't reappear when I looked into it, so I put it on hold at the time.

> Are you still here? @LronDC

And I'm so sorry that I can't focus on this right now. :(

/unassign

liangyuanpeng commented 9 months ago

@RainbowMango

Maybe we can create a new Slack channel for notifications from the scheduled GitHub workflow. This needs an org admin to update the permissions of the Slack app.

https://github.com/integrations/slack#workflow-notification-filters

This should subscribe to the scheduled workflow only: /github subscribe karmada-io/karmada workflows:{name:"CI Schedule Workflow" event:"schedule" branch:"master"} (not sure, I have not tested it yet.)

I'm running some tests for it.

The Slack integration does not yet support notifying only on failed workflows.

Another way with Slack is to update the GitHub Actions workflow so it sends a notification when the workflow fails:

jobs:
...
  on-failure:
    runs-on: ubuntu-latest
    if: github.event.workflow_run.conclusion == 'failure' || github.event.workflow_run.conclusion == 'timed_out'
    steps:
      - uses: ravsamhq/notify-slack-action@v2
        with:
...
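
For context, `github.event.workflow_run` is only populated when a workflow is triggered by a `workflow_run` event, so the `if:` condition above would most likely live in a separate workflow that watches the scheduled one. Below is a minimal sketch of that shape, not a tested configuration. The file name, the `SLACK_WEBHOOK_URL` secret, and the message text are assumptions for illustration. It also swaps the `ravsamhq/notify-slack-action` step for a plain `curl` call to a Slack incoming webhook, so that no action inputs have to be guessed.

```yaml
# Hypothetical file: .github/workflows/notify-schedule-failure.yml
name: Notify on CI Schedule Workflow failure

on:
  workflow_run:
    # Must match the `name:` of the workflow being watched.
    workflows: ["CI Schedule Workflow"]
    types: [completed]

jobs:
  on-failure:
    runs-on: ubuntu-latest
    # Only run when the watched workflow failed or timed out.
    if: github.event.workflow_run.conclusion == 'failure' || github.event.workflow_run.conclusion == 'timed_out'
    steps:
      - name: Post to Slack
        env:
          # Assumed secret holding a Slack incoming-webhook URL.
          SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK_URL }}
          RUN_URL: ${{ github.event.workflow_run.html_url }}
          CONCLUSION: ${{ github.event.workflow_run.conclusion }}
        run: |
          # Slack incoming webhooks accept a JSON body with a "text" field.
          curl -sS -X POST -H 'Content-Type: application/json' \
            --data "{\"text\": \"CI Schedule Workflow ${CONCLUSION}: ${RUN_URL}\"}" \
            "$SLACK_WEBHOOK_URL"
```

A `workflow_run` trigger fires for every completed run of the watched workflow, whatever started it. If only scheduled runs should notify, the `if:` could additionally check `github.event.workflow_run.event == 'schedule'`. The watching workflow file also has to exist on the default branch for the trigger to take effect.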