karmada-io / karmada

Open, Multi-Cloud, Multi-Cluster Kubernetes Orchestration
https://karmada.io
Apache License 2.0
4.11k stars 805 forks source link

fix: The PULL mode cluster cannot be deleted immediately #3998

Closed JadeFlute0127 closed 6 months ago

JadeFlute0127 commented 8 months ago

What type of PR is this?

What this PR does / why we need it:
Avoid removing clusters that are not ready or in an unknown state in pull mode, resulting in blockages and stuck failures. When the syncmode mode is pull and the cluster is not ready or in an unknown state, delete the finalizers of the work object directly

Which issue(s) this PR fixes: Fixes #1847

Special notes for your reviewer: @lonelyCZ

Does this PR introduce a user-facing change?:

None
JadeFlute0127 commented 8 months ago

The simple test effect is shown in the figure, where we can smoothly delete the cluster in the pull mode of the unknown state by minimizing code modifications without the need to modify more code or add new components. 8a231149ed3f3fd8d215c21ba88aa6e

codecov-commenter commented 8 months ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Comparison is base (e8b1720) 52.93% compared to head (7cef5b8) 52.87%. Report is 2 commits behind head on master.

:exclamation: Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files ```diff @@ Coverage Diff @@ ## master #3998 +/- ## ========================================== - Coverage 52.93% 52.87% -0.06% ========================================== Files 239 239 Lines 23526 23512 -14 ========================================== - Hits 12453 12433 -20 - Misses 10398 10403 +5 - Partials 675 676 +1 ``` | [Flag](https://app.codecov.io/gh/karmada-io/karmada/pull/3998/flags?src=pr&el=flags&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=karmada-io) | Coverage Δ | | |---|---|---| | [unittests](https://app.codecov.io/gh/karmada-io/karmada/pull/3998/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=karmada-io) | `52.87% <ø> (-0.06%)` | :arrow_down: | Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=karmada-io#carryforward-flags-in-the-pull-request-comment) to find out more. [see 3 files with indirect coverage changes](https://app.codecov.io/gh/karmada-io/karmada/pull/3998/indirect-changes?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=karmada-io)

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

JadeFlute0127 commented 8 months ago

@lonelyCZ The preliminary effect is shown in the figure above. Could you provide some suggestions?

lonelyCZ commented 8 months ago

I will look it ASAP.

/assign

lonelyCZ commented 8 months ago

What do you think of this solution?

/cc @RainbowMango @XiShanYongYe-Chang

lonelyCZ commented 7 months ago

Hi, @JadeFlute0127 , please rebase this pr, we can push it forward.

lonelyCZ commented 7 months ago

I just tested it in my env and it worked as expected!

image

/lgtm

cc @XiShanYongYe-Chang @RainbowMango

XiShanYongYe-Chang commented 7 months ago

Hi @lonelyCZ Sorry to respond late, I will come later.

RainbowMango commented 6 months ago

Echo the CI logs here: https://github.com/karmada-io/karmada/actions/runs/6364565813/job/17280988635?pr=3998#step:5:2382

Karmadactl promote testing Test promoting namespaced resource: service Test promoting a service from cluster member
Error: -30T18:16:36.0682023-09-30T18:16:36.3850151Z ##[error]No space left on device : '/home/runner/runners/2.309.0/_diag/pages/3c4f99bb-30d5-4a8c-8235-8d44763e0036_20338726-f59a-5af9-b649-883abbdf9d6e_1.log'

[edit]: re-triggered the tests.

JadeFlute0127 commented 6 months ago

CI logs seem to indicate insufficient testing server space


[error]No space left on device
XiShanYongYe-Chang commented 6 months ago

CI logs seem to indicate insufficient testing server space

[error]No space left on device

Hi @JadeFlute0127, we have resolved this error, can you help rebase the master branch and push again?

JadeFlute0127 commented 6 months ago

CI logs seem to indicate insufficient testing server space

[error]No space left on device

Hi @JadeFlute0127, we have resolved this error, can you help rebase the master branch and push again?

ok, i will rebase this patch as soon as possible

lonelyCZ commented 6 months ago

/lgtm

RainbowMango commented 6 months ago

I'm trying to figure out the reason of lint test:

golangci/golangci-lint info installed /home/runner/go/bin/golangci-lint
pkg/controllers/cluster/cluster_controller.go:34[8](https://github.com/karmada-io/karmada/actions/runs/6585180935/job/17891123923?pr=3998#step:5:9): File is not `gci`-ed with --skip-generated -s Standard,Default,Prefix(github.com/karmada-io/karmada) (gci)
            errors = append(errors,fmt.Errorf("error while removing finalizers of works %s: %v", work.Name, err))

I can't see anything wrong with the imported packages.

JadeFlute0127 commented 6 months ago

I'm trying to figure out the reason of lint test:

golangci/golangci-lint info installed /home/runner/go/bin/golangci-lint
pkg/controllers/cluster/cluster_controller.go:34[8](https://github.com/karmada-io/karmada/actions/runs/6585180935/job/17891123923?pr=3998#step:5:9): File is not `gci`-ed with --skip-generated -s Standard,Default,Prefix(github.com/karmada-io/karmada) (gci)
          errors = append(errors,fmt.Errorf("error while removing finalizers of works %s: %v", work.Name, err))

I can't see anything wrong with the imported packages.

Yeah, I compared other files in Karmada that also use this code and did not detect any warnings or errors. This makes me a bit confused.

RainbowMango commented 6 months ago

We met a similar issue today at #4157, that probably due to a gofmt issue.

diff pkg/controllers/cluster/cluster_controller.go.orig pkg/controllers/cluster/cluster_controller.go
--- pkg/controllers/cluster/cluster_controller.go.orig
+++ pkg/controllers/cluster/cluster_controller.go
@@ -345,7 +345,7 @@
        work := &workList.Items[i]
        err = c.removeWorkFinalizer(work)
        if err != nil {
-           errors = append(errors,fmt.Errorf("error while removing finalizers of works %s: %v", work.Name, err))
+           errors = append(errors, fmt.Errorf("error while removing finalizers of works %s: %v", work.Name, err))
        }
    }
    return utilerrors.NewAggregate(errors)

Can you help to resolve the gofmt issue and try again?

JadeFlute0127 commented 6 months ago

We met a similar issue today at #4157, that probably due to a gofmt issue.

diff pkg/controllers/cluster/cluster_controller.go.orig pkg/controllers/cluster/cluster_controller.go
--- pkg/controllers/cluster/cluster_controller.go.orig
+++ pkg/controllers/cluster/cluster_controller.go
@@ -345,7 +345,7 @@
      work := &workList.Items[i]
      err = c.removeWorkFinalizer(work)
      if err != nil {
-         errors = append(errors,fmt.Errorf("error while removing finalizers of works %s: %v", work.Name, err))
+         errors = append(errors, fmt.Errorf("error while removing finalizers of works %s: %v", work.Name, err))
      }
  }
  return utilerrors.NewAggregate(errors)

Can you help to resolve the gofmt issue and try again?

Okay, I will handle it as soon as possible

RainbowMango commented 6 months ago

Take your time. Happy weekend then!

RainbowMango commented 6 months ago

Hi @JadeFlute0127 Just an update: after fixing the gofmt issue, the' gci' problem was also solved in #4157.

RainbowMango commented 6 months ago

Hi @chaunceyjiang Please take another look.

JadeFlute0127 commented 6 months ago

@chaunceyjiang Hi, Could you please help me check again so that we can proceed with the merge.

karmada-bot commented 6 months ago

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: RainbowMango

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files: - ~~[pkg/controllers/OWNERS](https://github.com/karmada-io/karmada/blob/master/pkg/controllers/OWNERS)~~ [RainbowMango] Approvers can indicate their approval by writing `/approve` in a comment Approvers can cancel approval by writing `/approve cancel` in a comment
wm775825 commented 5 months ago

Why only remove finalizers for pull-mode clusters? If karmada lost connection to push-mode clusters, will "karmadactl unjoin" stuck?

RainbowMango commented 5 months ago

That's a good question.

If karmada lost connection to push-mode clusters, will "karmadactl unjoin" stuck?

Yes, I guess. @wm775825 Can you help run a test to confirm it?