karmada-io / karmada

Open, Multi-Cloud, Multi-Cluster Kubernetes Orchestration
https://karmada.io
Apache License 2.0

fix: `ClusterResourceBinding` scope in `MutatingWebhookConfiguration` #5252

Closed a7i closed 1 month ago

a7i commented 1 month ago

What type of PR is this?

/kind bug

What this PR does / why we need it:

We've been having a lot of issues with Karmada deleting cluster-scoped resources such as ClusterRole and ClusterRoleBinding.

After debugging, we realized that the label `clusterresourcebinding.karmada.io/permanent-id` was missing from those ClusterResourceBindings, and that the webhook is responsible for populating it. Because the label was missing, every resource propagated to the member cluster also carried an empty value for the ID (i.e. `clusterresourcebinding.karmada.io/permanent-id: ""`). Such work is then identified as orphaned and deleted from the cluster.
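To illustrate why a wrong `scope` can silently skip the mutation, here is a minimal sketch of how a webhook rule's `scope` field (`Cluster`, `Namespaced`, or `*` in the Kubernetes admission registration API) decides whether a rule applies to a resource. It is a toy model, not Karmada's or the API server's code: `rule_matches` and `admit` are hypothetical helpers, the UUID generation is an assumption based on the IDs shown in this PR, and the idea that the rule was previously declared as `Namespaced` is also an assumption.

```python
import uuid

# Label name taken from the PR description.
PERMANENT_ID_LABEL = "clusterresourcebinding.karmada.io/permanent-id"


def rule_matches(rule_scope: str, resource_is_namespaced: bool) -> bool:
    """Return True if a webhook rule with this scope applies to the resource."""
    if rule_scope == "*":
        return True
    if rule_scope == "Namespaced":
        return resource_is_namespaced
    if rule_scope == "Cluster":
        return not resource_is_namespaced
    raise ValueError(f"unknown scope: {rule_scope!r}")


def admit(labels: dict, rule_scope: str, resource_is_namespaced: bool) -> dict:
    """Hypothetical mutation step: add the ID label only when the rule matches."""
    mutated = dict(labels)
    if rule_matches(rule_scope, resource_is_namespaced):
        # Assumption: the ID is a random UUID, like the values listed in this PR.
        mutated.setdefault(PERMANENT_ID_LABEL, str(uuid.uuid4()))
    return mutated


# A ClusterResourceBinding is cluster-scoped, so resource_is_namespaced=False.
skipped = admit({}, rule_scope="Namespaced", resource_is_namespaced=False)
mutated = admit({}, rule_scope="Cluster", resource_is_namespaced=False)
print(PERMANENT_ID_LABEL in skipped, PERMANENT_ID_LABEL in mutated)
```

In this model a `Namespaced` rule never fires for a cluster-scoped object, so the object is admitted without the label and no error is reported.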

This must have broken after the "63 char limit issue", because all of our CRBs created before that release are fine and do not cause orphaned-work issues:

(Note that bindings created 79 days ago or earlier are fine, but all recent ones are missing the ID.)

❯ kubectl get crb -L clusterresourcebinding.karmada.io/permanent-id

NAME                                                                                          SCHEDULED   FULLYAPPLIED   AGE     PERMANENT-ID
access-admin-clusterrole                                                                      True        True           150d    49baa1bb-c6e5-40fb-a913-3721314d736e
access-admin-clusterrolebinding                                                               True        True           150d    cfd8e53f-52fb-4613-8244-9a8084959b54
aggregate-olm-edit-clusterrole                                                                True        False          13d
aggregate-olm-view-clusterrole                                                                True        False          13d
aggregate-to-restricted-edit-clusterrole                                                      True        False          13d
amir-namespace                                                                                True        True           125d    fca50359-b88b-4647-bf33-9f25ffd82d02
analytics-admin-clusterrole                                                                   True        True           150d    ce2aefef-3251-45bf-8082-b816913b66cc
applications.argoproj.io-customresourcedefinition                                             True        True           79d     f5e35e32-20ac-4b2d-94b3-1e3a94e387f1
applicationsets.argoproj.io-customresourcedefinition                                          True        True           79d     d6f33b7d-1dd4-467d-b753-b60d0d70d473
appprojects.argoproj.io-customresourcedefinition                                              True        True           79d     50c92fea-e201-44fd-ba34-a717c90ed2e2
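To find affected bindings in a longer listing, the output above can be filtered mechanically. The following is a rough sketch, assuming the five-column layout shown here and that no column value contains spaces; `missing_permanent_id` is a hypothetical helper, and `SAMPLE` is a shortened copy of the rows above.

```python
# Shortened copy of the `kubectl get crb -L ...` output from this PR.
SAMPLE = """\
NAME                                                SCHEDULED   FULLYAPPLIED   AGE     PERMANENT-ID
access-admin-clusterrole                            True        True           150d    49baa1bb-c6e5-40fb-a913-3721314d736e
aggregate-olm-edit-clusterrole                      True        False          13d
applications.argoproj.io-customresourcedefinition   True        True           79d     f5e35e32-20ac-4b2d-94b3-1e3a94e387f1
"""


def missing_permanent_id(table: str) -> list:
    """Return the names of rows whose trailing PERMANENT-ID column is empty."""
    lines = [line for line in table.splitlines() if line.strip()]
    if not lines:
        return []
    expected_fields = len(lines[0].split())  # header has one field per column
    names = []
    for row in lines[1:]:
        fields = row.split()
        # A row with fewer fields than the header is missing the last column.
        if len(fields) < expected_fields:
            names.append(fields[0])
    return names


print(missing_permanent_id(SAMPLE))
```

The field-count check relies on PERMANENT-ID being the last column; a different column order would need a position-based parser instead.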

Which issue(s) this PR fixes: Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

fix: `ClusterResourceBinding` scope in `MutatingWebhookConfiguration`

a7i commented 1 month ago

/cc @jwcesign

would you be open to reviewing? seems like a regression from this

codecov-commenter commented 1 month ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 28.26%. Comparing base (4ba18c1) to head (2530ab9).

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##           master    #5252      +/-   ##
==========================================
+ Coverage   28.24%   28.26%   +0.02%
==========================================
  Files         632      632
  Lines       43732    43732
==========================================
+ Hits        12353    12363      +10
+ Misses      30476    30469       -7
+ Partials      903      900       -3
```

| [Flag](https://app.codecov.io/gh/karmada-io/karmada/pull/5252/flags?src=pr&el=flags&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=karmada-io) | Coverage Δ | |
|---|---|---|
| [unittests](https://app.codecov.io/gh/karmada-io/karmada/pull/5252/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=karmada-io) | `28.26% <ø> (+0.02%)` | :arrow_up: |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=karmada-io#carryforward-flags-in-the-pull-request-comment) to find out more.

a7i commented 1 month ago

/retest

XiShanYongYe-Chang commented 1 month ago

Hi @a7i, thanks for your feedback. I'm sorry for the disruption to your business.

This also reminds me that we do not have E2E tests covering this behavior. Do you think we could design some E2E tests to protect this logic?

a7i commented 1 month ago

> Hi @a7i, thanks for your feedback. I'm sorry for the disruption to your business.
>
> This also reminds me that we do not have E2E tests covering this behavior. Do you think we could design some E2E tests to protect this logic?

All good! We're happy that we can make small contributions to this project.

Happy to explore this and submit a few E2E tests in a separate PR (so we can cover both CRB and RB).

karmada-bot commented 1 month ago

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: XiShanYongYe-Chang

Needs approval from an approver in each of these files:

- ~~[charts/OWNERS](https://github.com/karmada-io/karmada/blob/master/charts/OWNERS)~~ [XiShanYongYe-Chang]

Approvers can indicate their approval by writing `/approve` in a comment.
Approvers can cancel approval by writing `/approve cancel` in a comment.