kubernetes-retired / cluster-api-provider-nested

Cluster API Provider for Nested Clusters
Apache License 2.0

🐛 Resource already exists and the UID is different should not requeue #295

Closed wondywang closed 2 years ago

wondywang commented 2 years ago

What this PR does / why we need it: If these requests are not dropped from the queue early, the failing tasks keep accumulating and being retried until they exceed the maximum number of retries (MaxReconcileRetryAttempts) and are discarded.

So, when the DWS syncs a resource in the reconcileXXXCreate step and finds that the resource already exists but its delegated object UID is different, the request should not be requeued, because later attempts will not be able to process the event either.

Which issue(s) this PR fixes: Fixes #293
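
The behavior proposed above can be illustrated with a minimal sketch. It is not the syncer's actual code: `object`, `superCluster`, `reconcileCreate`, `errUIDMismatch`, and the `requeueOnMismatch` switch are all hypothetical names, and the in-memory map only stands in for the super cluster. The sketch shows the one decision this PR is about, namely whether a create that hits an existing object with a different delegated UID is put back on the queue.

```go
package main

import "errors"

// errUIDMismatch is a hypothetical sentinel error for this sketch.
var errUIDMismatch = errors.New("object exists but delegated UID is different")

// object is a simplified stand-in for a Kubernetes resource.
type object struct {
	Name         string
	UID          string // the object's own UID
	DelegatedUID string // on a super cluster object: UID of the tenant object it was created from
}

// superCluster is a toy in-memory store keyed by object name (assumption).
type superCluster map[string]object

// reconcileCreate mimics a reconcileXXXCreate step for one tenant object.
// It returns whether the request should be requeued, plus any error.
// requeueOnMismatch=false models the behavior proposed in this PR;
// requeueOnMismatch=true models retrying on a UID mismatch.
func reconcileCreate(super superCluster, tenant object, requeueOnMismatch bool) (bool, error) {
	existing, found := super[tenant.Name]
	if !found {
		// Nothing in the way: create the delegated object.
		super[tenant.Name] = object{
			Name:         tenant.Name,
			UID:          "super-" + tenant.UID,
			DelegatedUID: tenant.UID,
		}
		return false, nil
	}
	if existing.DelegatedUID == tenant.UID {
		// Already synced from this tenant object; nothing to do.
		return false, nil
	}
	// Same name, different origin: report it and let the caller's
	// policy decide whether another attempt is worthwhile.
	return requeueOnMismatch, errUIDMismatch
}
```

With `requeueOnMismatch` set to `false`, a mismatching request leaves the queue immediately instead of being retried up to MaxReconcileRetryAttempts times.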

k8s-ci-robot commented 2 years ago

Hi @wondywang. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.

k8s-ci-robot commented 2 years ago

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: wondywang

Once this PR has been reviewed and has the lgtm label, please assign charleszheng44 for approval by writing /assign @charleszheng44 in a comment. For more information see: The Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:
- **[virtualcluster/OWNERS](https://github.com/kubernetes-sigs/cluster-api-provider-nested/blob/main/virtualcluster/OWNERS)**

Approvers can indicate their approval by writing `/approve` in a comment
Approvers can cancel approval by writing `/approve cancel` in a comment

Fei-Guo commented 2 years ago

@wondywang I think it is OK to retry on a UID mismatch. Otherwise, if a user deletes and then re-creates a resource with the same name in the VC, and the events are handled in create->delete order in the super cluster, the new resource may take 1 min to be created in the super cluster, after the patroller clears the stale one.

Also, this change does not resolve the problem that the patroller periodically deletes the Kube-root-ca cfgmap in the super cluster (although the cfgmap is created again automatically). Note that issues #293 and #282 are related, since the Kube-root-ca cfgmap is part of the projected service account. So here is my suggestion:

1) Add a feature gate for supporting the projected service account;
2) As a temporary solution, add a whitelist in the cfgmap DWS to skip the cfgmap named "Kube-root-ca" (using the feature gate to provide backward compatibility, since a user may have created a cfgmap with this name before 1.20);
3) Also change the cfgmap patroller to skip the "Kube-root-ca" cfgmap.

Later, we can add code to fully support the projected service account. What do you think?
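
The whitelist from steps 2 and 3 could look roughly like the sketch below. Everything in it is an assumption made for illustration: the `featureGates` type, the gate name `ProjectedServiceAccount`, and the helper `shouldSkipConfigMap` do not come from the repository, and the cfgmap name is copied verbatim from this thread rather than checked against a real cluster. The idea is that both the cfgmap DWS and the cfgmap patroller would consult the same helper, and that the skip only applies when the gate is enabled so that existing user cfgmaps with this name keep syncing by default.

```go
package main

// featureGates is a hypothetical stand-in for the syncer's feature-gate lookup.
type featureGates map[string]bool

// projectedServiceAccount is an assumed gate name, not one defined by the project.
const projectedServiceAccount = "ProjectedServiceAccount"

// skippedConfigMaps holds the cfgmap names to ignore. The name is taken
// verbatim from the discussion; the real name should be verified.
var skippedConfigMaps = map[string]bool{
	"Kube-root-ca": true,
}

// shouldSkipConfigMap reports whether the DWS and the patroller should
// ignore the cfgmap with the given name. With the gate disabled nothing
// is skipped, which preserves the old behavior for user-created cfgmaps.
func shouldSkipConfigMap(gates featureGates, name string) bool {
	return gates[projectedServiceAccount] && skippedConfigMaps[name]
}
```

Sharing one helper between the syncer and the patroller keeps the two from disagreeing, which is what causes the periodic delete and re-create described above.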

wondywang commented 2 years ago

Thanks @Fei-Guo, now I know the retry is needed. Meanwhile, a whitelist may be a good temporary solution for "kube-root-ca". Then, in the end, we have to plan how to support the projected service account.

So, I will close this PR.