open-cluster-management-io / registration

hub / spoke registration controllers
Apache License 2.0

Decouple bootstrap informers to distinguish each lifecycle #271

Closed yue9944882 closed 2 years ago

yue9944882 commented 2 years ago

Currently, when we initialize the controllers/informers for the spoke agent, namespacedManagementKubeInformerFactory is shared by both the ephemeral bootstrap controller and the long-running controllers, and we start it twice in different places: (1) with bootstrapCtx and (2) with the background ctx. For (1), bootstrapCtx is explicitly cancelled once the bootstrap process finishes, which means the informers started in (1) are stopped as well. When we then move on and try to start the informers of namespacedManagementKubeInformerFactory again in (2), the stopped informers are not revived, because the informer factory remembers which informers it has already started and skips them on later Start calls:

https://github.com/kubernetes/kubernetes/blob/657776e52ba527d8ff04187be4e8743e62909c07/staging/src/k8s.io/client-go/informers/factory.go#L141-L151

As a result, the spoke controllers relying on namespacedManagementKubeInformerFactory will not be notified of any watch events because of the zombie informers. E.g. the hub-cert rotation controller cannot perceive any write event on the secrets it watches because the secret informer is down:

https://github.com/open-cluster-management-io/registration/blob/main/pkg/clientcert/cert_controller.go#L199
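To make the failure mode concrete, here is a minimal sketch in plain Go. It is a toy model, not the real client-go code: `toyInformer`, `toyFactory`, and both demo functions are hypothetical names, and the only behavior borrowed from the linked factory implementation is that `Start` skips informers it has already marked as started. The second demo uses one factory per lifecycle, which is one way to read "decouple" in the PR title; whether it matches the actual diff is an assumption.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"sync/atomic"
)

// toyInformer is a hypothetical stand-in for a shared informer.
type toyInformer struct {
	running int32 // 1 while "watching", 0 once stopped
}

// Run blocks until stopCh is closed, then marks the informer as stopped.
func (i *toyInformer) Run(stopCh <-chan struct{}) {
	<-stopCh
	atomic.StoreInt32(&i.running, 0)
}

func (i *toyInformer) isRunning() bool { return atomic.LoadInt32(&i.running) == 1 }

// toyFactory is a hypothetical model of an informer factory that
// remembers which informers it has already started.
type toyFactory struct {
	mu        sync.Mutex
	informers map[string]*toyInformer
	started   map[string]bool
	wg        sync.WaitGroup
}

func newToyFactory(names ...string) *toyFactory {
	f := &toyFactory{informers: map[string]*toyInformer{}, started: map[string]bool{}}
	for _, n := range names {
		f.informers[n] = &toyInformer{}
	}
	return f
}

// Start launches every informer that has not been started before.
// An informer whose earlier stop channel was closed is NOT restarted.
func (f *toyFactory) Start(stopCh <-chan struct{}) {
	f.mu.Lock()
	defer f.mu.Unlock()
	for name, inf := range f.informers {
		if f.started[name] {
			continue // already marked as started: skipped, even if it has since stopped
		}
		atomic.StoreInt32(&inf.running, 1)
		f.wg.Add(1)
		go func(inf *toyInformer) {
			defer f.wg.Done()
			inf.Run(stopCh)
		}(inf)
		f.started[name] = true
	}
}

// sharedFactoryDemo reuses one factory for both lifecycles (the bug).
func sharedFactoryDemo() (duringBootstrap, afterSecondStart bool) {
	f := newToyFactory("secrets")
	bootstrapCtx, stopBootstrap := context.WithCancel(context.Background())
	f.Start(bootstrapCtx.Done())
	duringBootstrap = f.informers["secrets"].isRunning()

	stopBootstrap() // bootstrap finished: its informers stop
	f.wg.Wait()

	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	f.Start(ctx.Done()) // no-op for "secrets": it is already marked as started
	afterSecondStart = f.informers["secrets"].isRunning()
	return duringBootstrap, afterSecondStart
}

// decoupledFactoriesDemo gives each lifecycle its own factory.
func decoupledFactoriesDemo() bool {
	bootstrapFactory := newToyFactory("secrets")
	longRunningFactory := newToyFactory("secrets")

	bootstrapCtx, stopBootstrap := context.WithCancel(context.Background())
	bootstrapFactory.Start(bootstrapCtx.Done())
	stopBootstrap()
	bootstrapFactory.wg.Wait()

	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	longRunningFactory.Start(ctx.Done())
	return longRunningFactory.informers["secrets"].isRunning()
}

func main() {
	during, after := sharedFactoryDemo()
	fmt.Printf("shared factory: running during bootstrap=%v, after second Start=%v\n", during, after)
	fmt.Printf("separate factories: long-running informer running=%v\n", decoupledFactoriesDemo())
}
```

In this model the shared factory reports the secret informer as running during bootstrap but not after the second `Start`, while the separate long-running factory reports it as running, since its informers were never tied to bootstrapCtx.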

qiujian16 commented 2 years ago

/approve /lgtm

openshift-ci[bot] commented 2 years ago

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: qiujian16, yue9944882

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

- ~~[OWNERS](https://github.com/open-cluster-management-io/registration/blob/main/OWNERS)~~ [qiujian16]

Approvers can indicate their approval by writing `/approve` in a comment

Approvers can cancel approval by writing `/approve cancel` in a comment