Closed yue9944882 closed 2 years ago
/approve /lgtm
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: qiujian16, yue9944882
Currently, when we initialize the controllers and informers for the spoke agent, the `namespacedManagementKubeInformerFactory` is used both by the ephemeral bootstrap controller and by the long-running controllers. We start `namespacedManagementKubeInformerFactory` twice, in different places: (1) with `bootstrapCtx` and (2) with the background `ctx`. For (1), `bootstrapCtx` is explicitly cancelled once the bootstrap process finishes, which means the informers started in (1) are stopped as well. When we then move on and try to start the informers of `namespacedManagementKubeInformerFactory` again in (2), the stopped informers are not revived because of the following implementation of the informer factory: https://github.com/kubernetes/kubernetes/blob/657776e52ba527d8ff04187be4e8743e62909c07/staging/src/k8s.io/client-go/informers/factory.go#L141-L151
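To make the failure mode concrete, here is a minimal, stdlib-only sketch of the behaviour described above. It is not the client-go code: `toyFactory`, `toyInformer`, and `demo` are hypothetical names, and `Start` returns the number of informers it launched purely for illustration (the real `Start` returns nothing). The only part it tries to mirror is the bookkeeping in the linked `factory.go`, where an informer that has already been marked as started is skipped on later `Start` calls.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// toyInformer is a hypothetical stand-in for a shared informer.
// It "runs" until its stop channel is closed, then marks itself done.
type toyInformer struct {
	done chan struct{} // closed when run returns
}

func (i *toyInformer) run(stopCh <-chan struct{}) {
	<-stopCh
	close(i.done)
}

// stopped reports whether run has already returned.
func (i *toyInformer) stopped() bool {
	select {
	case <-i.done:
		return true
	default:
		return false
	}
}

// toyFactory models only the "started" bookkeeping of an informer factory.
type toyFactory struct {
	lock             sync.Mutex
	informers        map[string]*toyInformer
	startedInformers map[string]bool
}

func newToyFactory(names ...string) *toyFactory {
	f := &toyFactory{
		informers:        map[string]*toyInformer{},
		startedInformers: map[string]bool{},
	}
	for _, n := range names {
		f.informers[n] = &toyInformer{done: make(chan struct{})}
	}
	return f
}

// Start launches every informer that has not been started before and
// returns how many it launched (the return value is an illustration aid).
func (f *toyFactory) Start(stopCh <-chan struct{}) int {
	f.lock.Lock()
	defer f.lock.Unlock()
	launched := 0
	for name, inf := range f.informers {
		if !f.startedInformers[name] {
			go inf.run(stopCh)
			f.startedInformers[name] = true
			launched++
		}
	}
	return launched
}

// demo replays the sequence from the description: start with a bootstrap
// context, cancel it, then start again with a long-lived context.
func demo() (first, second int, stopped bool) {
	f := newToyFactory("secrets")

	bootstrapCtx, cancel := context.WithCancel(context.Background())
	first = f.Start(bootstrapCtx.Done()) // (1) start with bootstrapCtx
	cancel()                             // bootstrap finished
	<-f.informers["secrets"].done        // wait until the informer has stopped

	ctx := context.Background()
	second = f.Start(ctx.Done()) // (2) start again: nothing is launched
	return first, second, f.informers["secrets"].stopped()
}

func main() {
	first, second, stopped := demo()
	fmt.Println("launched by first Start:", first)
	fmt.Println("launched by second Start:", second)
	fmt.Println("informer stopped:", stopped)
}
```

In this model the second `Start` launches nothing because the informer is still recorded as started, even though its goroutine has already exited.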
As a result, the spoke controllers relying on `namespacedManagementKubeInformerFactory` will not be notified of any watch events because of the zombie informers. For example, the hub-cert rotation controller cannot observe any write events on the secrets it watches because the secret informer is down: https://github.com/open-cluster-management-io/registration/blob/main/pkg/clientcert/cert_controller.go#L199