redhat-cop / resource-locker-operator

Could not wait for Cache to sync #37

Closed BostjanBozic closed 3 years ago

BostjanBozic commented 3 years ago

Hello,

as of operator v0.1.5, there seem to be some issues with cache sync. I have a ResourceLocker defined for patching node labels as follows:

apiVersion: redhatcop.redhat.io/v1alpha1
kind: ResourceLocker
metadata:
  name: node-label-patch
spec:
  patches:
  - id: master-0-label
    patchTemplate: |
      metadata:
        labels:
          datacenter: dc1
          nodetype: master
    patchType: application/strategic-merge-patch+json
    targetObjectRef:
      apiVersion: v1
      kind: Node
      name: master-0.cluster.com

Since the operator upgrade, the operator has been failing to sync its cache. Below is a sample error output:

{"level":"error","ts":1604488631.757469,"logger":"controller-runtime.controller","msg":"Could not wait for Cache to sync","controller":"controller_patchlocker_master-0-label","error":"failed to wait for controller_patchlocker_master-0-label caches to sync","stacktrace":"github.com/go-logr/zapr.(*zapLogger).Error\n\t/home/travis/gopath/pkg/mod/github.com/go-logr/zapr@v0.1.1/zapr.go:128\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1\n\t/home/travis/gopath/pkg/mod/sigs.k8s.io/controller-runtime@v0.6.0/pkg/internal/controller/controller.go:181\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start\n\t/home/travis/gopath/pkg/mod/sigs.k8s.io/controller-runtime@v0.6.0/pkg/internal/controller/controller.go:198\nsigs.k8s.io/controller-runtime/pkg/manager.(*controllerManager).startLeaderElectionRunnables.func1\n\t/home/travis/gopath/pkg/mod/sigs.k8s.io/controller-runtime@v0.6.0/pkg/manager/internal.go:514"}

I am also seeing workers constantly starting and stopping in the logs:

{"level":"info","ts":1604488631.755542,"logger":"controller-runtime.controller","msg":"Starting workers","controller":"controller_patchlocker_master-0-label","worker count":1}
{"level":"info","ts":1604488631.7555563,"logger":"controller-runtime.controller","msg":"Stopping workers","controller":"controller_patchlocker_master-0-label"}

Any idea what could be the reason behind it? As mentioned, this started occurring after the v0.1.5 upgrade.

On a side note, patching still works; it just seems that some status sync errors are happening.