banzaicloud / koperator

Oh no! Yet another Apache Kafka operator for Kubernetes
Apache License 2.0

Operator fails to start when CertificateSigningRequests read access is not allowed #666

Closed: amuraru closed this issue 3 years ago

amuraru commented 3 years ago

Is your feature request related to a problem? Please describe.

In our environment it is not possible to grant the operator read access to the cluster-scoped CertificateSigningRequests resources. When this access is not available, the operator as a whole fails to start with these errors:

I0910 11:34:15.606806       1 leaderelection.go:258] successfully acquired lease ns-team-aep-pipeline-mgmt-cicd/controller-leader-election-helper
{"level":"info","ts":"2021-09-10T11:34:15.606Z","logger":"controller-runtime.manager.controller.CruiseControl","msg":"Starting EventSource","reconciler group":"kafka.banzaicloud.io","reconciler kind":"KafkaCluster","source":"kind source: /, Kind="}
{"level":"info","ts":"2021-09-10T11:34:15.606Z","logger":"controller-runtime.manager.controller.CruiseControl","msg":"Starting Controller","reconciler group":"kafka.banzaicloud.io","reconciler kind":"KafkaCluster"}
{"level":"info","ts":"2021-09-10T11:34:15.606Z","logger":"controller-runtime.manager.controller.KafkaCluster","msg":"Starting EventSource","reconciler group":"kafka.banzaicloud.io","reconciler kind":"KafkaCluster","source":"kind source: /, Kind="}
{"level":"info","ts":"2021-09-10T11:34:15.607Z","logger":"controller-runtime.manager.controller.KafkaCluster","msg":"Starting EventSource","reconciler group":"kafka.banzaicloud.io","reconciler kind":"KafkaCluster","source":"kind source: /, Kind="}
{"level":"info","ts":"2021-09-10T11:34:15.607Z","logger":"controller-runtime.manager.controller.KafkaCluster","msg":"Starting EventSource","reconciler group":"kafka.banzaicloud.io","reconciler kind":"KafkaCluster","source":"kind source: /, Kind="}
{"level":"info","ts":"2021-09-10T11:34:15.606Z","logger":"controller-runtime.manager.controller.kafkatopic","msg":"Starting EventSource","source":"kind source: /, Kind="}
{"level":"info","ts":"2021-09-10T11:34:15.607Z","logger":"controller-runtime.manager.controller.KafkaCluster","msg":"Starting EventSource","reconciler group":"kafka.banzaicloud.io","reconciler kind":"KafkaCluster","source":"kind source: /, Kind="}
{"level":"info","ts":"2021-09-10T11:34:15.607Z","logger":"controller-runtime.manager.controller.kafkatopic","msg":"Starting Controller"}
{"level":"info","ts":"2021-09-10T11:34:15.607Z","logger":"controller-runtime.manager.controller.KafkaCluster","msg":"Starting EventSource","reconciler group":"kafka.banzaicloud.io","reconciler kind":"KafkaCluster","source":"kind source: /, Kind="}
{"level":"info","ts":"2021-09-10T11:34:15.607Z","logger":"controller-runtime.manager.controller.KafkaCluster","msg":"Starting EventSource","reconciler group":"kafka.banzaicloud.io","reconciler kind":"KafkaCluster","source":"kind source: /, Kind="}
{"level":"info","ts":"2021-09-10T11:34:15.607Z","logger":"controller-runtime.manager.controller.KafkaCluster","msg":"Starting EventSource","reconciler group":"kafka.banzaicloud.io","reconciler kind":"KafkaCluster","source":"kind source: /, Kind="}
{"level":"info","ts":"2021-09-10T11:34:15.607Z","logger":"controller-runtime.manager.controller.KafkaCluster","msg":"Starting EventSource","reconciler group":"kafka.banzaicloud.io","reconciler kind":"KafkaCluster","source":"kind source: /, Kind="}
{"level":"info","ts":"2021-09-10T11:34:15.607Z","logger":"controller-runtime.manager.controller.KafkaCluster","msg":"Starting EventSource","reconciler group":"kafka.banzaicloud.io","reconciler kind":"KafkaCluster","source":"kind source: /, Kind="}
{"level":"info","ts":"2021-09-10T11:34:15.607Z","logger":"controller-runtime.manager.controller.KafkaCluster","msg":"Starting EventSource","reconciler group":"kafka.banzaicloud.io","reconciler kind":"KafkaCluster","source":"kind source: /, Kind="}
{"level":"info","ts":"2021-09-10T11:34:15.607Z","logger":"controller-runtime.manager.controller.KafkaCluster","msg":"Starting EventSource","reconciler group":"kafka.banzaicloud.io","reconciler kind":"KafkaCluster","source":"kind source: /, Kind="}
{"level":"info","ts":"2021-09-10T11:34:15.607Z","logger":"controller-runtime.manager.controller.KafkaCluster","msg":"Starting EventSource","reconciler group":"kafka.banzaicloud.io","reconciler kind":"KafkaCluster","source":"kind source: /, Kind="}
{"level":"info","ts":"2021-09-10T11:34:15.607Z","logger":"controller-runtime.manager.controller.KafkaUser","msg":"Starting EventSource","reconciler group":"kafka.banzaicloud.io","reconciler kind":"KafkaUser","source":"kind source: /, Kind="}
{"level":"info","ts":"2021-09-10T11:34:15.607Z","logger":"controller-runtime.manager.controller.KafkaCluster","msg":"Starting Controller","reconciler group":"kafka.banzaicloud.io","reconciler kind":"KafkaCluster"}
{"level":"info","ts":"2021-09-10T11:34:15.607Z","logger":"controller-runtime.manager.controller.KafkaUser","msg":"Starting EventSource","reconciler group":"kafka.banzaicloud.io","reconciler kind":"KafkaUser","source":"kind source: /, Kind="}
{"level":"info","ts":"2021-09-10T11:34:15.607Z","logger":"controller-runtime.manager.controller.KafkaUser","msg":"Starting Controller","reconciler group":"kafka.banzaicloud.io","reconciler kind":"KafkaUser"}
E0910 11:34:15.609484       1 reflector.go:138] pkg/mod/k8s.io/client-go@v0.22.1/tools/cache/reflector.go:167: Failed to watch *v1.CertificateSigningRequest: failed to list *v1.CertificateSigningRequest: certificatesigningrequests.certificates.k8s.io is forbidden: User "system:serviceaccount:ns-team-aep-pipeline-mgmt-cicd:kafka-operator" cannot list resource "certificatesigningrequests" in API group "certificates.k8s.io" at the cluster scope
E0910 11:34:16.850764       1 reflector.go:138] pkg/mod/k8s.io/client-go@v0.22.1/tools/cache/reflector.go:167: Failed to watch *v1.CertificateSigningRequest: failed to list *v1.CertificateSigningRequest: certificatesigningrequests.certificates.k8s.io is forbidden: User "system:serviceaccount:ns-team-aep-pipeline-mgmt-cicd:kafka-operator" cannot list resource "certificatesigningrequests" in API group "certificates.k8s.io" at the cluster scope
E0910 11:34:18.716700       1 reflector.go:138] pkg/mod/k8s.io/client-go@v0.22.1/tools/cache/reflector.go:167: Failed to watch *v1.CertificateSigningRequest: failed to list *v1.CertificateSigningRequest: certificatesigningrequests.certificates.k8s.io is forbidden: User "system:serviceaccount:ns-team-aep-pipeline-mgmt-cicd:kafka-operator" cannot list resource "certificatesigningrequests" in API group "certificates.k8s.io" at the cluster scope
E0910 11:34:24.144171       1 reflector.go:138] pkg/mod/k8s.io/client-go@v0.22.1/tools/cache/reflector.go:167: Failed to watch *v1.CertificateSigningRequest: failed to list *v1.CertificateSigningRequest: certificatesigningrequests.certificates.k8s.io is forbidden: User "system:serviceaccount:ns-team-aep-pipeline-mgmt-cicd:kafka-operator" cannot list resource "certificatesigningrequests" in API group "certificates.k8s.io" at the cluster scope
E0910 11:34:33.785727       1 reflector.go:138] pkg/mod/k8s.io/client-go@v0.22.1/tools/cache/reflector.go:167: Failed to watch *v1.CertificateSigningRequest: failed to list *v1.CertificateSigningRequest: certificatesigningrequests.certificates.k8s.io is forbidden: User "system:serviceaccount:ns-team-aep-pipeline-mgmt-cicd:kafka-operator" cannot list resource "certificatesigningrequests" in API group "certificates.k8s.io" at the cluster scope
E0910 11:34:48.592671       1 reflector.go:138] pkg/mod/k8s.io/client-go@v0.22.1/tools/cache/reflector.go:167: Failed to watch *v1.CertificateSigningRequest: failed to list *v1.CertificateSigningRequest: certificatesigningrequests.certificates.k8s.io is forbidden: User "system:serviceaccount:ns-team-aep-pipeline-mgmt-cicd:kafka-operator" cannot list resource "certificatesigningrequests" in API group "certificates.k8s.io" at the cluster scope
E0910 11:35:38.458333       1 reflector.go:138] pkg/mod/k8s.io/client-go@v0.22.1/tools/cache/reflector.go:167: Failed to watch *v1.CertificateSigningRequest: failed to list *v1.CertificateSigningRequest: certificatesigningrequests.certificates.k8s.io is forbidden: User "system:serviceaccount:ns-team-aep-pipeline-mgmt-cicd:kafka-operator" cannot list resource "certificatesigningrequests" in API group "certificates.k8s.io" at the cluster scope
{"level":"error","ts":"2021-09-10T11:36:15.607Z","logger":"controller-runtime.manager.controller.CruiseControl","msg":"Could not wait for Cache to sync","reconciler group":"kafka.banzaicloud.io","reconciler kind":"KafkaCluster","error":"failed to wait for CruiseControl caches to sync: timed out waiting for cache to be synced","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.9.6/pkg/internal/controller/controller.go:195\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.9.6/pkg/internal/controller/controller.go:221\nsigs.k8s.io/controller-runtime/pkg/manager.(*controllerManager).startRunnable.func1\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.9.6/pkg/manager/internal.go:696"}
{"level":"error","ts":"2021-09-10T11:36:15.607Z","logger":"controller-runtime.manager.controller.KafkaUser","msg":"Could not wait for Cache to sync","reconciler group":"kafka.banzaicloud.io","reconciler kind":"KafkaUser","error":"failed to wait for KafkaUser caches to sync: timed out waiting for cache to be synced","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.9.6/pkg/internal/controller/controller.go:195\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.9.6/pkg/internal/controller/controller.go:221\nsigs.k8s.io/controller-runtime/pkg/manager.(*controllerManager).startRunnable.func1\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.9.6/pkg/manager/internal.go:696"}
{"level":"error","ts":"2021-09-10T11:36:15.607Z","logger":"controller-runtime.manager.controller.KafkaCluster","msg":"Could not wait for Cache to sync","reconciler group":"kafka.banzaicloud.io","reconciler kind":"KafkaCluster","error":"failed to wait for KafkaCluster caches to sync: timed out waiting for cache to be synced","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.9.6/pkg/internal/controller/controller.go:195\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.9.6/pkg/internal/controller/controller.go:221\nsigs.k8s.io/controller-runtime/pkg/manager.(*controllerManager).startRunnable.func1\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.9.6/pkg/manager/internal.go:696"}
{"level":"error","ts":"2021-09-10T11:36:15.607Z","logger":"controller-runtime.manager.controller.kafkatopic","msg":"Could not wait for Cache to sync","error":"failed to wait for kafkatopic caches to sync: timed out waiting for cache to be synced","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.9.6/pkg/internal/controller/controller.go:195\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.9.6/pkg/internal/controller/controller.go:221\nsigs.k8s.io/controller-runtime/pkg/manager.(*controllerManager).startRunnable.func1\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.9.6/pkg/manager/internal.go:696"}
{"level":"error","ts":"2021-09-10T11:36:15.608Z","logger":"controller-runtime.manager","msg":"error received after stop sequence was engaged","error":"failed to wait for KafkaUser caches to sync: timed out waiting for cache to be synced"}
{"level":"error","ts":"2021-09-10T11:36:15.608Z","logger":"controller-runtime.manager","msg":"error received after stop sequence was engaged","error":"failed to wait for KafkaCluster caches to sync: timed out waiting for cache to be synced"}
{"level":"error","ts":"2021-09-10T11:36:15.608Z","logger":"controller-runtime.manager","msg":"error received after stop sequence was engaged","error":"failed to wait for kafkatopic caches to sync: timed out waiting for cache to be synced"}

Describe the solution you'd like to see

When the CSR RBAC permissions are missing, the operator should continue working without support for KafkaUser native CSR.
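To make the request concrete, here is a minimal sketch of the startup decision, not the operator's actual code. It assumes the operator can ask up front whether its service account may `get`, `list` and `watch` `certificatesigningrequests` (in a real cluster that question could be answered with a SelfSubjectAccessReview). The `accessChecker` type and the `csrSupportEnabled` and `allowVerbs` helpers are hypothetical names used only for illustration, so the example runs without a cluster.

```go
package main

import "fmt"

// accessChecker reports whether the operator's service account may perform
// verb on resource in the given API group. Hypothetical type: in a real
// operator it could be backed by a SelfSubjectAccessReview; here it is a
// plain function so the sketch runs without a cluster.
type accessChecker func(group, resource, verb string) (bool, error)

// csrSupportEnabled returns true only if every verb needed to cache and
// watch CertificateSigningRequests is allowed. A denial or an error turns
// the feature off instead of aborting startup.
func csrSupportEnabled(check accessChecker) bool {
	for _, verb := range []string{"get", "list", "watch"} {
		allowed, err := check("certificates.k8s.io", "certificatesigningrequests", verb)
		if err != nil || !allowed {
			return false
		}
	}
	return true
}

// allowVerbs builds a hypothetical checker that permits only the listed verbs.
func allowVerbs(verbs ...string) accessChecker {
	set := make(map[string]bool, len(verbs))
	for _, v := range verbs {
		set[v] = true
	}
	return func(_, _, verb string) (bool, error) {
		return set[verb], nil
	}
}

func main() {
	// Full read access: CSR support stays on.
	fmt.Println(csrSupportEnabled(allowVerbs("get", "list", "watch"))) // prints "true"
	// No access, as in the multitenant case: CSR support is skipped.
	fmt.Println(csrSupportEnabled(allowVerbs())) // prints "false"
}
```

With a check like this, the CSR watch would only be registered when `csrSupportEnabled` returns true, so the other controllers' caches would not be blocked by a forbidden list call.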

Describe alternatives you've considered

N/A

Additional context

N/A

adamantal commented 3 years ago

I think it's a dupe of #651 (fixed in #657). If you use custom service accounts, you may have to add permissions for the certificatesigningrequests resource.

amuraru commented 3 years ago

@adamantal I am aware of that fix, but our use case is a bit different: we cannot get the permission because we operate in a multitenant k8s environment, so we would need this support to be optional so that we can disable it.
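As a rough illustration of what "optional" could look like, the sketch below gates the CSR watch behind a startup flag. Everything named here is an assumption: the `--enable-csr-support` flag, the `options` struct, and the idea that the CSR watch can be listed separately from the controllers seen in the log above. It is not an existing koperator option.

```go
package main

import (
	"flag"
	"fmt"
)

// options holds startup settings. The field and the flag name below are
// hypothetical and exist only for this sketch.
type options struct {
	enableCSR bool
}

// parseOptions reads the hypothetical flag; CSR support defaults to on so
// existing deployments would keep their current behaviour.
func parseOptions(args []string) (options, error) {
	var o options
	fs := flag.NewFlagSet("operator", flag.ContinueOnError)
	fs.BoolVar(&o.enableCSR, "enable-csr-support", true,
		"watch CertificateSigningRequests for KafkaUser certificates")
	err := fs.Parse(args)
	return o, err
}

// controllersToStart returns the controllers named in the log above and adds
// the CSR watch only when it is enabled (assumed to be separable).
func controllersToStart(o options) []string {
	names := []string{"KafkaCluster", "kafkatopic", "KafkaUser", "CruiseControl"}
	if o.enableCSR {
		names = append(names, "CertificateSigningRequest")
	}
	return names
}

func main() {
	o, err := parseOptions([]string{"--enable-csr-support=false"})
	if err != nil {
		panic(err)
	}
	// With the flag off, no cluster-scoped CSR list/watch is attempted.
	fmt.Println(controllersToStart(o))
}
```

An explicit switch like this would suit tenants who know in advance that the cluster-scoped permission will never be granted.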