Open bygui86 opened 2 years ago
Hey @bygui86,
This log comes from the Kubernetes client-go package we use. As you can see here, this is reported when listing the resources takes too long time (more than 10s). Interestingly, it looks like in your case it never goes to the next step after "Objects listed", even if the ListAndWatch
should list all items and then establish a single connection to watch the resources (after a successful list - here).
I couldn't reproduce this behavior on a few long-running Botkube 0.15 installations (k3d, colima and Oracle Cloud clusters). I'd suspect some issues with your cluster (K8s API server specifically), or some inability to set up watch connection from the Botkube pod. Do you have proper RBAC configured for Botkube pod like in the default configuration?
And yes, we can suppress such logs (like here: https://github.com/werf/werf/pull/1754/files), but I'm not sure if this is a good idea 🤔 This might sometimes help with debugging some issues. But one thing we can do is to link these traces with info
log level only.
But here, in this case, I think there's an error which results in constantly listing all resources. Do the Botkube notifications work for you as they should?
@pkosiec thanks for your quick answer.
I couldn't reproduce this behavior on a few long-running Botkube 0.15 installations (k3d, colima and Oracle Cloud clusters).
I run Botkube in 2 GKE cluster, one public, one private.
I'd suspect some issues with your cluster (K8s API server specifically), or some inability to set up watch connection from the Botkube pod. Do you have proper RBAC configured for Botkube pod like in the default configuration?
I cannot give Botkube the default amount of permissions, it's too high. Especially about Secrets. I configured RBAC and tested it:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: botkube
rules:
# K8s
- apiGroups:
- ""
resources:
- events
- configmaps
- endpoints
- limitranges
- namespaces
- nodes
- persistentvolumeclaims
- persistentvolumes
- pods
- resourcequotas
- serviceaccounts
- services
verbs:
- get
- watch
- list
- apiGroups:
- apps
resources:
- deployments
- statefulsets
- daemonsets
verbs:
- get
- watch
- list
- apiGroups:
- batch
resources:
- cronjobs
- jobs
verbs:
- get
- watch
- list
- apiGroups:
- policy
resources:
- poddisruptionbudgets
- podsecuritypolicies
verbs:
- get
- watch
- list
- apiGroups:
- apiextensions.k8s.io
resources:
- customresourcedefinitions
verbs:
- get
- watch
- list
- apiGroups:
- autoscaling
resources:
- horizontalpodautoscalers
verbs:
- get
- watch
- list
- apiGroups:
- rbac.authorization.k8s.io
resources:
- clusterrolebindings
- clusterroles
- rolebindings
- roles
verbs:
- get
- watch
- list
- apiGroups:
- admissionregistration.k8s.io
resources:
- mutatingwebhookconfigurations
- validatingwebhookconfigurations
verbs:
- get
- watch
- list
- apiGroups:
- networking.k8s.io
resources:
- networkpolicies
- ingresses
verbs:
- get
- watch
- list
- apiGroups:
- scheduling.k8s.io
resources:
- priorityclasses
verbs:
- get
- watch
- list
- apiGroups:
- storage.k8s.io
resources:
- storageclasses
verbs:
- get
- watch
- list
- apiGroups:
- snapshot.storage.k8s.io
resources:
- volumesnapshotclasses
- volumesnapshotcontents
- volumesnapshot
verbs:
- get
- watch
- list
# CRDs
- apiGroups:
- cloud.google.com
- internal.autoscaling.gke.io
- networking.gke.io
- hub.gke.io
- nodemanagement.gke.io
- crd.projectcalico.org
- rbacmanager.reactiveops.io
- cassandraoperator.instaclustr.com
- kafka.strimzi.io
- monitoring.coreos.com
- logging.banzaicloud.io
- logging-extensions.banzaicloud.io
- traefik.containo.us
resources:
- "*"
verbs:
- get
- watch
- list
And yes, we can suppress such logs (like here: https://github.com/werf/werf/pull/1754/files), but I'm not sure if this is a good idea 🤔 This might sometimes help with debugging some issues. But one thing we can do is to link these traces with info log level only.
I see your point but this should be a "debug" level log, not info. I suggest not to suppress it, but to make level lower so we could enable it in case of deeper debugging required.
But here, in this case, I think there's an error which results in constantly listing all resources. Do the Botkube notifications work for you as they should?
Notifications seem to work fine, but we have a huge configuration so I cannot distinguish if we are getting the same amount of notifications as before. I noticed that such logs appears 4/5 time after Botkube startup and then never more for hours.
I noticed that such logs appears 4/5 time after Botkube startup and then never more for hours.
Oh, okay, so that means it's not a big issue. Alright, I think we can reconfigure klog based on the log configuration 👍 The only hesitation from my side is that the traces you see are actually in the klog
INFO level, so I'm not sure if they should be moved to debug 🤔 @mszostok WDYT?
@pkosiec for now not a big issue, but I don't know long-term. Anyway I configured Botkube to log in JSON format, so it would be really nice to have such logs converted in JSON as well.
Description
In Botkube logs I see multiple times following trace error:
What is it related to? How can I fix them?
It seems there is no impact on Botkube functionalities, but it's raising attention of our alerts...
Expected behavior
Do not see any trace error log.
Steps to reproduce
Additional information
0.15.0
1.24.4-gke.800