VictoriaMetrics / operator

Kubernetes operator for Victoria Metrics
Apache License 2.0

One cluster role binding for multiple VMagent service accounts #1160

Open QuentinBtd opened 1 day ago

QuentinBtd commented 1 day ago

Hello,

I noticed something unusual, and here’s the situation…

I'm using VM Operator on my cluster, deployed within the monitoring namespace. An initial VM stack is also deployed in this same namespace, using the Helm chart victoria-metrics-k8s-stack.

For a different project, we need to deploy a second VM stack, this time in a separate namespace, which we'll call external-monitoring. This stack is also deployed using the Helm chart victoria-metrics-k8s-stack.

When the first stack is deployed, a service account for vmagent is created in the monitoring namespace, named vmagent-victoria-metrics. Additionally, a cluster role named monitoring:vmagent-cluster-access-victoria-metrics is created.

A cluster role binding, monitoring:vmagent-cluster-access-victoria-metrics, links the cluster role to the service account.

When the second stack is deployed in the external-monitoring namespace, a service account for vmagent is created in the external-monitoring namespace, also named vmagent-victoria-metrics.

However, no new cluster role is created in this case: there is no external-monitoring:vmagent-cluster-access-victoria-metrics. Only monitoring:vmagent-cluster-access-victoria-metrics exists.

Here’s where it gets strange: the cluster role binding monitoring:vmagent-cluster-access-victoria-metrics is constantly updated. At one moment, it links the vmagent-victoria-metrics service account in the monitoring namespace to the monitoring:vmagent-cluster-access-victoria-metrics cluster role; the next moment, it links the service account from the external-monitoring namespace to this same cluster role. This switches back and forth continuously.
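A rough sketch of the binding described above, reconstructed from this description rather than dumped from the cluster (the real object likely carries additional labels and metadata). The only field that appears to change between updates is the `namespace` of the single subject:

```yaml
# Sketch only: field values are taken from the description in this issue.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: "monitoring:vmagent-cluster-access-victoria-metrics"
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: "monitoring:vmagent-cluster-access-victoria-metrics"
subjects:
  - kind: ServiceAccount
    name: vmagent-victoria-metrics
    # Alternates between "monitoring" and "external-monitoring"
    namespace: monitoring
```

Because the binding holds a single subject, whichever namespace is written last wins, and the other service account is unbound until the next update.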

I tried specifying a serviceAccountName for my vmagent in the external-monitoring namespace, but that didn’t change anything.

I also tried to create the second stack without using the helm chart. Same problem.

This is problematic because the first stack is used for cluster monitoring. As a result, the service account temporarily loses its permissions because it’s no longer bound to the cluster role, leading to missing metrics.

f41gh7 commented 1 day ago

Hello, thanks for reporting this.

It could be related to https://github.com/VictoriaMetrics/operator/issues/891

As a workaround, set serviceAccountName: external-monitoring for the second VMAgent and grant that service account the permissions it needs. I'd also recommend updating the operator to the latest version.
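A minimal sketch of what this could look like, under a few assumptions: the VMAgent resource is named `victoria-metrics` (inferred from the generated service account name `vmagent-victoria-metrics`), the `remoteWrite` URL is a placeholder, and the service account is created by hand. The permissions themselves are not shown here and still have to be granted separately.

```yaml
# Dedicated service account for the second stack (name from the comment above).
apiVersion: v1
kind: ServiceAccount
metadata:
  name: external-monitoring
  namespace: external-monitoring
---
apiVersion: operator.victoriametrics.com/v1beta1
kind: VMAgent
metadata:
  name: victoria-metrics          # assumed resource name
  namespace: external-monitoring
spec:
  # Use the dedicated service account instead of the generated one.
  serviceAccountName: external-monitoring
  remoteWrite:
    # Hypothetical endpoint; replace with the real remote write URL.
    - url: "http://vminsert.example:8480/insert/0/prometheus/api/v1/write"
```

Whether the operator stops managing the shared binding once a custom service account is set may depend on the operator version, so it is worth checking the binding again after applying this.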

QuentinBtd commented 1 day ago

Hello,

Yes, it's the same issue. I hadn't seen this issue in my search, apologies for that.

For now, I've created two ClusterRoleBindings that link each service account from my two namespaces to the monitoring:vmagent-cluster-access-victoria-metrics cluster role. This seems to work as a temporary solution until there is an update to the operator.
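The two bindings might look roughly like this. The binding names are hypothetical (chosen so they do not collide with the operator-managed binding); the cluster role and service account names are the ones mentioned earlier in this issue.

```yaml
# Binding for the first stack (cluster monitoring).
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: vmagent-cluster-access-monitoring            # hypothetical name
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: "monitoring:vmagent-cluster-access-victoria-metrics"
subjects:
  - kind: ServiceAccount
    name: vmagent-victoria-metrics
    namespace: monitoring
---
# Binding for the second stack.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: vmagent-cluster-access-external-monitoring   # hypothetical name
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: "monitoring:vmagent-cluster-access-victoria-metrics"
subjects:
  - kind: ServiceAccount
    name: vmagent-victoria-metrics
    namespace: external-monitoring
```

Since these bindings are separate objects, both service accounts stay bound to the cluster role regardless of which subject the operator-managed binding currently holds.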