VictoriaMetrics / operator

Kubernetes operator for Victoria Metrics

VmAgents Shards Autoscaling Issues #924

Open · togikiran opened this issue 2 months ago

togikiran commented 2 months ago

@f41gh7 Why are the vmagent pods getting rotated whenever we increase the replica count? Command used: kubectl scale vmagent-shard-ha --replicas=3

How should we configure HPA to scale based on CPU/memory utilisation?

[Screenshot 2024-04-18 at 11:33:13 AM]

Why is the above HPA scaling metric showing <unknown>?

```
NAME               REFERENCE                  TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
vmagent-shard-ha   VMAgent/vmagent-shard-ha   <unknown>/80%   3         3         3          14m
```

Issue: missing label selector in status:

```
kubectl get --raw /apis/operator.victoriametrics.com/v1beta1/namespaces/<namespace>/vmagents/monitoring-vmagent-ha/scale
{"kind":"Scale","apiVersion":"autoscaling/v1","metadata":{"name":"monitoring-vmagent-ha","namespace":"<namespace>","uid":"d80e7371-cd49-4caa-8765-3a78220f9543","resourceVersion":"351547683","creationTimestamp":"2024-04-17T12:26:29Z"},"spec":{"replicas":3},"status":{"replicas":3}}
```

vm-operator version: v0.30.0, vmagent: v1.90.0

Haleygo commented 2 months ago

Hi @togikiran ,

Why are the vmagent pods getting rotated whenever we increase the replica count? Command used: kubectl scale vmagent-shard-ha --replicas=3

What do you mean by "pods are getting rotated"? Could you share the pods' status? If you scale vmagent using kubectl scale vmagent/vmagent-shard-ha --replicas=3 from replicas=1, the expected behavior is that new pods are created and the old pod stays.

Why is the above HPA scaling metric showing <unknown>?

It's unknown because the metrics server doesn't know vmagent's CPU utilization right now; you need to create the metric and report it to the metrics server.

Issue: missing label selector in status

That's a bug, since we don't propagate labels into vmagent.status right now. We can add them, but I'm not sure how useful that is. In this HPA case, for example, the HPA doesn't need to know the pod label selector; it only needs to scale vmagent.shardCount, and the operator will scale the pods.
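For illustration only, a minimal autoscaling/v2 HPA targeting the VMAgent scale subresource could look like the sketch below; the names come from this issue and the CPU metric mirrors the reporter's setup, so treat it as an example rather than a recommended config:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: vmagent-shard-ha
spec:
  # The HPA only talks to the scale subresource of the custom resource;
  # the operator then reconciles the actual vmagent pods.
  scaleTargetRef:
    apiVersion: operator.victoriametrics.com/v1beta1
    kind: VMAgent
    name: vmagent-shard-ha
  minReplicas: 1
  maxReplicas: 3
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 80
  # Note: a Resource metric like this relies on metrics-server and on the
  # scale subresource exposing a label selector in its status; without the
  # selector the target stays <unknown>, as seen in the report above.
```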

togikiran commented 2 months ago

Hey @Haleygo

When I increased the replicas from 2 to 3, the older pods got rolled out and new pods came up. Sharing the screenshots:

[Screenshot 2024-04-26 at 1:34:17 PM] [Screenshot 2024-04-26 at 1:36:47 PM]

It's unknown because the metrics server doesn't know vmagent's CPU utilization right now; you need to create the metric and report it to the metrics server.

The default metrics-server knows the pod metrics, right? Do you mean we need to add them for the vmagent custom resource as well? If yes, can you help with the approach?

Can you please share a sample HPA YAML file (k8s) for autoscaling vmagent shards based on CPU?

Thanks

f41gh7 commented 2 months ago

Hello,

Due to the current sharding implementation of vmagent, the sharding flags of all vmagent pods must be changed together. That requires restarting all pods with the new flag value.

togikiran commented 2 months ago

Hey @f41gh7, is there a way to skip or bypass the pod restarts? This will cause a restart and impact on every scale-up. Is there any mitigation in place for this issue?

Thanks

Haleygo commented 2 months ago

Is there a way to skip or bypass the pod restarts? This will cause a restart and impact on every scale-up

No, it won't work if the pods don't restart with the new -promscrape.cluster.membersCount arg. Imagine you have a vmagent with shardCount: 1 and set the HPA scale threshold to cpu > 80%. At first it scrapes 100 targets with -promscrape.cluster.membersCount=1 -promscrape.cluster.memberNum=0. Then the number of targets bumps to 200 and CPU exceeds 80%, so the HPA increases vmagent.shardCount to 2, which means there will be two vmagent instances sharding the 200 targets; each instance still scrapes 100 targets with -promscrape.cluster.membersCount=2 and -promscrape.cluster.memberNum=0 or 1. In this way, CPU usage will go down. If we don't change -promscrape.cluster.membersCount for each instance, both of them will scrape all 200 targets, CPU usage won't go down, and there is no point in having the HPA.
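To make the flag change concrete, this is roughly what the sharding-related args look like per pod once shardCount grows to 2 (an illustrative sketch, not the operator's exact rendered spec):

```yaml
# Every pod carries the total member count, so any change to shardCount
# changes the args of every pod and therefore forces a restart.
shard-0:
  args:
    - -promscrape.cluster.membersCount=2
    - -promscrape.cluster.memberNum=0
shard-1:
  args:
    - -promscrape.cluster.membersCount=2
    - -promscrape.cluster.memberNum=1
```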

The default metrics-server knows the pod metrics, right? Do you mean we need to add them for the vmagent custom resource as well? If yes, can you help with the approach?

I'd recommend using KEDA here. It can use Prometheus as a direct trigger, like this:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: test
spec:
  scaleTargetRef:
    apiVersion: operator.victoriametrics.com/v1beta1
    kind: VMAgent
    name: vmagent-test
  minReplicaCount: 2
  maxReplicaCount: 3
  triggers:
  - type: prometheus
    metadata:
      serverAddress: http://vmselect-address
      metricName: vmagent-cpu-usage
      threshold: '80'
      query: container_cpu_usage{you-container-here}
```
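With a ScaledObject like this, KEDA creates and manages the underlying HPA itself and scales the VMAgent through the scale subresource discussed above, so no separate HPA manifest is needed; the server address, metric name, and query above are placeholders to adapt to your cluster.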
Haleygo commented 2 months ago

Issue: missing label selector in status

JFYI, the status label selector should be fixed by https://github.com/VictoriaMetrics/operator/commit/a6e3ad72ce6bb78bea5b45f9e7513f5ab3b996d0.

togikiran commented 1 month ago

@Haleygo We observed metric loss during vmagent pod scale-up, i.e. all pods are getting recreated after an increase in the replica count. This is impacting production clusters. Is there any workaround for this?

togikiran commented 3 weeks ago

Observing metric loss while scaling vmagent. We added an HPA on CPU and memory metrics; pods are getting rolled out and we observe metric loss. Production clusters are impacted. Operator: v0.44.0, vmagent: v1.90.0 @f41gh7

Haleygo commented 3 weeks ago

@Haleygo We observed metric loss during vmagent pod scale-up, i.e. all pods are getting recreated after an increase in the replica count. This is impacting production clusters. Is there any workaround for this?

Hmm, I'm afraid that's expected with the current implementation. A workaround would be to also set vmagentSpec.replicaCount=2 and enable deduplication in the vmcluster.
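A rough sketch of that workaround (all names, URLs, and the dedup interval below are illustrative and need to be adapted to the actual setup):

```yaml
# Two replicas per shard: while one replica restarts during a shardCount
# change, the other keeps scraping, so samples are not lost.
apiVersion: operator.victoriametrics.com/v1beta1
kind: VMAgent
metadata:
  name: vmagent-shard-ha
spec:
  shardCount: 3        # adjusted by HPA / KEDA
  replicaCount: 2      # each shard is scraped by two identical vmagents
  remoteWrite:
    - url: http://vminsert-address:8480/insert/0/prometheus/api/v1/write
---
# Deduplication in the cluster collapses the duplicated samples written by
# the two replicas back into a single series.
apiVersion: operator.victoriametrics.com/v1beta1
kind: VMCluster
metadata:
  name: vmcluster
spec:
  vmselect:
    extraArgs:
      dedup.minScrapeInterval: 30s   # match the scrape interval
  vmstorage:
    extraArgs:
      dedup.minScrapeInterval: 30s
```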