kedacore / keda

KEDA is a Kubernetes-based Event Driven Autoscaling component. It provides event driven scale for any container running in Kubernetes
https://keda.sh
Apache License 2.0

Pub/Sub Scaler: Inappropriate alignment #6052

Closed karotchykau closed 2 weeks ago

karotchykau commented 3 months ago

Report

With the gcp-pubsub scaler, some metrics will not trigger a scale-up in most cases unless the traffic changes.

For instance, consider NumUndeliveredMessages. If someone publishes a large batch of messages and then stops, then once timeHorizon has elapsed the workload starts scaling down to 0 even though many messages are still unprocessed. The reason is that the scaler always applies a static DELTA alignment (https://github.com/kedacore/keda/blob/v2.15.0/pkg/scalers/gcp/gcp_stackdriver_client.go#L370). So if 1,000,000 messages were published and only 100,000 of them could be processed within timeHorizon, the remaining 900,000 stay unacknowledged: the cluster has been scaled down to 0 and will not scale up again until someone publishes a new message.

P.S. DELTA will actually yield negative numbers here, but that doesn't change the outcome.
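The arithmetic above can be sketched in a few lines of Python. This is a simplified model, not Cloud Monitoring's or KEDA's actual implementation: `align_delta` assumes a DELTA alignment over a gauge reduces to "last sample minus first sample" within the window, `desired_replicas` is a rough stand-in for the HPA calculation, and the per-minute sample values and `max_replicas=10` are made up for illustration.

```python
import math

def align_delta(samples):
    """Assumed model of DELTA alignment: change across the window."""
    return samples[-1] - samples[0]

def align_mean(samples):
    """Mean of the samples in the window."""
    return sum(samples) / len(samples)

def desired_replicas(metric, target, max_replicas):
    """Simplified HPA-style calculation, clamped to [0, max_replicas]."""
    return max(0, min(max_replicas, math.ceil(metric / target)))

# Hypothetical backlog (num_undelivered_messages), sampled once a minute
# over a 5m window after publishing has stopped: 1,000,000 messages were
# queued and 100,000 of them were consumed during the window.
backlog = [1_000_000, 980_000, 960_000, 940_000, 920_000, 900_000]

delta = align_delta(backlog)  # negative: the backlog only shrinks
avg = align_mean(backlog)     # still close to the real backlog size

# value: "100" from the trigger; max_replicas is an illustrative cap.
replicas_with_delta = desired_replicas(delta, target=100, max_replicas=10)
replicas_with_mean = desired_replicas(avg, target=100, max_replicas=10)

print(f"delta={delta}, replicas={replicas_with_delta}")
print(f"mean={avg}, replicas={replicas_with_mean}")
```

Under these assumptions the DELTA-aligned value is never positive while nothing new is published, so the computed replica count stays at 0 no matter how large the remaining backlog is, whereas an alignment that preserves the gauge value keeps the workload scaled up.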

Expected Behavior

You should be able to somehow configure the alignment type (as well as the interval).

Actual Behavior

You cannot configure the alignment type (as well as the interval); therefore, old messages will not be taken into account.

Steps to Reproduce the Problem

  1. Create ScaledObject for gcp-pubsub. Here is an example of parameters that I used before:
      mode: "NumUndeliveredMessages"
      aggregation: "mean"
      value: "100"
      timeHorizon: "5m"
  2. In your scaleTargetRef (Deployment/etc.) configure some kind of delay so it processes messages slower.
  3. Publish a decent number of messages that you know won't be fully processed during timeHorizon (e.g. 100,000).

Logs from KEDA operator

N/A

The logs are clean: there are no errors, and they only reflect the scaling behavior described above.

KEDA Version

2.15.0

Kubernetes Version

1.29

Platform

Google Cloud

Scaler Details

Pub/Sub

Anything else?

For anyone who runs into the same issue: you can fall back to gcp-stackdriver instead:

      projectId: PROJECT
      filter: 'resource.type="pubsub_subscription" AND resource.labels.subscription_id="SUBSCRIPTION_ID" AND metric.type="pubsub.googleapis.com/subscription/num_undelivered_messages"'
      ...

karotchykau commented 3 months ago

N.B.

Without any aggregation (""), it will just skip the alignment configuration. Here is the result from BuildMQLQuery:

fetch pubsub_subscription
| metric 'pubsub.googleapis.com/subscription/num_undelivered_messages'
| filter (resource.project_id == 'PROJECT' && resource.subscription_id == 'SUBSCRIPTION_ID')
| within 5m

So the alignment is simply omitted, nothing more. It is also unclear in this case what kind of reduction, if any, is applied after the within 5m clause above.
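To make the requested behavior concrete, here is a rough Python sketch of a query builder with a configurable aligner. It is not KEDA's Go code: `build_mql_query` and its `aligner` parameter are hypothetical, and the exact clauses appended when an aggregation is set (`every`, `group_by`) are an assumption about the general shape of the query. Only the base query and the fixed DELTA default come from this issue.

```python
def build_mql_query(project_id, subscription_id, time_horizon="5m",
                    aggregation="", aligner="delta"):
    """Build an MQL query for the subscription backlog.

    `aligner` is a hypothetical knob standing in for the requested
    feature; "delta" mirrors the hard-coded alignment reported above.
    """
    query = (
        "fetch pubsub_subscription\n"
        "| metric 'pubsub.googleapis.com/subscription/num_undelivered_messages'\n"
        f"| filter (resource.project_id == '{project_id}'"
        f" && resource.subscription_id == '{subscription_id}')\n"
        f"| within {time_horizon}"
    )
    if aggregation:
        # Assumed shape of the alignment/reduction tail; with an empty
        # aggregation this whole tail is skipped, as observed above.
        query += (
            f"\n| align {aligner}({time_horizon})"
            f"\n| every {time_horizon}"
            f"\n| group_by [], {aggregation}(val())"
        )
    return query

# No aggregation: the alignment clause is omitted entirely.
plain = build_mql_query("PROJECT", "SUBSCRIPTION_ID")

# Aggregation with a non-delta aligner that keeps the gauge value.
aligned = build_mql_query("PROJECT", "SUBSCRIPTION_ID",
                          aggregation="mean", aligner="next_older")
print(aligned)
```

With something along these lines, a gauge-preserving aligner such as `next_older` could be selected for NumUndeliveredMessages while `delta` stays the default for backward compatibility.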

stale[bot] commented 3 weeks ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.

stale[bot] commented 2 weeks ago

This issue has been automatically closed due to inactivity.