kedacore / keda

KEDA is a Kubernetes-based Event Driven Autoscaling component. It provides event driven scale for any container running in Kubernetes
https://keda.sh
Apache License 2.0

GCP Pub/Sub MQL no longer supported (recommend PromQL) - Starting October 22, 2024 #6255

Open yazimpact opened 1 month ago

yazimpact commented 1 month ago

Report

The GCP Monitoring page shows the following warning:

Starting October 22, 2024, Monitoring Query Language (MQL) will no longer be a recommended query language for Cloud Monitoring, and we will begin to turn off certain usability features. We recommend moving to PromQL, the open-source standard for querying time series. PromQL offers similar functionality to MQL, with a wider user base and more community resources.

Expected Behavior

The scaler should use PromQL instead of MQL.

Actual Behavior

The current metric query uses MQL:

pubsub.googleapis.com/subscription/num_undelivered_messages
| filter (resource.project_id == 'project-name' && resource.subscription_id == 'example-sub')
| within 1m
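
For comparison, a rough PromQL counterpart of the query above can be derived from the metric type. The sketch below assumes Cloud Monitoring's documented PromQL naming rule (the first "/" becomes ":", every other special character becomes "_") and that the resource labels project_id and subscription_id are exposed as PromQL labels. The helper names are hypothetical and this is not what the KEDA scaler does today.

```python
import re


def _sanitize(part: str) -> str:
    # Anything that is not a letter, digit, or underscore becomes "_".
    return re.sub(r"[^A-Za-z0-9_]", "_", part)


def promql_metric_name(metric_type: str) -> str:
    """Map a Cloud Monitoring metric type to a PromQL metric name (assumed rule)."""
    domain, sep, path = metric_type.partition("/")
    # Only the first "/" is turned into ":"; the rest are sanitized to "_".
    return f"{_sanitize(domain)}:{_sanitize(path)}" if sep else _sanitize(domain)


def promql_selector(metric_type: str, labels: dict) -> str:
    """Build an instant-vector selector with exact-match label filters."""
    matchers = ",".join(f'{key}="{value}"' for key, value in labels.items())
    return f"{promql_metric_name(metric_type)}{{{matchers}}}"


if __name__ == "__main__":
    print(promql_selector(
        "pubsub.googleapis.com/subscription/num_undelivered_messages",
        {"project_id": "project-name", "subscription_id": "example-sub"},
    ))
```

Depending on the metric, Cloud Monitoring may also need a monitored_resource label to disambiguate the resource type; that is left out here to mirror the MQL filter.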

Steps to Reproduce the Problem

References:
PromQL: https://cloud.google.com/monitoring/promql
MQL: https://cloud.google.com/stackdriver/docs/deprecations/mql

Logs from KEDA operator

example

KEDA Version

2.13.1

Kubernetes Version

1.30

Platform

Google Cloud

Scaler Details

GCP Pub/Sub

Anything else?

https://cloud.google.com/stackdriver/docs/deprecations/mql

On October 22, 2024, Monitoring Query Language (MQL) will no longer be a recommended query language for Cloud Monitoring.

On October 22, 2024, certain usability features will be turned off. On July 22, 2025, MQL will no longer be available for new dashboards and alerts in the Google Cloud console, and Google Cloud customer support will end. Existing MQL dashboards and alerts will continue to work, and you will still be able to create MQL dashboards and alerts using the Cloud Monitoring API.

JorTurFer commented 1 month ago

Interesting update :smile:

@kedacore/keda-core-contributors, do you think we should get rid of the GCP scalers, or just turn them into a wrapper on top of the Prometheus one to make things easier for end users?

yazimpact commented 1 month ago

@JorTurFer any update? KEDA has started returning errors when scaling:

github.com/kedacore/keda/v2/pkg/scalers.(*pubsubScaler).GetMetricsAndActivity
        /workspace/pkg/scalers/gcp_pubsub_scaler.go:193
github.com/kedacore/keda/v2/pkg/scaling/cache.(*ScalersCache).GetMetricsAndActivityForScaler
        /workspace/pkg/scaling/cache/scalers_cache.go:140
github.com/kedacore/keda/v2/pkg/scaling.(*scaleHandler).getScalerState
        /workspace/pkg/scaling/scale_handler.go:743
github.com/kedacore/keda/v2/pkg/scaling.(*scaleHandler).getScaledObjectState.func1
        /workspace/pkg/scaling/scale_handler.go:628
2024-10-23T11:08:11Z    ERROR   scale_handler   error getting scale decision    {"scaledObject.Namespace": "example", "scaledObject.Name": "example-scaledobject", "scaler": "pubsubScaler", "error": "could not find stackdriver metric with query fetch pubsub_subscription | metric 'pubsub.googleapis.com/subscription/num_undelivered_messages' | filter (resource.project_id == 'project_name' && resource.subscription_id == 'example-sub') | within 1m"}
github.com/kedacore/keda/v2/pkg/scaling.(*scaleHandler).getScalerState
        /workspace/pkg/scaling/scale_handler.go:764
github.com/kedacore/keda/v2/pkg/scaling.(*scaleHandler).getScaledObjectState.func1
        /workspace/pkg/scaling/scale_handler.go:628

wozniakjan commented 1 month ago

Am I reading https://cloud.google.com/stackdriver/docs/deprecations/mql correctly in expecting MQL to work until July 22, 2025?

Also, the same document mentions:

No action is required to continue using your existing MQL assets, and you can create new MQL assets by using the Cloud Monitoring API.

yazimpact commented 1 month ago

MQL does seem to be working (the query fetches data in the UI). However, after upgrading to Helm chart version 2.15.2, I'm getting <unknown> on the HPAs. When upgrading, do we need to remove the old CRDs and deploy new ones? The ScaledObject reports Ready and Active as True, but the HPA is in an unknown state.

query:

fetch pubsub_subscription | metric 'pubsub.googleapis.com/subscription/num_undelivered_messages' | filter (resource.project_id == 'project-name' && resource.subscription_id == 'mytopic-sub') | within 5m | align delta(3m) | every 3m | group_by [], sum(value)
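
The query above is a pipeline of MQL stages joined by "|". To make the stages easier to see and tweak while debugging, here is a minimal sketch that assembles the same string from its parts. The function name and parameters are hypothetical and are not KEDA's actual implementation.

```python
def build_mql(resource_type, metric_type, filters, window, stages=()):
    """Assemble an MQL query string from its pipeline stages."""
    # Each filter becomes an equality test on a resource label.
    filter_expr = " && ".join(
        f"resource.{key} == '{value}'" for key, value in filters.items()
    )
    parts = [
        f"fetch {resource_type}",
        f"metric '{metric_type}'",
        f"filter ({filter_expr})",
        f"within {window}",
        *stages,  # optional alignment/aggregation stages, appended verbatim
    ]
    return " | ".join(parts)


if __name__ == "__main__":
    print(build_mql(
        "pubsub_subscription",
        "pubsub.googleapis.com/subscription/num_undelivered_messages",
        {"project_id": "project-name", "subscription_id": "mytopic-sub"},
        "5m",
        ["align delta(3m)", "every 3m", "group_by [], sum(value)"],
    ))
```

Dropping the last three stages and using a 1m window reproduces the shape of the simpler query seen in the operator error log.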

hpa:

keda-hpa-pubsub-scaledobject   Deployment/test   <unknown>/2 (avg)   1         10        1          106m

so:

pubsub-scaledobject   apps/v1.Deployment   test              0     10    gcp-pubsub   keda-trigger-auth-gcp-credentials   True    False    Unknown    Unknown   106m

troubleshoot:

 kubectl get apiservice v1beta1.external.metrics.k8s.io        
NAME                              SERVICE                                AVAILABLE   AGE
v1beta1.external.metrics.k8s.io   keda/keda-operator-metrics-apiserver   True        10m

Service account annotated:

apiVersion: v1
automountServiceAccountToken: true
kind: ServiceAccount
metadata:
  annotations:
    iam.gke.io/gcp-service-account: keda@project-name.iam.gserviceaccount.com
  labels:
    app.kubernetes.io/component: operator
    app.kubernetes.io/instance: keda
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: keda-operator
    app.kubernetes.io/part-of: keda-operator
    app.kubernetes.io/version: 2.15.1
    helm.sh/chart: keda-2.15.2
    pci-dss-firewall-audit: pci-dss-2024q1
  name: keda-operator
  namespace: keda

The roles granted to the GCP service account are:

roles/pubsub.viewer
roles/monitoring.viewer
roles/iam.workloadIdentityUser
roles/pubsub.editor

I've redeployed KEDA.

I've created a test Deployment, ScaledObject, and TriggerAuthentication. When I publish messages, the ScaledObject detects them and becomes active:

 k get so -n test                                   
NAME                  SCALETARGETKIND      SCALETARGETNAME   MIN   MAX   TRIGGERS     AUTHENTICATION                      READY   ACTIVE   FALLBACK   PAUSED    AGE
pubsub-scaledobject   apps/v1.Deployment   test              0     10    gcp-pubsub   keda-trigger-auth-gcp-credentials   True    True     Unknown    Unknown   7m14s

k get hpa -n test
NAME                           REFERENCE         TARGETS             MINPODS   MAXPODS   REPLICAS   AGE
keda-hpa-pubsub-scaledobject   Deployment/test   <unknown>/2 (avg)   1         10        1          7m39s

kgp -n test      
NAME                    READY   STATUS    RESTARTS   AGE
test-8488686db5-qdn89   1/1     Running   0          2m51s

It adds just one pod and nothing more, and the metrics on the HPA are unknown.

Any solutions or ideas about what the issue could be?
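
One possible reading of the single pod: KEDA itself activates the workload from 0 to 1 replica, while scaling from 1 to N is left to the HPA, which needs a metric value to compute a target. The sketch below is a simplified model of the HPA calculation for an AverageValue external metric (it ignores the tolerance band and stabilization windows); the function name is hypothetical.

```python
import math
from typing import Optional


def desired_replicas(total_metric: Optional[float], target_average: float,
                     min_replicas: int, max_replicas: int,
                     current_replicas: int) -> int:
    """Simplified HPA replica calculation for an AverageValue external metric."""
    if total_metric is None:
        # Metric shown as <unknown>: no recommendation can be computed,
        # so the replica count stays where it is.
        return current_replicas
    # Enough replicas that each one handles at most target_average.
    wanted = math.ceil(total_metric / target_average)
    # Clamp to the HPA's min/max bounds.
    return max(min_replicas, min(max_replicas, wanted))


if __name__ == "__main__":
    # Values from the HPA output: target 2 (avg), min 1, max 10, 1 replica.
    print(desired_replicas(None, 2, 1, 10, 1))
    print(desired_replicas(7, 2, 1, 10, 1))
```

Under this model, an unknown metric keeps the Deployment at the one replica that activation created, which matches the output above.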