GoogleCloudPlatform / k8s-stackdriver

Apache License 2.0

Custom log-based metric not recognized by HPA #683

Open az-software-engineer opened 4 months ago

az-software-engineer commented 4 months ago

We've set up Stackdriver and created a custom log-based metric to scale our workload. We followed the README, but the HPA cannot scale and reports "Unable to read metric". When we look at the custom log-based metric in Metrics Explorer we see data, so we know the metric was created correctly. We added labels to the log-based metric to bind it to the correct GKE resource, as described in the documentation, and validated them, but still nothing. We even tried configuring the log-based metric as both External and Custom, but neither worked.

Can HPA scale on custom log-based metrics (logging.googleapis.com), as opposed to custom metrics (custom.googleapis.com)?

Stackdriver:

image

Custom Metric:

image

HPA:

image

CatherineF-dev commented 4 months ago

Could you share your HPA configurations after removing sensitive information? cc @az-software-engineer

An example: https://stackoverflow.com/questions/78130483/horizontal-pod-autoscaler-with-stackdriver-custom-metric-in-gke-fails-with-inva

az-software-engineer commented 4 months ago

@CatherineF-dev - Please see below.

Custom Metric:

image

Custom Log-Based Metric:

image

Example Values:
minReplicas: 1
maxReplicas: 10
averageValue: 50
backend_target_name: xyz

Can HPA scale on custom log-based metrics (logging.googleapis.com), as opposed to custom metrics (custom.googleapis.com)?

CatherineF-dev commented 4 months ago

This is a working example https://stackoverflow.com/questions/78130483/horizontal-pod-autoscaler-with-stackdriver-custom-metric-in-gke-fails-with-inva.

Could you follow this to use loadbalancing.googleapis.com|https|request_count?

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-autoscale
  namespace: somenamespace
spec:
  maxReplicas: 3
  metrics:
  - pods:
      metric:
        name: prometheus.googleapis.com|jvm_memory_bytes_used|gauge
        selector:
          matchLabels:
            metric.labels.area: heap
      target:
        averageValue: 2G
        type: AverageValue
    type: Pods
  minReplicas: 1
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-java-app

az-software-engineer commented 4 months ago

Thank you for the example. We tried the provided configuration, but with logging.googleapis.com rather than Prometheus. GKE accepts the new config, but the HPA still reports "Unable to read all metrics". Maybe the example worked because of the Prometheus use case?

image

CatherineF-dev commented 4 months ago

For logging.googleapis.com metrics, I found a similar example here: https://cloud.google.com/kubernetes-engine/docs/tutorials/autoscaling-metrics#pubsub_8

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: pubsub
spec:
  minReplicas: 1
  maxReplicas: 5
  metrics:
  - external:
      metric:
        name: pubsub.googleapis.com|subscription|num_undelivered_messages
        selector:
          matchLabels:
            resource.labels.subscription_id: echo-read
      target:
        type: AverageValue
        averageValue: 2
    type: External
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: pubsub

az-software-engineer commented 3 months ago

Unfortunately, that is not working either. In the example provided, you're referencing "pubsub.googleapis.com" which we have successfully configured and tested. However, we're trying to get "logging.googleapis.com" to work using a user-defined custom metric that captures log events.

apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: abc-hpa
  namespace: abc-frontend
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: abc-server-deployment
  minReplicas: {{ .Values.abc.hpa.min_replicas }}
  maxReplicas: {{ .Values.abc.hpa.max_replicas }}
  metrics:
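
For comparison with the examples earlier in the thread, here is a minimal sketch of how a user-defined log-based metric might be referenced as an External metric. It is not the poster's configuration, and the thread does not confirm that it resolves the issue. It assumes the adapter convention, visible in the examples above, of writing `/` in the Cloud Monitoring metric type as `|`, and that user-defined log-based metrics have the type `logging.googleapis.com/user/<name>`. The names `log-metric-hpa`, `my-deployment` and `my-log-metric` are hypothetical, and treating `backend_target_name` as a label on the metric is an assumption based on the example values posted above.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: log-metric-hpa              # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-deployment             # hypothetical target workload
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: External
    external:
      metric:
        # Assumed mapping of logging.googleapis.com/user/my-log-metric,
        # with "/" replaced by "|"; my-log-metric is a placeholder.
        name: logging.googleapis.com|user|my-log-metric
        selector:
          matchLabels:
            # Assumption: backend_target_name is a label defined on the
            # log-based metric; drop the selector if it is not.
            metric.labels.backend_target_name: xyz
      target:
        type: AverageValue
        averageValue: 50
```

The selector follows the same `metric.labels.*` / `resource.labels.*` form used in the Prometheus and Pub/Sub examples above.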