kedacore / keda

KEDA is a Kubernetes-based Event Driven Autoscaling component. It provides event driven scale for any container running in Kubernetes
https://keda.sh
Apache License 2.0

keda gcp workload identity not working on keda-operator #5511

Closed (ygpr closed this 7 months ago)

ygpr commented 7 months ago

Hi,

I am trying to run KEDA to scale based on Pub/Sub metrics using workload identity. I am continuously getting the following error:

024-02-15T10:59:01Z ERROR   scale_handler   error getting metric for trigger    {"scaledObject.Namespace": "default", "scaledObject.Name": "keda-demo-pubsub-scaledobject", "trigger": "pubsubScaler", "error": "could not find stackdriver metric with query fetch pubsub_subscription | metric 'pubsub.googleapis.com/subscription/num_undelivered_messages' | filter (resource.project_id == 'relyance-internal' && resource.subscription_id == 'rd17688subscription') | within 1m"}
github.com/kedacore/keda/v2/pkg/scaling.(*scaleHandler).GetScaledObjectMetrics
    /workspace/pkg/scaling/scale_handler.go:555
github.com/kedacore/keda/v2/pkg/metricsservice.(*GrpcServer).GetMetrics
    /workspace/pkg/metricsservice/server.go:47
github.com/kedacore/keda/v2/pkg/metricsservice/api._MetricsService_GetMetrics_Handler
    /workspace/pkg/metricsservice/api/metrics_grpc.pb.go:99
google.golang.org/grpc.(*Server).processUnaryRPC
    /workspace/vendor/google.golang.org/grpc/server.go:1372
google.golang.org/grpc.(*Server).handleStream
    /workspace/vendor/google.golang.org/grpc/server.go:1783
google.golang.org/grpc.(*Server).serveStreams.func2.1
    /workspace/vendor/google.golang.org/grpc/server.go:1016

I have added firewall rules as well. I have also verified in the Google Cloud console that the MQL query above returns results, but the operator always shows the error above and the pod is not scaling.
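
To repeat that manual check with exactly the query the operator logs, the MQL string can be rebuilt from the project and subscription IDs. This is only a sketch: `build_pubsub_mql` is a hypothetical helper name, and the query text is copied from the error message above, not taken from KEDA's source.

```shell
# Hypothetical helper: rebuilds the MQL query seen in the keda-operator error log.
# Arguments: project ID, subscription ID, optional time window (default 1m, as in the log).
build_pubsub_mql() {
  printf "fetch pubsub_subscription | metric 'pubsub.googleapis.com/subscription/num_undelivered_messages' | filter (resource.project_id == '%s' && resource.subscription_id == '%s') | within %s\n" \
    "$1" "$2" "${3:-1m}"
}

# Print the query for the subscription named in the log, ready to paste into the console.
build_pubsub_mql "relyance-internal" "rd17688subscription"
```

Passing a wider window as the third argument (for example `5m`) may help tell "no data in the last minute" apart from "no data at all".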

gcloud compute firewall-rules create gke-rd17688-rd17688-allow-api-server-to-keda-webhook \
--description="Allow kubernetes api server to keda webhook call on worker nodes TCP port 9443" \
--direction=INGRESS \
--priority=1000 \
--network=<network name> \
--action=ALLOW \
--rules=tcp:6443 \
--source-ranges=<control-plane ip range> \
--target-tags=TARGET_TAG
gcloud compute firewall-rules create gke-xyz-xyz-allow-api-server-to-keda-webhook \
--description="Allow kubernetes api server to keda webhook call on worker nodes TCP port 9443" \
--direction=INGRESS \
--priority=1000 \
--network=xyz-xyz \
--action=ALLOW \
--rules=tcp:6443 \
--source-ranges=172.16.4.19/28 \
--target-tags=gke-xyz-xyz-80b5e34d-node

Any pointers would be greatly helpful

I verified that the service account has the required permission:

gcloud projects get-iam-policy relyance-internal \
--flatten="bindings[].members" \
--format='table(bindings.role)' \
--filter="bindings.members:xyz-keda-operator@project_name.iam.gserviceaccount.com"
ROLE
roles/monitoring.viewer
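
The role listing above covers the Google service account's project permissions. Workload identity additionally needs the Kubernetes service account to be allowed to impersonate that Google service account. A rough sketch of that check follows. The `keda` namespace and `keda-operator` service account name are assumptions based on a default install, and `wi_member` and `check_workload_identity` are hypothetical helper names.

```shell
# Hypothetical helper: formats the IAM member that GKE workload identity uses
# for a Kubernetes service account: serviceAccount:PROJECT.svc.id.goog[NAMESPACE/KSA]
wi_member() {
  printf 'serviceAccount:%s.svc.id.goog[%s/%s]\n' "$1" "$2" "$3"
}

# Hypothetical helper: prints the values to compare. Needs gcloud and kubectl with access.
# Arguments: project ID, Google service account email.
# Assumes KEDA runs as service account keda-operator in namespace keda.
check_workload_identity() {
  echo "Expected member: $(wi_member "$1" keda keda-operator)"
  echo "Members holding roles/iam.workloadIdentityUser on $2:"
  gcloud iam service-accounts get-iam-policy "$2" \
    --flatten="bindings[].members" \
    --filter="bindings.role:roles/iam.workloadIdentityUser" \
    --format="value(bindings.members)"
  echo "Annotation on the Kubernetes service account:"
  kubectl get serviceaccount keda-operator -n keda \
    -o jsonpath='{.metadata.annotations.iam\.gke\.io/gcp-service-account}'
  echo
}
```

Calling `check_workload_identity` with the project ID and the service account email should list the expected member among the bindings and show the same email in the annotation. If either is missing, the problem is likely in the identity setup.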

Any pointers would be helpful. Thanks in advance

ygpr commented 7 months ago

My ScaledObject YAML file is:

apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: keda-demo-trigger-auth-gcp-credentials
spec:
  podIdentity:
    provider: gcp
---
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: keda-demo-pubsub-scaledobject
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1 # Optional. Default: apps/v1
    kind: Deployment    # Optional. Default: Deployment
    name: keda-demo     # Mandatory. Must be in the same namespace as the ScaledObject
  pollingInterval: 10    # Optional. Default: 30 seconds
  minReplicaCount: 1    # Optional. Default: 0
  maxReplicaCount: 10   # Optional. Default: 100
  triggers:
  - type: gcp-pubsub
    authenticationRef:
      kind: TriggerAuthentication
      name: keda-demo-trigger-auth-gcp-credentials
    metadata:
      mode: "SubscriptionSize" # Optional - Default is SubscriptionSize - SubscriptionSize or OldestUnackedMessageAge
      value: "5" # Optional - Default is 5 for SubscriptionSize | Default is 10 for OldestUnackedMessageAge
      subscriptionName: "xyzsubscription" # Mandatory

JorTurFer commented 7 months ago

Hello

Which KEDA version are you using? I think this is a configuration issue rather than a connectivity issue. Based on the error, I suspect you are using v2.13.0. If so, you could be affected by https://github.com/kedacore/keda/pull/5452

Currently, the workaround, if that's your case, is to use v2.12 or main (until the next release).
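
For anyone unsure which version they are running, a minimal sketch of the check might look like this. It assumes a default install (deployment `keda-operator` in namespace `keda`, Helm release `keda` from the `kedacore` chart repo). `image_tag` and `is_suspect_version` are hypothetical helper names, and chart version `2.12.0` is just one example of a 2.12 release.

```shell
# Extracts the tag from an image reference such as ghcr.io/kedacore/keda:2.13.0
image_tag() {
  printf '%s\n' "${1##*:}"
}

# Succeeds if the tag is the release suspected in this thread (v2.13.0).
is_suspect_version() {
  case "$1" in
    2.13.0|v2.13.0) return 0 ;;
    *) return 1 ;;
  esac
}

# Assumed install: deployment keda-operator in namespace keda (override via KEDA_NAMESPACE).
if command -v kubectl >/dev/null 2>&1; then
  image="$(kubectl get deployment keda-operator -n "${KEDA_NAMESPACE:-keda}" \
    -o jsonpath='{.spec.template.spec.containers[0].image}' 2>/dev/null)"
  if [ -n "$image" ] && is_suspect_version "$(image_tag "$image")"; then
    echo "Running $image; consider pinning the chart to a 2.12 release, for example:"
    echo "  helm upgrade keda kedacore/keda --namespace ${KEDA_NAMESPACE:-keda} --version 2.12.0"
  fi
fi
```

The script only prints a suggested `helm upgrade` command and does not run it, so the downgrade stays a deliberate step.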

ygpr commented 7 months ago

I tried with 2.12 and it worked. Thanks for taking care of the bug in the next release.

JorTurFer commented 7 months ago

Good to know that at least the bug is the known one. I'm closing this issue as solved, since the fix is already merged.