kedacore / keda

KEDA is a Kubernetes-based Event Driven Autoscaling component. It provides event driven scale for any container running in Kubernetes
https://keda.sh
Apache License 2.0

GCP PubSub Trigger with Workload Identity not working #5011

Closed juldrixx closed 8 months ago

juldrixx commented 12 months ago

Report

I followed the example to scale on PubSub metrics using Workload Identity as the authentication method, following these pages:

I have these two resources:

apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: keda
  namespace: instances
spec:
  podIdentity:
    provider: gcp
---
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: nifikop
  namespace: instances
spec:
  scaleTargetRef:
    apiVersion: nifi.konpyutaika.com/v1alpha1
    kind: NifiNodeGroupAutoscale
    name: nifinodegroupautoscaler-sample
    envSourceContainerName: nifi
  pollingInterval: 30
  cooldownPeriod: 300
  minReplicaCount: 1
  maxReplicaCount: 2
  triggers:
    - type: gcp-pubsub
      authenticationRef:
        name: keda
      metadata:
        subscriptionName: "keda"
        mode: "SubscriptionSize"
        value: "5"

The keda-operator Service Account has the iam.gke.io/gcp-service-account annotation, and I verified that Workload Identity works for that Service Account.

Expected Behavior

The ScaledObject should be active.

Actual Behavior

The ScaledObject is inactive and the operator logs: error parsing PubSub metadata: google application credentials not found.

Steps to Reproduce the Problem

Follow the example: https://keda.sh/docs/2.11/scalers/gcp-pub-sub/#example-using-triggerauthentication-with-gcp-identity

Logs from KEDA operator

"level":"error","ts":"2023-09-27T16:16:23+02:00","msg":"failed to ensure HPA is correctly created for ScaledObject","controller":"scaledobject","controllerGroup":"keda.sh","controllerKind":"ScaledObject","ScaledObject":{"name":"nifikop","namespace":"instances"},"namespace":"instances","name":"nifikop","reconcileID":"ee8134bf-092f-4548-acf5-6e1ddfa4e484","error":"error parsing PubSub metadata: google application credentials not found","stacktrace":"github.com/kedacore/keda/v2/controllers/keda.(*ScaledObjectReconciler).Reconcile\n\t/home/juguitton/Documents/sandbox/k8s_resources/keda/controllers/keda/scaledobject_controller.go:179\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/home/juguitton/Documents/sandbox/k8s_resources/keda/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:118\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/home/juguitton/Documents/sandbox/k8s_resources/keda/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:314\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/home/juguitton/Documents/sandbox/k8s_resources/keda/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:265\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/home/juguitton/Documents/sandbox/k8s_resources/keda/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:226"}
{"level":"error","ts":"2023-09-27T16:16:23+02:00","msg":"Reconciler error","controller":"scaledobject","controllerGroup":"keda.sh","controllerKind":"ScaledObject","ScaledObject":{"name":"nifikop","namespace":"instances"},"namespace":"instances","name":"nifikop","reconcileID":"ee8134bf-092f-4548-acf5-6e1ddfa4e484","error":"error parsing PubSub metadata: google application credentials not found","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/home/juguitton/Documents/sandbox/k8s_resources/keda/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:324\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/home/juguitton/Documents/sandbox/k8s_resources/keda/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:265\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/home/juguitton/Documents/sandbox/k8s_resources/keda/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:226"}

KEDA Version

2.11.2

Kubernetes Version

1.27

Platform

Google Cloud

Scaler Details

GCP PubSub

Anything else?

I think the issue is here: it should retrieve the providerIdentity, but it's not returned (and the authParams is empty).

As a result, config.PodIdentity is set to none here. Then, when we fetch the credentials, we get nothing here, which makes the scaler return an error here.

Either the documentation is wrong, or something is missing in the code (or I'm missing something).
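
To make the failure path described above easier to follow, here is a minimal, self-contained Go sketch of it. Everything in it is a simplified stand-in and an assumption for illustration: the types `AuthPodIdentity` and `triggerAuth`, the functions `resolveWithoutPodSpec` and `parsePubSubAuth`, and the `GoogleApplicationCredentials` key are hypothetical and do not reproduce KEDA's actual code. It only shows how discarding the pod identity during resolution leaves the scaler with nothing to authenticate with.

```go
package main

import (
	"errors"
	"fmt"
)

// PodIdentityProvider is a simplified stand-in for the provider type.
type PodIdentityProvider string

const (
	ProviderNone PodIdentityProvider = "none"
	ProviderGCP  PodIdentityProvider = "gcp"
)

// AuthPodIdentity carries only the field needed for this illustration.
type AuthPodIdentity struct {
	Provider PodIdentityProvider
}

// triggerAuth is a hypothetical, trimmed-down TriggerAuthentication.
type triggerAuth struct {
	PodIdentity AuthPodIdentity
}

// resolveWithoutPodSpec models the branch taken when the scale target has
// no podSpec: the identity configured in ta is never propagated, and the
// provider is hard-coded to "none".
func resolveWithoutPodSpec(ta triggerAuth) (map[string]string, AuthPodIdentity) {
	authParams := map[string]string{} // pod identity auth yields no params
	return authParams, AuthPodIdentity{Provider: ProviderNone}
}

// parsePubSubAuth models the scaler-side check: it needs either a GCP pod
// identity or explicit credentials in authParams (hypothetical key name).
func parsePubSubAuth(authParams map[string]string, id AuthPodIdentity) error {
	if id.Provider == ProviderGCP {
		return nil
	}
	if authParams["GoogleApplicationCredentials"] != "" {
		return nil
	}
	return errors.New("google application credentials not found")
}

func main() {
	ta := triggerAuth{PodIdentity: AuthPodIdentity{Provider: ProviderGCP}}
	params, id := resolveWithoutPodSpec(ta)
	// The TriggerAuthentication asked for "gcp", but the scaler sees "none".
	fmt.Println("provider seen by scaler:", id.Provider)
	fmt.Println("error:", parsePubSubAuth(params, id))
}
```

In this sketch, passing the identity from the TriggerAuthentication through unchanged is enough for `parsePubSubAuth` to succeed.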

juldrixx commented 12 months ago

I think these lines should be changed from

    authParams, _ := resolveAuthRef(ctx, client, logger, triggerAuthRef, nil, namespace, secretsLister)
    return authParams, kedav1alpha1.AuthPodIdentity{Provider: kedav1alpha1.PodIdentityProviderNone}, nil
}

...
func resolveAuthRef(ctx context.Context, client client.Client, logger logr.Logger,
...
    result := make(map[string]string)
    var podIdentity kedav1alpha1.AuthPodIdentity

to

    authParams, podIdentity := resolveAuthRef(ctx, client, logger, triggerAuthRef, nil, namespace, secretsLister)
    return authParams, podIdentity, nil
}

...
func resolveAuthRef(ctx context.Context, client client.Client, logger logr.Logger,
...
    result := make(map[string]string)
    podIdentity := kedav1alpha1.AuthPodIdentity{Provider: kedav1alpha1.PodIdentityProviderNone}

JorTurFer commented 12 months ago

Hello, I'm reviewing the code and I think you are right. In cases where the workload is a CRD (where the podSpec isn't available), pod identity is ignored, and that's not correct. Are you willing to open a PR with the fix? (And maybe add a test to prevent this in the future, as the current e2e test doesn't cover that scenario.)
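
As a rough idea of what such a regression check could cover, here is a small table-driven sketch in plain Go. It is an assumption-laden illustration, not KEDA's test suite: `resolveAuth`, `testCase`, and `runCases` are hypothetical names, and `resolveAuth` only mimics the proposed behavior (default to "none", otherwise keep the identity from the TriggerAuthentication whether or not a podSpec exists).

```go
package main

import "fmt"

// AuthPodIdentity is a simplified stand-in with only the provider field.
type AuthPodIdentity struct {
	Provider string
}

// resolveAuth is a hypothetical resolver with the proposed behavior: start
// from provider "none" and keep the TriggerAuthentication identity when one
// is set. hasPodSpec is deliberately unused, because the result must not
// depend on whether the scale target exposes a podSpec.
func resolveAuth(triggerAuthIdentity *AuthPodIdentity, hasPodSpec bool) AuthPodIdentity {
	podIdentity := AuthPodIdentity{Provider: "none"}
	if triggerAuthIdentity != nil {
		podIdentity = *triggerAuthIdentity
	}
	return podIdentity
}

// testCase describes one scenario and the provider it should resolve to.
type testCase struct {
	name       string
	identity   *AuthPodIdentity
	hasPodSpec bool
	want       string
}

// runCases evaluates every scenario and returns a message per failure.
func runCases() []string {
	gcp := &AuthPodIdentity{Provider: "gcp"}
	cases := []testCase{
		{"deployment with gcp identity", gcp, true, "gcp"},
		{"custom resource with gcp identity", gcp, false, "gcp"},
		{"custom resource without auth ref", nil, false, "none"},
	}
	var failures []string
	for _, c := range cases {
		got := resolveAuth(c.identity, c.hasPodSpec).Provider
		if got != c.want {
			failures = append(failures,
				fmt.Sprintf("%s: got %q, want %q", c.name, got, c.want))
		}
	}
	return failures
}

func main() {
	failures := runCases()
	for _, f := range failures {
		fmt.Println("FAIL:", f)
	}
	if len(failures) == 0 {
		fmt.Println("all cases passed")
	}
}
```

The second case is the one this issue is about: a custom resource target combined with a pod identity provider.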

juldrixx commented 12 months ago

I opened the PR, but I did not touch the unit tests because I'm not sure how to do it.

eahrend commented 10 months ago

FWIW, we're having the same issue here. Same setup on 2.12.0.