aws-controllers-k8s / community

AWS Controllers for Kubernetes (ACK) is a project enabling you to manage AWS services from Kubernetes
https://aws-controllers-k8s.github.io/community/
Apache License 2.0

Kafka controller throws IAM permission error despite granting MSKFullAccess #2074

Open Mohammad9227 opened 6 months ago

Mohammad9227 commented 6 months ago

Describe the bug
The Kafka controller throws an IAM permission error even though the role has been granted MSKFullAccess.

Steps to reproduce

1. Deploy the Kafka controller (version 0.0.6) using the script below:

```sh
export SERVICE=kafka
export RELEASE_VERSION=$(curl -sL https://api.github.com/repos/aws-controllers-k8s/${SERVICE}-controller/releases/latest | jq -r '.tag_name | ltrimstr("v")')
export ACK_SYSTEM_NAMESPACE=ack-system
export AWS_REGION=us-west-2

aws ecr-public get-login-password --region us-east-1 | helm registry login --username AWS --password-stdin public.ecr.aws
helm install --create-namespace -n $ACK_SYSTEM_NAMESPACE ack-$SERVICE-controller \
  oci://public.ecr.aws/aws-controllers-k8s/$SERVICE-chart --version=$RELEASE_VERSION --set=aws.region=$AWS_REGION
```
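Before wiring up IAM, it can help to confirm the controller pod is running and bound to the expected service account. A minimal check, assuming the chart labels the pod with the same app.kubernetes.io/instance value shown on the service account below:

```sh
# Confirm the controller pod is Running
kubectl get pods -n $ACK_SYSTEM_NAMESPACE -l app.kubernetes.io/instance=ack-$SERVICE-controller

# Print the service account the pod actually uses (should be ack-kafka-controller)
kubectl get pod -n $ACK_SYSTEM_NAMESPACE -l app.kubernetes.io/instance=ack-$SERVICE-controller \
  -o jsonpath='{.items[0].spec.serviceAccountName}'
```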

2. Add IRSA to the service account ack-kafka-controller as follows:

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::167687795518:role/irsa-for-kafka-aws-controller
    meta.helm.sh/release-name: ack-kafka-controller
    meta.helm.sh/release-namespace: ack-system
  creationTimestamp: "2024-05-20T07:33:32Z"
  labels:
    app.kubernetes.io/instance: ack-kafka-controller
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: kafka-chart
    app.kubernetes.io/version: 0.0.6
    helm.sh/chart: kafka-chart-0.0.6
    k8s-app: kafka-chart
  name: ack-kafka-controller
  namespace: ack-system
  resourceVersion: "14469143"
  uid: 606b3073-e1c5-4331-bee0-5d3fac9cc738
```
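Since the controller later fails with an AccessDeniedException, the role referenced by the eks.amazonaws.com/role-arn annotation is worth inspecting directly. A sketch using the AWS CLI, with the role name taken from the annotation above:

```sh
ROLE_NAME=irsa-for-kafka-aws-controller

# The MSK full-access managed policy should show up here if it was attached
aws iam list-attached-role-policies --role-name $ROLE_NAME

# Inline policies, if any, are listed separately
aws iam list-role-policies --role-name $ROLE_NAME

# The trust policy must allow sts:AssumeRoleWithWebIdentity from the cluster's
# OIDC provider, with a sub condition matching
# system:serviceaccount:ack-system:ack-kafka-controller
aws iam get-role --role-name $ROLE_NAME --query 'Role.AssumeRolePolicyDocument'
```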
3. Deploy Kafka with the following configs:

```yaml
apiVersion: kafka.services.k8s.aws/v1alpha1
kind: Cluster
metadata:
  creationTimestamp: "2024-05-20T07:38:39Z"
  finalizers:
  - finalizers.kafka.services.k8s.aws/Cluster
  generation: 1
  name: my-kafka-cluster
  namespace: ack-system
  resourceVersion: "14491065"
  uid: d4a3a7c0-60c8-4a40-85e2-2c8cf5de35a5
spec:
  brokerNodeGroupInfo:
    brokerAZDistribution: DEFAULT
    clientSubnets:
    - subnet-xxxxxxxx
    - subnet-xxxxxxxx
    connectivityInfo:
      publicAccess:
        type_: DISABLED
    instanceType: kafka.m5.large
    securityGroups:
    - sg-xxxxxxxx
    storageInfo:
      ebsStorageInfo:
        provisionedThroughput:
          enabled: true
          volumeThroughput: 300
        volumeSize: 1000
  clientAuthentication:
    sasl:
      iam:
        enabled: true
      scram:
        enabled: false
    tls:
      certificateAuthorityARNList:
      - arn:aws:acm:region:account-id:certificate/certificate-id
      enabled: true
    unauthenticated:
      enabled: false
  configurationInfo:
    arn: arn:aws:kafka:region:account-id:configuration/configuration-id
    revision: 1
  encryptionInfo:
    encryptionAtRest:
      dataVolumeKMSKeyID: arn:aws:kms:region:account-id:key/key-id
    encryptionInTransit:
      clientBroker: TLS
      inCluster: true
  enhancedMonitoring: PER_BROKER
  kafkaVersion: 2.8.0
  loggingInfo:
    brokerLogs:
      cloudWatchLogs:
        enabled: true
        logGroup: /aws/kafka/broker-log-group
      firehose:
        deliveryStream: kafka-delivery-stream
        enabled: false
      s3:
        bucket: kafka-logs-bucket
        enabled: true
        prefix: kafka-logs
  name: my-kafka-cluster
  numberOfBrokerNodes: 3
  openMonitoring:
    prometheus:
      jmxExporter:
        enabledInBroker: true
      nodeExporter:
        enabledInBroker: true
  storageMode: EBS
  tags:
    Environment: production
    Project: kafka
```
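To apply the manifest and see what the controller reports back, something like the following works (the filename my-kafka-cluster.yaml is hypothetical):

```sh
kubectl apply -f my-kafka-cluster.yaml

# ACK controllers surface AWS API errors in the resource's status conditions
kubectl describe cluster.kafka.services.k8s.aws my-kafka-cluster -n ack-system
```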
    {"level":"error","ts":"2024-05-20T08:30:37.340Z","msg":"Reconciler error","controller":"cluster","con │
    │ trollerGroup":"kafka.services.k8s.aws","controllerKind":"Cluster","Cluster":{"name":"my-kafka-cluster","namespace":"ack-system"},"na │
    │ mespace":"ack-system","name":"my-kafka-cluster","reconcileID":"38f09a82-4964-4b1a-bfdb-5d635e4cd22a","error":"AccessDeniedException: │
    │  User: arn:aws:sts::167687795518:assumed-role/citadel-sandbox-irsa-for-kafka-aws-controller/1716193793285909782 is not authorized to │
    │  perform: kafka:CreateCluster on resource: *\n\tstatus code: 403, request id: d4af9565-46ce-4140-b580-b9012bd60b15","stacktrace":"si │
    │ gs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0 │
    │ .17.2/pkg/internal/controller/controller.go:329\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWor │
    │ kItem\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.2/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtim │
    │ e/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.2/pkg/internal/controller/ │
    │ controller.go:227"}
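Given that the 403 persists across credential mechanisms, it may be worth confirming which identity the controller pod actually picks up. A minimal sketch that reads the injected IRSA environment from the pod spec rather than exec'ing into the (possibly distroless) container:

```sh
# The EKS pod identity webhook should have injected AWS_ROLE_ARN and
# AWS_WEB_IDENTITY_TOKEN_FILE into the controller container
kubectl get pod -n ack-system -l app.kubernetes.io/instance=ack-kafka-controller \
  -o jsonpath='{.items[0].spec.containers[0].env}'
```

Note that the assumed role in the error (citadel-sandbox-irsa-for-kafka-aws-controller) does not match the role name in the service account annotation above (irsa-for-kafka-aws-controller), which seems worth cross-checking.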

Expected outcome: the controller creates the MSK cluster successfully, with no AccessDeniedException.

Environment

I have already tried EKS Pod Identity and node instance-role permissions; I still get the same error.
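One way to test whether the permission itself is missing, independent of the credential mechanism, is IAM's policy simulator. A sketch against the role inferred from the assumed-role session in the error message:

```sh
# Simulate the exact action that is being denied
aws iam simulate-principal-policy \
  --policy-source-arn arn:aws:iam::167687795518:role/citadel-sandbox-irsa-for-kafka-aws-controller \
  --action-names kafka:CreateCluster \
  --query 'EvaluationResults[0].EvalDecision'
```

If this prints "allowed", the policy attachment is fine and the problem is more likely in the trust relationship or the credential injection path.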

ack-bot commented 10 hours ago

Issues go stale after 180d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 60d of inactivity and eventually close. If this issue is safe to close now please do so with /close. Provide feedback via https://github.com/aws-controllers-k8s/community. /lifecycle stale