kestra-io / plugin-aws

https://kestra.io/plugins/plugin-aws/
Apache License 2.0

Getting "Status 407: Pull failed due to an unauthenticated request..." when using "io.kestra.plugin.aws.cli.AwsCLI" #553

Open oleksii-suprun opened 2 days ago

oleksii-suprun commented 2 days ago

Describe the issue

I have an S3 bucket with files partitioned by date and country. I want to check whether files are available for the current date and return the list of available prefixes. If no files exist yet, the flow should wait 5 minutes and check again. I implemented a straightforward workflow for that:

id: test_flow
namespace: test

labels:
  owner: me
  environment: local
  project: test

inputs:
  - id: startDate
    type: DATE
    required: false

tasks:
  - id: waitForFiles
    type: io.kestra.plugin.core.flow.WaitFor
    condition: "{{ outputs.listObjects.vars.prefixes | length > 0 }}"
    failOnMaxReached: true
    checkFrequency:
      interval: "PT5M"
      maxIterations: 3
    tasks:
      - id: listObjects
        type: io.kestra.plugin.aws.cli.AwsCLI
        accessKeyId: "XXXXX"
        secretKeyId: "XXXXX"
        region: "us-east-1"
        commands:
          - >-
            aws s3api list-objects-v2 \
              --bucket 'my-bucket' \
              --prefix 'files/date={{ trigger.date ?? inputs.startDate ?? execution.startDate | date('yyyy-MM-dd') }}/' \
              --delimiter '/' \
              --query 'reverse(CommonPrefixes[].[Prefix])' \
              --output 'text' \
             | sed 's/^/"/g' | sed 's/$/"/g' | tr '\n' ',' | sed 's/,$//' | sed 's/^/[/' | sed 's/$/]/' \
             | xargs -0 -I {} echo '::{"outputs":{"prefixes":{}}}::'
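
The tail of the command above turns the newline-separated prefixes into a JSON array and wraps it in the `::{"outputs":...}::` marker. The sketch below runs the same sed/tr stages against hard-coded sample prefixes so the transformation can be checked locally without AWS access. The sample prefixes and the function names are made up for illustration, and the final `xargs` step is replaced by a plain `echo`.

```shell
# Hypothetical sample of what `--output text` might print: one prefix per line.
sample_prefixes() {
  printf '%s\n' \
    'files/date=2024-11-11/country=US/' \
    'files/date=2024-11-11/country=DE/'
}

# Same stages as in the flow: quote each line, join the lines with commas,
# drop the trailing comma, then wrap the result in square brackets.
to_json_array() {
  sed 's/^/"/' | sed 's/$/"/' | tr '\n' ',' | sed 's/,$//' | sed 's/^/[/' | sed 's/$/]/'
}

result=$(sample_prefixes | to_json_array)

# Wrap the array in the output marker used by the flow.
echo "::{\"outputs\":{\"prefixes\":$result}}::"
```

Running this locally is a quick way to confirm that the pipeline itself is not what fails in the flow.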

I tried io.kestra.plugin.aws.s3.List, and it works fine, but I have more than 1000 files and don't want to implement the logic to extract common prefixes myself. That's why the CLI solution works better for me.
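
For comparison, the grouping that `--delimiter '/'` performs on the S3 side can be approximated client-side from a flat list of keys. This is only a rough sketch of that logic, assuming the keys arrive one per line; the sample keys and the `common_prefixes` helper are hypothetical and not part of Kestra or the AWS CLI.

```shell
# Hypothetical object keys, as a flat listing without --delimiter might return them.
sample_keys() {
  printf '%s\n' \
    'files/date=2024-11-11/country=US/part-0.csv' \
    'files/date=2024-11-11/country=US/part-1.csv' \
    'files/date=2024-11-11/country=DE/part-0.csv' \
    'files/date=2024-11-10/country=US/part-0.csv'
}

# Print each distinct "<prefix><segment>/" found directly under the given prefix.
common_prefixes() {
  awk -v p="$1" '
    index($0, p) == 1 {                      # keep only keys that start with the prefix
      rest = substr($0, length(p) + 1)       # part of the key after the prefix
      i = index(rest, "/")                   # position of the next delimiter, 0 if none
      if (i > 0) print p substr(rest, 1, i)  # prefix plus the next path segment
    }' | sort -u
}

prefixes=$(sample_keys | common_prefixes 'files/date=2024-11-11/')
echo "$prefixes"
```

With many objects per prefix, doing this on the client means listing every key first, which is the overhead the `--delimiter` approach avoids.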

I run Kestra locally with Docker Compose, using the latest available config file from here: https://github.com/kestra-io/kestra/blob/develop/docker-compose.yml.

I deploy my flow with the CLI and Docker container:

docker run --rm --network host --entrypoint "/bin/bash" -v $(pwd)/flows:/app/flows kestra/kestra ./kestra flow namespace update test flows/ --server http://localhost:8080

But when I trigger the flow from the UI, it fails with the following error:

kestra-1    | 2024-11-11 21:06:32,872 ERROR WorkerThread f.d.7.3t8TpEhQV1c2GJjOfBiyOi Status 407: Pull failed due to an unauthenticated request. Registry Access Management is enabled, which requires pulls to be authenticated. Please run `docker login`, or contact your administrators if this is unexpected.
kestra-1    | 
kestra-1    | com.github.dockerjava.api.exception.DockerException: Status 407: Pull failed due to an unauthenticated request. Registry Access Management is enabled, which requires pulls to be authenticated. Please run `docker login`, or contact your administrators if this is unexpected.
kestra-1    | 
kestra-1    |   at com.github.dockerjava.core.DefaultInvocationBuilder.execute(DefaultInvocationBuilder.java:249)
kestra-1    |   at com.github.dockerjava.core.DefaultInvocationBuilder.lambda$executeAndStream$1(DefaultInvocationBuilder.java:269)
kestra-1    |   at java.base/java.lang.Thread.run(Unknown Source)

I would appreciate any help you can give me.

Environment

Ben8t commented 2 days ago

By default, the io.kestra.plugin.aws.cli.AwsCLI task uses the amazon/aws-cli Docker image. Are you able to run docker pull amazon/aws-cli outside of Kestra first?

oleksii-suprun commented 2 days ago

Hi, @Ben8t. Thanks for the prompt reply. Yes, I can pull the amazon/aws-cli image without issues. Here is the output:

➜  ~ docker pull amazon/aws-cli
Using default tag: latest
latest: Pulling from amazon/aws-cli
0a62aca1c7d7: Pull complete 
c41bc4d4ec42: Pull complete 
fb7fad32fd00: Pull complete 
cd73a7325d2e: Pull complete 
0e94aaf222b2: Pull complete 
Digest: sha256:e6314637edba91c533fe1615d2d5302066944630a81b63df6c10657ac6b993ff
Status: Downloaded newer image for amazon/aws-cli:latest
docker.io/amazon/aws-cli:latest

What's next:
    View a summary of image vulnerabilities and recommendations → docker scout quickview amazon/aws-cli

oleksii-suprun commented 2 days ago

I also tried running the workflow with the image pulled manually beforehand, but the issue remains the same.


In addition, I tried the DinD-based compose file from here: https://github.com/kestra-io/kestra/blob/releases/v0.19.x/docker-compose-dind.yml. It has no issues pulling images, but it has a problem with AWS authentication: runs randomly fail with an AccessDenied error. After several reruns of the same flow without any modifications (subsequent clicks on the "Execute" button), it finishes successfully and returns the expected values. An example of the error message:

2024-11-11 21:12:13.819 An error occurred (AccessDenied) when calling the ListObjectsV2 operation: User: arn:aws:iam::XXXXXXXX:user/XXXXXXXX is not authorized to perform: s3:ListBucket on resource: "arn:aws:s3:::XXXXXXXX" with an explicit deny in an identity-based policy

Ben8t commented 2 days ago

OK, thanks :) I will try to reproduce this on my side to see if I can find the root cause 👍 Will keep you posted.