duckdb / duckdb_aws

MIT License

Support AWS_WEB_IDENTITY_TOKEN_FILE as credential provider #16

Closed: cpaika closed this issue 6 months ago

cpaika commented 8 months ago

I'm running DuckDB on an EKS pod, using IRSA to link a Kubernetes service account with an IAM role. The IAM role has full S3 permissions on a bucket.

What I'm observing:

  1. On the Kubernetes pod, these authentication environment variables are set:
    AWS_ROLE_ARN=<role arn>
    AWS_WEB_IDENTITY_TOKEN_FILE=/var/run/secrets/eks.amazonaws.com/serviceaccount/token
    AWS_STS_REGIONAL_ENDPOINTS=regional
    AWS_DEFAULT_REGION=us-west-2
    AWS_REGION=us-west-2
  2. Using the aws-cli on the Kubernetes pod, I'm able to read from and write to S3.
  3. In DuckDB I run CALL load_aws_credentials() and it sets an S3 access key, secret access key, and session token.
  4. When I attempt to read from the S3 bucket, it fails with a 403 Forbidden.
  5. When I export those credentials into my local CLI and run aws sts get-caller-identity, I see that DuckDB is using the IAM role of the EC2 instance the pod is running on, not the IAM role configured through AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE, which should be preferred in the AWS credential chain.
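Before digging into the provider chain, it can help to confirm that the web identity inputs from step 1 are actually visible to the process running DuckDB. The following is a minimal sketch using only the Python standard library; `check_irsa_env` is a hypothetical helper name, not part of DuckDB or any AWS tool, and it only checks that the variables and token file exist, not that the role can be assumed.

```python
import os
from pathlib import Path

# The two variables from the report that identify the IRSA role and its token.
REQUIRED_VARS = ("AWS_ROLE_ARN", "AWS_WEB_IDENTITY_TOKEN_FILE")


def check_irsa_env(env=None):
    """Return a list of problems; an empty list means the inputs look usable.

    Hypothetical helper for illustration only. Pass a dict to check a
    specific environment, or nothing to check the current process.
    """
    env = os.environ if env is None else env
    problems = []
    for name in REQUIRED_VARS:
        if not env.get(name):
            problems.append(f"{name} is not set")

    token_path = env.get("AWS_WEB_IDENTITY_TOKEN_FILE")
    if token_path:
        path = Path(token_path)
        if not path.is_file():
            problems.append(f"token file {token_path} does not exist")
        elif not path.read_text().strip():
            problems.append(f"token file {token_path} is empty")
    return problems
```

Calling `check_irsa_env()` inside the pod and getting an empty list would suggest the environment is fine and the problem lies in how the credentials are resolved.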

Does anyone know why DuckDB is favoring the EC2 IAM role instead of the AWS_WEB_IDENTITY_TOKEN_FILE? Or how to get it to use the AWS_WEB_IDENTITY_TOKEN_FILE as the primary IAM role?

samansmink commented 7 months ago

@cpaika thanks for reporting. The aws extension is relatively new and does not yet support more advanced control over how the credentials are loaded. It currently uses the AWS DefaultCredentialProvider chain, which detects credentials in a specific order. I suspect this is what's happening here.

More fine-grained control over which credentials are loaded is definitely on the todo list here!

cpaika commented 7 months ago

Agreed, it's likely the AWS DefaultCredentialProvider that is the problem. What's interesting is that, according to the docs, it should be resolving AWS_WEB_IDENTITY_TOKEN_FILE in the default provider chain: https://sdk.amazonaws.com/cpp/api/aws-cpp-sdk-es/html/md_aws_cpp_sdk_es__docs__credentials__providers.html

Maybe we're using an older version here? I checked the SDK, it should be using the AWS_WEB_IDENTITY_TOKEN_FILE before the EC2 metadata.
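To make the "older version" hypothesis concrete, here is a toy model of a first-match credential chain. It is not the SDK's code: the provider functions, the `ON_EC2` flag, and the returned strings are all made up for illustration, and nothing in this thread establishes that the web identity provider is actually missing from the chain DuckDB uses. The sketch only shows why a chain without that provider would produce the observed behaviour.

```python
from typing import Callable, List, Optional

# A provider inspects the environment and returns an identity, or None.
Provider = Callable[[dict], Optional[str]]


def from_web_identity(env: dict) -> Optional[str]:
    # Yields the IRSA role only when both inputs are present.
    if env.get("AWS_ROLE_ARN") and env.get("AWS_WEB_IDENTITY_TOKEN_FILE"):
        return "role from AWS_ROLE_ARN"
    return None


def from_ec2_metadata(env: dict) -> Optional[str]:
    # Stand-in for the instance profile; ON_EC2 is a made-up flag for this model.
    return "EC2 instance role" if env.get("ON_EC2") else None


def resolve(chain: List[Provider], env: dict) -> Optional[str]:
    """Return the identity from the first provider that yields one."""
    for provider in chain:
        identity = provider(env)
        if identity is not None:
            return identity
    return None


pod_env = {
    "AWS_ROLE_ARN": "<role arn>",
    "AWS_WEB_IDENTITY_TOKEN_FILE": "/var/run/secrets/eks.amazonaws.com/serviceaccount/token",
    "ON_EC2": True,
}

# Order claimed above: web identity is tried before EC2 metadata.
expected_chain = [from_web_identity, from_ec2_metadata]
# Hypothetical chain that lacks the web identity provider entirely.
chain_without_web_identity = [from_ec2_metadata]
```

In this model, `resolve(expected_chain, pod_env)` picks the role from AWS_ROLE_ARN, while `resolve(chain_without_web_identity, pod_env)` falls through to the EC2 instance role, which matches what get-caller-identity reported.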

j-hartshorn commented 7 months ago

I'm seeing this exact issue at the moment, except that because we have disabled use of the node credentials, I'm not seeing any credentials at all. Is this because of an older version of the AWS SDK? Sorry I'm not more experienced with cpp.

@cpaika @samansmink would it be possible to update the aws-cpp dependency to try and resolve this issue?

osalloum commented 7 months ago

I also tried to create a profile under ~/.aws/config and ~/.aws/credentials. I was able to get hard-coded access keys to load from credentials:

D call load_aws_credentials('some_profile', redact_secret=false);

However, nothing worked to get it to use the web identity token.
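One possible workaround, not confirmed anywhere in this thread, is to exchange the web identity token for temporary credentials yourself and hand them to DuckDB through its S3 settings instead of relying on load_aws_credentials(). The sketch below uses only the Python standard library and the STS AssumeRoleWithWebIdentity action, which does not require a signed request. The helper names and the `duckdb` session name are assumptions made for illustration, and the credentials it returns are temporary, so they would need refreshing before they expire.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET


def build_sts_request(role_arn, token, region, session_name="duckdb"):
    """Return (endpoint, form body) for an AssumeRoleWithWebIdentity call."""
    body = urllib.parse.urlencode({
        "Action": "AssumeRoleWithWebIdentity",
        "Version": "2011-06-15",
        "RoleArn": role_arn,
        "RoleSessionName": session_name,  # arbitrary label, assumed here
        "WebIdentityToken": token,
    }).encode("ascii")
    return f"https://sts.{region}.amazonaws.com/", body


def parse_credentials(xml_text):
    """Pull the three credential fields out of the STS XML response."""
    wanted = {"AccessKeyId", "SecretAccessKey", "SessionToken"}
    found = {}
    for element in ET.fromstring(xml_text).iter():
        tag = element.tag.rsplit("}", 1)[-1]  # drop the XML namespace prefix
        if tag in wanted and element.text:
            found[tag] = element.text
    missing = wanted - found.keys()
    if missing:
        raise ValueError(f"missing fields in STS response: {sorted(missing)}")
    return found


def duckdb_settings(creds, region):
    """Build SET statements for DuckDB's S3 settings."""
    def quote(value):
        return "'" + value.replace("'", "''") + "'"
    return [
        f"SET s3_region={quote(region)};",
        f"SET s3_access_key_id={quote(creds['AccessKeyId'])};",
        f"SET s3_secret_access_key={quote(creds['SecretAccessKey'])};",
        f"SET s3_session_token={quote(creds['SessionToken'])};",
    ]


def fetch_settings(env):
    """Read the token file, call STS (network access needed), return SET statements."""
    with open(env["AWS_WEB_IDENTITY_TOKEN_FILE"]) as handle:
        token = handle.read().strip()
    region = env["AWS_REGION"]
    endpoint, body = build_sts_request(env["AWS_ROLE_ARN"], token, region)
    # Passing data makes urllib send a form-encoded POST.
    with urllib.request.urlopen(endpoint, data=body, timeout=10) as response:
        xml_text = response.read().decode("utf-8")
    return duckdb_settings(parse_credentials(xml_text), region)
```

Inside the pod, `fetch_settings(os.environ)` (after `import os`) would return statements that can be run in the DuckDB session before reading from S3. Running aws sts get-caller-identity with the resulting keys should then show the IRSA role rather than the node's role.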