transferwise / pipelinewise-target-redshift

Singer.io Target for Amazon Redshift - PipelineWise compatible
https://transferwise.github.io/pipelinewise/
Other
12 stars 65 forks source link

Support AWS IAM service role #231

Open jpuris opened 1 year ago

jpuris commented 1 year ago

Context

When run on AWS services such as EC2 and Lambda, one does not require to provide AWS credentials as those can be assumed via the service metadata and role attached to the relevant service i.e. EC2.

This PR adds the boto3 session that will attempt to assume role, if no AWS credentials are found via environmental variables.

Checklist

I do not have access to the Baseline Security Requirements document 🤷

Bakrog commented 1 year ago

Why don't you let Boto3 do this configuration automatically? You could remove code and give users more configuration options as per documentation: https://boto3.amazonaws.com/v1/documentation/api/latest/guide/credentials.html#configuring-credentials

The order in which Boto3 searches for credentials is:

. Passing credentials as parameters in the boto.client() method
. Passing credentials as parameters when creating a Session object
. Environment variables
. Shared credential file (~/.aws/credentials)
. AWS config file (~/.aws/config)
. Assume Role provider
. Boto2 config file (/etc/boto.cfg and ~/.boto)
. Instance metadata service on an Amazon EC2 instance that has an IAM role configured.
jpuris commented 1 year ago

Please elaborate.

If no explicit config or env vars for AWS ((KEY & SECRET) | PROFILE) are found, only then it will default to the automatic configuration of boto3 (which falls back to what I needed - IAM role assume from AWS Service metadata)

Are you asking for removal of ability to specify the values in target config? If so, why?

Bakrog commented 1 year ago

What I mean is that you don't need this code:

if aws_access_key_id and aws_secret_access_key:
    aws_session = boto3.session.Session(
        aws_access_key_id=aws_access_key_id,
        aws_secret_access_key=aws_secret_access_key,
        aws_session_token=aws_session_token
    )
elif aws_profile:
  aws_session = boto3.session.Session(profile_name=aws_profile)
else:
  aws_session = boto3.session.Session()

just aws_session = boto3.session.Session() is already enough. Try to execute the following code in your environment (setting the needed env variables before executing it) and you'll see that it executes as expected:

import boto3

session = boto3.session.Session()

s3 = session.client('s3')
response = s3.list_buckets()

# Output the bucket names
print('Existing buckets:')
for bucket in response['Buckets']:
    print(f'  {bucket["Name"]}')
jpuris commented 1 year ago

I disagree.. what you are proposing would work just fine, however, it would also remove the ability to configure the credentials / profile name via target config file.

While I would agree that having these configurable via a clear text file is a horrible practice, this PR does not intend to solve for argument.