Limess closed this 4 years ago
This is the exact change we need to be able to use pipelinewise. Would love to see this merged.
@Limess are you using a pipeline that uses fastsync?
For me there's still a configuration validation error, and fastsync still requires the `aws_access_key_id` and the `aws_secret_access_key` in the configuration.
https://github.com/transferwise/pipelinewise/blob/master/pipelinewise/fastsync/mysql_to_redshift.py#L28-L29
https://github.com/transferwise/pipelinewise/blob/master/pipelinewise/fastsync/postgres_to_redshift.py#L28-L29
https://github.com/transferwise/pipelinewise/blob/master/pipelinewise/fastsync/commons/target_redshift.py#L13-L17
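Roughly speaking, those linked lines follow a pattern like the sketch below (this is a paraphrase from memory, not the actual fastsync code, and the function name is mine): the keys are read straight out of the config, so they have to be present.

```python
import boto3

# Paraphrased sketch of the pattern in fastsync's Redshift upload path
# (not the exact linked code): the keys are read unconditionally from the
# config, so omitting them either fails validation or hands None to boto3.
def make_s3_client(connection_config):
    return boto3.client(
        's3',
        aws_access_key_id=connection_config['aws_access_key_id'],
        aws_secret_access_key=connection_config['aws_secret_access_key'],
    )
```

Making those lookups optional and only forwarding the kwargs when they are actually set would let boto3 fall back to its default credential chain, which is what this PR does for the target itself.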
I may open a separate MR to make fastsync match the behavior of this MR.
No, we're not using fastsync; we're actually using this target outside of pipelinewise.
Thanks for the PR and for reporting the problem in fastsync.
@log0ymxm there is a pending PR that addresses the same problem in fastsync-redshift at https://github.com/transferwise/pipelinewise/pull/285. Once #32 is merged we'll release a new pipelinewise-target-redshift to PyPI and bump the version in the main pipelinewise so that everything stays in sync.
pipelinewise-target-redshift 1.2.0 with the fix is now available on PyPI at https://pypi.org/project/pipelinewise-target-redshift/
Once https://github.com/transferwise/pipelinewise/pull/285 is merged (very soon) then it will be available in pipelinewise as well.
Sigh, this still isn't working as expected for me - it was working in my fork previously (https://github.com/Limess/pipelinewise-target-redshift/blob/do-not-pass-creds/target_redshift/db_sync.py#L218-L223), where we were still using `boto3.client('s3', ...)`.
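For context, the difference I'm talking about is roughly this (a sketch, not the exact code from either version):

```python
import boto3

# What my fork did: build the client directly. Omitting the key kwargs
# entirely leaves credential resolution to boto3's default chain.
s3 = boto3.client('s3')

# What db_sync.py does now (roughly): build a session first, then a client.
# Passing explicit None keys here should, as far as I can tell, behave the
# same as omitting them - which is exactly what I'm trying to confirm.
session = boto3.session.Session(
    aws_access_key_id=None,
    aws_secret_access_key=None,
)
s3_from_session = session.client('s3')
```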
I'm still seeing this:
[2019-12-05 12:11:13,798] {{bash_operator.py:128}} INFO - boto3.exceptions.S3UploadFailedError: Failed to upload /tmp/tmp5ysju26l.csv.gz.1 to analytics-redshift-sink.my-bucket/__tmp/pipelinewise_public-my-stream_20191205-121113-233926.csv.gz.1: An error occurred (InvalidAccessKeyId) when calling the PutObject operation: The AWS Access Key Id you provided does not exist in our records.
when omitting credentials.
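For what it's worth, this is how I'm checking which credentials boto3 actually resolves inside the container (the `container-role` / `iam-role` method names are what I remember from botocore's providers, so treat them as an assumption):

```python
import boto3

# Inspect which provider in the default credential chain supplied the
# credentials; on ECS I'd expect 'container-role', on plain EC2 'iam-role'.
creds = boto3.session.Session().get_credentials()
print(creds.method if creds else 'no credentials resolved')
```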
I'll look into the differences. I rebased with the latest master and I believe the boto3 session handling changed slightly, so I matched it before making this PR as it looked innocuous. I suspect it may be something to do with boto3 clients vs. sessions.
Sorry for the repeated PRs on this; I'd like to get this fixed, but it's proving more difficult than expected 🤷‍♂️
No problem. boto3 is currently pinned to 1.9.188, which is not too old but far from the latest. Feel free to send a PR once you've found something.
Fix ECS credentials not being correctly picked up by boto3 when no keys are provided.
This is a fix I've applied in our production environment in order to use ECS credentials rather than AWS keys. Without it boto3 could not find the credentials as expected.
I'm unsure why this is necessary - I've looked through the boto3 credentials code and I can't see anything which would mean that `None` values for the keys would result in ECS credentials not being used, and I think EC2 instance credentials are picked up without this change (hence not including this tweak in the original change I made).
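For reference, the pattern the fix follows is roughly this (a minimal sketch; the function name is mine, and the `aws_session_token` key is an assumption):

```python
import boto3

def create_s3_session(config):
    """Minimal sketch: only pass explicit keys to boto3 when both are set.

    With no explicit keys, boto3 falls back to its default credential chain
    (environment variables, shared config files, ECS container credentials,
    EC2 instance profile, ...), which is what we rely on in production.
    """
    aws_access_key_id = config.get('aws_access_key_id')
    aws_secret_access_key = config.get('aws_secret_access_key')

    if aws_access_key_id and aws_secret_access_key:
        return boto3.session.Session(
            aws_access_key_id=aws_access_key_id,
            aws_secret_access_key=aws_secret_access_key,
            aws_session_token=config.get('aws_session_token'),
        )

    # No keys configured: let boto3 resolve ECS/EC2/env credentials itself.
    return boto3.session.Session()
```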