Open aaronsteers opened 1 year ago
I wonder if you can use https://github.com/davidmuller/aws-requests-auth with @edgarrmondragon's recent auth changes that allow for `requests` custom auth functions to be used
@visch Yes, I was wondering something similar. There are (I think) two general paths here: the first is to use `boto` for auth, and the second is to use `requests` library authenticators like https://github.com/tedder/requests-aws4auth or https://github.com/davidmuller/aws-requests-auth, which you reference.
From this SDK docs page, there's a reference to a custom `requests` authenticator:
https://sdk.meltano.com/en/latest/code_samples.html#use-one-of-requests-s-built-in-authenticators
In addition to requests.auth classes, the community has published a few packages with custom authenticator classes, which are compatible with the SDK. For example:
- requests-aws4auth: AWS v4 authentication
- requests_auth: A collection of authenticators for various services and protocols including Azure, Okta and NTLM.
There are (I think) two general paths here
Important to call out these aren't mutually exclusive. Some use cases will probably need `boto` anyway, while some others might prefer to use `requests`, and (maybe??) some taps would want both? I can't think of use cases that would need both, but I can imagine some developers could prefer the `requests` library and others might want to use `boto`.
For reference, here's some prior art built upon `boto`:
From @pnadolny13's https://github.com/MeltanoLabs/tap-cloudwatch, also built on SDK. The auth implementation should generically work for other AWS services if refactored into a generic AWS auth class.
I think the two could easily live together:
A boto-based connector class would be nice since that'd only need to expose the existing boto3 APIs, so no parsing of XML or JSON would be required.
On the other hand, something like https://github.com/tedder/requests-aws4auth could help even existing taps and targets like https://github.com/dtmirizzi/target-elasticsearch authenticate directly via AWS.
Now that we have an auth interface (i.e. any callable that accepts and returns a mutated prepared request), it might be time to start thinking of formalizing the connector interface too. That way folks would not even need to submit connectors to the SDK, and they could live as standalone packages. Later, we could still port them to be "officially" supported or just move them to our GitHub org.
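To make the "any callable that accepts and returns a mutated prepared request" idea concrete, here is a minimal sketch of a standalone AWS SigV4 signer written against that interface, using only the standard library. The class name `SigV4Auth`, its constructor arguments, and the optional `now` parameter are hypothetical and not part of the SDK. It assumes the URL path is already percent-encoded, does no path normalization, and does not handle streamed bodies, so a maintained package such as requests-aws4auth is likely the better choice in a real tap.

```python
import datetime
import hashlib
import hmac
from urllib.parse import parse_qsl, quote, urlsplit


def _hmac_sha256(key: bytes, msg: str) -> bytes:
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()


class SigV4Auth:
    """Hypothetical callable authenticator: signs a prepared request in place."""

    def __init__(self, access_key, secret_key, region, service, session_token=None):
        self.access_key = access_key
        self.secret_key = secret_key
        self.region = region
        self.service = service
        self.session_token = session_token

    def __call__(self, request, now=None):
        # `now` is only here so the sketch can be exercised deterministically.
        now = now or datetime.datetime.now(datetime.timezone.utc)
        amz_date = now.strftime("%Y%m%dT%H%M%SZ")
        date_stamp = now.strftime("%Y%m%d")

        parts = urlsplit(request.url)
        body = request.body or b""
        if isinstance(body, str):
            body = body.encode("utf-8")
        payload_hash = hashlib.sha256(body).hexdigest()

        request.headers["x-amz-date"] = amz_date
        request.headers["x-amz-content-sha256"] = payload_hash
        if self.session_token:
            request.headers["x-amz-security-token"] = self.session_token

        # Sign the host plus every x-amz-* header.
        to_sign = {"host": parts.netloc}
        for name, value in request.headers.items():
            if name.lower().startswith("x-amz-"):
                to_sign[name.lower()] = str(value).strip()
        signed_headers = ";".join(sorted(to_sign))
        canonical_headers = "".join(f"{k}:{to_sign[k]}\n" for k in sorted(to_sign))

        # Canonical query string: percent-encoded pairs, sorted.
        pairs = sorted(
            (quote(k, safe="-_.~"), quote(v, safe="-_.~"))
            for k, v in parse_qsl(parts.query, keep_blank_values=True)
        )
        canonical_query = "&".join(f"{k}={v}" for k, v in pairs)

        canonical_request = "\n".join([
            request.method.upper(),
            parts.path or "/",
            canonical_query,
            canonical_headers,
            signed_headers,
            payload_hash,
        ])

        scope = f"{date_stamp}/{self.region}/{self.service}/aws4_request"
        string_to_sign = "\n".join([
            "AWS4-HMAC-SHA256",
            amz_date,
            scope,
            hashlib.sha256(canonical_request.encode("utf-8")).hexdigest(),
        ])

        # Derive the signing key: date -> region -> service -> "aws4_request".
        key = ("AWS4" + self.secret_key).encode("utf-8")
        for part in (date_stamp, self.region, self.service, "aws4_request"):
            key = _hmac_sha256(key, part)
        signature = hmac.new(key, string_to_sign.encode("utf-8"), hashlib.sha256).hexdigest()

        request.headers["Authorization"] = (
            f"AWS4-HMAC-SHA256 Credential={self.access_key}/{scope}, "
            f"SignedHeaders={signed_headers}, Signature={signature}"
        )
        return request
```

Because it only touches `method`, `url`, `headers`, and `body`, an instance like this could in principle be passed wherever a `requests`-style auth callable is accepted, and shipped as a standalone package rather than inside the SDK.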
I took a stab at implementing something like a boto authenticator in the DynamoDB tap that I'm working on. I made an authenticator class that manages authenticating a bunch of different ways depending on the inputs: https://github.com/MeltanoLabs/tap-dynamodb/blob/main/tap_dynamodb/aws_authenticators.py. Then my dynamo implementation https://github.com/MeltanoLabs/tap-dynamodb/blob/main/tap_dynamodb/dynamo.py simply accesses the clients it needs without worrying about how to auth with them, i.e. `self.resource`.
The challenges I commonly see are with taps/targets having varying support for auth configs (i.e. keys, session token, profile, config/credentials, environment variables, etc.). On top of that, users sometimes want a custom endpoint_url when using resources that they're mocking with localstack. Also, for our use case I'll need to assume another role, which I haven't seen anywhere, but I've added it here. I put some comments on my opinions around handling configs vs. env vars etc. in https://github.com/MeltanoLabs/tap-dynamodb/pull/3#issue-1662996922 (I've refactored again since then but the comments still stand). The TL;DR is that I'd love to require the configs to be explicit to some degree. I've seen weird behavior when a tap finds credentials in my env vars or a default config file on my machine, so I think we can avoid that by being explicit.
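As a rough illustration of the "be explicit" approach, the sketch below maps tap config onto `boto3.Session` keyword arguments and refuses to fall back to ambient credentials unless asked. All config keys (`aws_profile`, `aws_access_key_id`, `use_aws_env_vars`, `aws_endpoint_url`, `aws_assume_role_arn`, and so on), the function names, and the `tap-session` role session name are hypothetical; this is not a copy of what tap-dynamodb does. `boto3` is imported lazily so the pure helpers can be used without it installed.

```python
from typing import Any, Dict


def build_session_kwargs(config: Dict[str, Any]) -> Dict[str, Any]:
    """Translate explicit (hypothetical) tap settings into boto3.Session kwargs."""
    if config.get("aws_profile"):
        kwargs: Dict[str, Any] = {"profile_name": config["aws_profile"]}
    elif config.get("aws_access_key_id") and config.get("aws_secret_access_key"):
        kwargs = {
            "aws_access_key_id": config["aws_access_key_id"],
            "aws_secret_access_key": config["aws_secret_access_key"],
        }
        if config.get("aws_session_token"):
            kwargs["aws_session_token"] = config["aws_session_token"]
    elif config.get("use_aws_env_vars"):
        # Opt-in only: let boto3's default credential chain run.
        kwargs = {}
    else:
        raise ValueError(
            "No explicit AWS credentials configured; set a profile, keys, "
            "or use_aws_env_vars=true."
        )
    if config.get("aws_default_region"):
        kwargs["region_name"] = config["aws_default_region"]
    return kwargs


def build_client_kwargs(config: Dict[str, Any]) -> Dict[str, Any]:
    """Extra client kwargs, e.g. a custom endpoint for localstack."""
    if config.get("aws_endpoint_url"):
        return {"endpoint_url": config["aws_endpoint_url"]}
    return {}


def get_client(config: Dict[str, Any], service_name: str):
    """Build a client, optionally assuming another role first."""
    import boto3  # third-party; imported here so the helpers above work without it

    session = boto3.Session(**build_session_kwargs(config))
    if config.get("aws_assume_role_arn"):
        sts = session.client("sts", **build_client_kwargs(config))
        creds = sts.assume_role(
            RoleArn=config["aws_assume_role_arn"],
            RoleSessionName="tap-session",  # hypothetical session name
        )["Credentials"]
        session = boto3.Session(
            aws_access_key_id=creds["AccessKeyId"],
            aws_secret_access_key=creds["SecretAccessKey"],
            aws_session_token=creds["SessionToken"],
            region_name=session.region_name,
        )
    return session.client(service_name, **build_client_kwargs(config))
```

Raising when nothing is configured is the design choice that matters here: it avoids a tap silently picking up credentials from env vars or a default config file on the machine.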
It would be cool if we could inject all of these aws config options into the tap automatically when the authenticator is in use similar to stream_maps/stream_map_config/etc.
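One way this injection could look, sketched with plain JSON Schema dicts rather than any real SDK hook: a helper merges a shared set of AWS settings into a tap's config schema, letting tap-defined settings win on name clashes. The names `AWS_CONFIG_PROPERTIES` and `with_aws_config`, and the setting names themselves, are hypothetical; the SDK does not necessarily expose anything like this.

```python
import copy
from typing import Any, Dict

# Hypothetical shared AWS settings an authenticator might contribute.
AWS_CONFIG_PROPERTIES: Dict[str, Dict[str, Any]] = {
    "aws_access_key_id": {"type": ["string", "null"], "description": "AWS access key ID"},
    "aws_secret_access_key": {"type": ["string", "null"], "description": "AWS secret access key"},
    "aws_session_token": {"type": ["string", "null"], "description": "AWS session token"},
    "aws_profile": {"type": ["string", "null"], "description": "Named AWS profile"},
    "aws_default_region": {"type": ["string", "null"], "description": "AWS region"},
    "aws_endpoint_url": {"type": ["string", "null"], "description": "Custom endpoint URL"},
    "aws_assume_role_arn": {"type": ["string", "null"], "description": "Role ARN to assume"},
}


def with_aws_config(schema: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of a config JSON schema with the AWS settings merged in."""
    merged = copy.deepcopy(schema)
    properties = merged.setdefault("properties", {})
    for name, spec in AWS_CONFIG_PROPERTIES.items():
        # Settings already declared by the tap take precedence.
        properties.setdefault(name, copy.deepcopy(spec))
    return merged
```

A tap would then declare only its own settings and wrap them, e.g. `with_aws_config({"type": "object", "properties": {"table_name": {"type": "string"}}})`.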
I am using the AWS Authenticator (AWS4AUTH). Here is a link to a working example with the Meltano SDK.
Just referencing the AWS Auth that I implemented in tap-rest-api-msdk if we want to bake this into the Meltano SDK.
Calling the AWS Authenticator : https://github.com/Widen/tap-rest-api-msdk/blob/f4eeb54446f181336b7c34b25821ce23b3cefeb5/tap_rest_api_msdk/auth.py#L254-L265
Definition for the AWS Authentication class: https://github.com/Widen/tap-rest-api-msdk/blob/f4eeb54446f181336b7c34b25821ce23b3cefeb5/tap_rest_api_msdk/auth.py#L17-L114
A cool idea from Matt in Slack: https://meltano.slack.com/archives/C01PKLU5D1R/p1678466724971049
We have an abstraction layer for authenticators in general, but to my knowledge we've never extended that metaphor for AWS services or other cloud services auth.
https://sdk.meltano.com/en/latest/reference.html#authenticator-classes