okigan / awscurl

curl-like access to AWS resources with AWS Signature Version 4 request signing.
MIT License
755 stars 94 forks source link

Does not support EC2 Instance Profiles #55

Closed tinyzimmer closed 5 years ago

tinyzimmer commented 5 years ago

First off, very neat project!

In regards to func load_aws_config. Is there a specific reason you use a ConfigParser?

If you used boto3 you'd end up supporting most credential providers implicitly.

boto3.Session().get_credentials().get_frozen_credentials()

You could construct the session object with a profile and/or keys if provided, or otherwise allow boto to default to its own chain of finding credentials. This would in turn allow it to work from an EC2 instance with an IAM instance profile while maintaining the existing functionality.

david-house-harvard commented 5 years ago

We would really like to be able to use this project, but the lack of support for instance profiles means that we are going to end up using https://www.npmjs.com/package/aws-es-curl

tinyzimmer commented 5 years ago

It was simple enough to do on a fork real quick, should you want to play with it: https://github.com/tinyzimmer/awscurl

I basically just added a --discover-credentials switch. So the default is to use the ConfigParser, but with that switch it'll just follow the boto3 hierarchy instead.

$> awscurl --service es --discover-credentials ...

Though in my head I feel like that should just be the default method 😛. Since if a credentials file is what they got then that's what it uses and the get_credentials() will just give you those.

david-house-harvard commented 5 years ago

@okigan Would you be open to a PR to enable support for EC2 instance profiles?

tinyzimmer commented 5 years ago

If we go full PR, I'm inclined to take out the ConfigParser entirely, unless there is in fact a specific reason it's used instead. Because you could still support the command line switches such as --profile and all you need to do is pass those **kwargs to the boto3.Session().

The only reason I can see for avoiding it, is it's a chunky dep that gets updated frequently sometimes breaking backwards compatibility. But the credentials stuff has been the same since I started using AWS 4 or 5 years ago. I don't suspect it'll change anytime soon.

EDIT: Though I dunno if @okigan is still even working on this. I was just googling around and came upon this - https://github.com/DavidMuller/aws-requests-auth - which is basically a python requests adapter that does the same thing, and can do boto discovery. This still has value just for the curl like interface, but using a library like that would render 60% of the code in here worthless.

okigan commented 5 years ago

yes the reason is to avoid boto dependency - i feel this is conflicts with the scope of the awscurl (if you have boto just use boto directly, no?)

okigan commented 5 years ago

@tinyzimmer check out this PR as well, maybe botocore is more acceptable dependency: https://github.com/okigan/awscurl/pull/54/files

tinyzimmer commented 5 years ago

Yea somehow I completely missed that before. It's the same idea, a tad more lightweight to just use botocore.

Perhaps merging the two ideas and making it an optional install_requires? i think that's possible?

Then you could use a --boto flag or something to use that method and enable things like instance profiles. Wrap the import in a try/catch like I did on my fork (but with botocore) so that the whole thing doesn't bork if it's not there.

okigan commented 5 years ago

@tinyzimmer might just get way with doing this: https://gist.github.com/iMilnb/ab9939e83168d6df6457e50b0ca73c78

tinyzimmer commented 5 years ago

Yea that works too, only downside would be if they make a change to the metadata apis. Botocore would have the benefit of having upstream support for any changes like that (probably).

okigan commented 5 years ago

@tinyzimmer hmm, i think this story will continue to evolve, for now created this PR (fallback to botocore... 😱): #63

david-house-harvard commented 5 years ago

While I definitely see the argument against instilling a loose dependency on boto, I think adding the dependency makes a lot of sense for our use case. In our migration from self-managed Elasticsearch to AWS-managed Elasticsearch we found that specific REST API calls against the managed service required AWS request authentication. This broke some of our tooling, which only uses curl. In order for us to reduce the change scope, we needed a tool like awscurl.

While we already have access to boto on the instance, the effort required to migrate to managed Elasticsearch combined with what would be an effort to rewrite our tooling in Python was too great given time constraints.

Without the support of instance profiles, it ended up being easier to say "let's just install node so we can use aws-es-curl and deal with this when we're ready to rewrite our data management tooling. Support for instance profiles would be a huge help to us. Thank you so much for putting out the PR! I'll be tracking this closely :).

tinyzimmer commented 5 years ago

Lol I just use IP based policies to get around that in my current setup. It's only accessible in the VPCs and over our VPN. But that way you don't require request signing and can use curl normally. Not saying it's ideal, but it's an option. You could slap a reverse proxy in front of it for basic auth.

On Wed, Jun 12, 2019, 6:26 AM David House notifications@github.com wrote:

While I definitely see the argument against instilling a loose dependency on boto, I think adding the dependency makes a lot of sense for our use case. In our migration from self-managed Elasticsearch to AWS-managed Elasticsearch we found that specific REST API calls against the managed service required AWS request authentication. This broke some of our tooling, which only uses curl. In order for us to reduce the change scope, we needed a tool like awscurl.

While we already have access to boto on the instance, the effort required to migrate to managed Elasticsearch combined with what would be an effort to rewrite our tooling in Python was too great given time constraints.

Without the support of instance profiles, it ended up being easier to say "let's just install node so we can use aws-es-curl and deal with this when we're ready to rewrite our data management tooling. Support for instance profiles would be a huge help to us. Thank you so much for putting out the PR! I'll be tracking this closely :).

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/okigan/awscurl/issues/55?email_source=notifications&email_token=AJFREM34DKJZROSHNCVGRA3P2D2PFA5CNFSM4G2LGUM2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGODXQM4KQ#issuecomment-501272106, or mute the thread https://github.com/notifications/unsubscribe-auth/AJFREM6MCTCST7LFBLAIKK3P2D2PFANCNFSM4G2LGUMQ .

david-house-harvard commented 5 years ago

@tinyzimmer Have you tried configuring s3 snapshots? I think we have a pretty similar setup, the cluster is only available in the VPC and a security group grants access to our EC2 Instances and our VPN. We are able to perform regular curl requests for all REST endpoints except for PUT on the _snapshot endpoint, which requires a role_arn authorized to perform operations against an s3 bucket.

See: https://docs.aws.amazon.com/elasticsearch-service/latest/developerguide/es-managedomains-snapshots.html#es-managedomains-snapshot-registerdirectory

tinyzimmer commented 5 years ago

That I have not. But I might give it a try later to see. That would make sense since it hits an extra service. My cluster is only used for logging and I nuke everything older than 30 days in it and use CloudWatch logs for "cold storage." So I haven't really worried about backups or anything.

okigan commented 5 years ago

Resolved in https://github.com/okigan/awscurl/pull/63

okigan commented 5 years ago

release 0.17 is on pypi - closing this thread - open new one for further developments