boto / botocore

The low-level, core functionality of boto3 and the AWS CLI.
Apache License 2.0

Support programmatic AWS SSO authentication in botocore v1 without AWS CLI v2 #1923

Open benkehoe opened 4 years ago

benkehoe commented 4 years ago

~I'm looking forward to the AWS CLI v2 allowing aws login to connect with AWS SSO. I'm also excited that the auth token retrieved by the CLI is managed by botocore, because it means scripts using the boto3 SDK can be run with the identity from aws login. However, it would be great if scripts could manage their auth directly without involving the CLI.~

With botocore 1.17, support for loading credentials cached by aws sso login has been added. This still means that Python applications cannot initiate AWS SSO auth, requiring their users to also install and understand the AWS CLI.

I'm asking for three things:

  1. Make the SSOTokenFetcher.on_pending_auth hook a provider-based system like the one for credentials.
  2. Move the browser provider from the CLI into botocore, so a script using boto3 can use browser-based authentication (I guess I would also want a provider that only ever did the "print url+code instructions" too)
  3. Add a process provider like credentials have, so that users can implement alternative auth schemes against their IdP (for example, a browserless flow on headless systems like EC2 instances).
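
To make request 1 concrete, here is a minimal sketch of what a provider-based pending-auth system might look like. Every name here (`PendingAuthResolver`, `BrowserProvider`, `PrintInstructionsProvider`, the `load` method, and the handler keyword arguments) is hypothetical and only mirrors the general shape of a credential provider chain; none of it is an existing botocore or AWS CLI interface.

```python
import webbrowser


class PrintInstructionsProvider:
    """Hypothetical provider: always yields a handler that prints URL + code."""

    def __init__(self, out=print):
        self._out = out

    def load(self):
        def handler(user_code, verification_uri, **_):
            self._out(f"Open {verification_uri} and enter the code {user_code}")
        return handler


class BrowserProvider:
    """Hypothetical provider: yields a handler that opens the verification URL."""

    def __init__(self, open_url=webbrowser.open_new_tab, available=True):
        self._open_url = open_url
        self._available = available  # e.g. False on a headless machine

    def load(self):
        if not self._available:
            return None  # let the next provider in the chain try

        def handler(user_code, verification_uri,
                    verification_uri_complete=None, **_):
            # Prefer the URL that already embeds the user code, if given.
            self._open_url(verification_uri_complete or verification_uri)
        return handler


class PendingAuthResolver:
    """Walk the providers in order and return the first handler offered."""

    def __init__(self, providers):
        self._providers = list(providers)

    def resolve(self):
        for provider in self._providers:
            handler = provider.load()
            if handler is not None:
                return handler
        raise RuntimeError("no pending-auth provider could supply a handler")
```

A script on a headless host could then build `PendingAuthResolver([BrowserProvider(available=False), PrintInstructionsProvider()])` and would fall through to the printed instructions. A process-based provider (request 3) would be one more entry in the same list.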

joguSD commented 4 years ago

Moving the SSO provider into botocore is definitely on our radar, specifically the first note about being able to use the identity from an aws login. The latter suggestions in 1/2 are also definitely considerations we're thinking about as well. Right now we're letting things bake in CLI v2 for a bit to see how they go, as we have the flexibility of a non-GA product in v2. Once we're comfortable that the interfaces we put into CLI v2 feel right, our intention is to put them in the SDK. We don't have a concrete timeline for when this will be done though.

As for point 3, could you expand on what you mean by a process provider for alternative auth schemes? Pending what you mean this may require additional support from the SSO API.

benkehoe commented 4 years ago

I'm slightly confused—the CLI v2 is not yet GA, but the support it requires from botocore is also in a v2 branch; don't they go hand-in-hand in terms of maturity/GA timeline?

As an example for a process provider, people implement screen-scraping against ADFS to be able to prompt people for passwords inside the terminal. The AWS security blog has even featured such solutions. This would still be possible given device flow for SSO.

joguSD commented 4 years ago

To clarify, I meant that we have the intention of bringing the SSO credential provider into the current major version of botocore (v1). At the very least the ability to read the cached SSO credentials from an aws2 login, and potentially more like some of the features you're mentioning (hooks/providers for managing/logging into an sso session).

benkehoe commented 4 years ago

Checking back in on this as the CLI v2 has gone GA.

benkehoe commented 3 years ago

I'm glad the ability to read cached AWS SSO credentials has been brought into botocore 1.17, but I'd still like to see this support grow to the original scope of this issue, which is that an SDK-using application could trigger dispatch to the browser for authentication, so that the user doesn't have to also use the AWS CLI. At the very least, bringing SSOTokenFetcher in would be useful.

davidfluck-tri commented 3 years ago

I would like to voice support for this as well. We're exploring AWS SSO and it would be fantastic if botocore could kick off the SSO auth flow so we don't have to either redirect users to aws sso login or re-implement a lot of botocore's existing credential-handling functionality (assuming I understand correctly).

joguSD commented 3 years ago

Following up on this now that the provider has been back ported into botocore v1.

For now, we have no immediate plans to expose the interfaces to programmatically resolve SSO credentials and store them in the shared caching location beyond the credential provider. There are a few reasons we've decided to keep these private, and here are some of the highlights:

1) Given the asynchronous nature of resolving / using the SSO credentials it's potentially problematic to have multiple processes attempting to perform the login flow at once. Theoretically, this is possible even with a single tool (i.e. CLI V2), just a lot less likely given that the login flow is explicitly invoked rather than some implicit attempt to refresh.

2) Currently, SSO login attempts need to be explicitly requested by the user (via CLI V2) and not implicitly invoked as part of an attempt to use those credentials. This has the benefit of vastly simplifying the credential provider, as the process of performing an SSO login is not an implicit part of being a compliant provider. Additionally, this falls more in line with the guidance in the device authorization RFC:

To avoid unneeded requests on the token endpoint, the client SHOULD only
commence a device authorization request when prompted by the user and not
automatically, such as when the app starts or when the previous authorization
session expires or fails.

3) Individual tools concerned with using a profile configured to use SSO credentials all ensuring the underlying login token is up to date and prompting for login flows seems like an anti-pattern. As the credentials are shared, any process using said credentials will all attempt to initiate the login flow without the knowledge of any other tool attempting to do so. The end result is the user being prompted by multiple tools to complete the login, which seems annoying at best, and hostile/broken at worst. By adding and supporting public interfaces for this in the SDK we're promoting scenarios where multiple tools think they're responsible for refreshing the login session. Ultimately, we want to limit the number of tools concerned with performing this logic.
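
For readers unfamiliar with the device authorization flow referenced in point 2, the following is a rough sketch of one explicitly requested login. It assumes a client object shaped like the boto3 `sso-oidc` client (`register_client`, `start_device_authorization`, `create_token`, and the `AuthorizationPendingException` / `SlowDownException` classes under `client.exceptions`). The function name `device_login` and the `notify` and `sleep` parameters are hypothetical, and the sketch deliberately does not write to the shared cache, which is the part kept private as described above.

```python
import time

# Grant type defined by the OAuth 2.0 device authorization grant (RFC 8628).
DEVICE_GRANT = "urn:ietf:params:oauth:grant-type:device_code"


def device_login(oidc, start_url, notify,
                 client_name="example-sso-login", sleep=time.sleep):
    """Run one user-requested device flow and return the token response.

    `oidc` is assumed to behave like boto3.client("sso-oidc");
    `notify(url, code)` tells the user where to approve the request.
    """
    registration = oidc.register_client(
        clientName=client_name, clientType="public"
    )
    client_id = registration["clientId"]
    client_secret = registration["clientSecret"]

    auth = oidc.start_device_authorization(
        clientId=client_id, clientSecret=client_secret, startUrl=start_url
    )
    notify(auth["verificationUriComplete"], auth["userCode"])

    interval = auth.get("interval", 5)          # seconds between polls
    deadline = time.time() + auth["expiresIn"]  # device code lifetime
    while time.time() < deadline:
        sleep(interval)
        try:
            return oidc.create_token(
                clientId=client_id,
                clientSecret=client_secret,
                grantType=DEVICE_GRANT,
                deviceCode=auth["deviceCode"],
            )
        except oidc.exceptions.AuthorizationPendingException:
            continue       # user has not approved yet; keep polling
        except oidc.exceptions.SlowDownException:
            interval += 5  # RFC 8628: back off by 5 seconds
    raise TimeoutError("device authorization expired before approval")
```

With boto3 installed this would presumably be called as `device_login(boto3.client("sso-oidc", region_name=...), start_url, notify)`. Because the call is made only when the user asks for it, it stays within the RFC guidance quoted above.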

Realistically, only one process needs to refresh the underlying SSO login session to ensure the credential provider can exchange for credentials using the shared cached session token. That being said, I definitely agree that there are improvements to be made on the current workflow. Ideally, in my mind this would be some kind of daemon that monitors login sessions and can notify the user when their credentials may be close to expiration to refresh the login session (e.g. via notifications, the task/menu bar, etc).

As a question, is there anything specifically about using the CLI v2 to perform the SSO login that's a hard blocker?

benkehoe commented 3 years ago

All of these concerns make sense.

The problem with relying on the CLI v2 to perform SSO login is that a program a developer creates may not have CLI users as its audience (e.g., a GUI application), and even if it does, the AWS CLI v2 has an idiosyncratic install process that will be a completely divergent step in the program setup process.

I'd be happy with an AWS SSO daemon, especially if its installer was more friendly for users less familiar with the CLI. I'd be happy to help build such a thing, in fact.

davidfluck-tri commented 3 years ago

Thanks for your response, @joguSD, it makes a lot of sense. I'm not blocked at the moment, no.

I would also be curious to at least follow along with development of a daemon (if not contribute a PR or two).

benkehoe commented 3 years ago

@joguSD I made a prototype of a GUI application that refreshes AWS SSO tokens. https://github.com/benkehoe/aws-sso-login-gui

0xW1sKy commented 3 years ago

@joguSD @benkehoe just in case this helps, here is my method for solving this kind of issue. I include an expiration timestamp in the storage with the token, so you could check that timestamp to alert for expiration: https://github.com/0xW1sKy/PowerShellTools/blob/master/Reference/Get-AWSSSORoleCredential.py
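
As a rough illustration of checking that timestamp, the sketch below scans a cache directory for token files that are expired or close to expiring. It assumes each token file is JSON with `accessToken` and `expiresAt` keys, and that `expiresAt` is a UTC timestamp ending in `Z` or `UTC`. The function names and the 15-minute default are hypothetical, and other timestamp formats would need extra handling.

```python
import json
from datetime import datetime, timedelta, timezone
from pathlib import Path


def parse_expires_at(value):
    """Parse an expiresAt string. Assumption: UTC, suffixed 'Z' or 'UTC'."""
    cleaned = value.replace("UTC", "").rstrip("Z")
    parsed = datetime.strptime(cleaned, "%Y-%m-%dT%H:%M:%S")
    return parsed.replace(tzinfo=timezone.utc)


def expiring_tokens(cache_dir, within=timedelta(minutes=15), now=None):
    """Return (file name, expiry) for tokens expired or expiring soon."""
    now = now or datetime.now(timezone.utc)
    results = []
    for path in sorted(Path(cache_dir).glob("*.json")):
        try:
            data = json.loads(path.read_text())
            if "accessToken" not in data:
                continue  # e.g. a client registration file, not a token
            expires_at = parse_expires_at(data["expiresAt"])
        except (OSError, ValueError, KeyError, TypeError):
            continue  # unreadable, not JSON, or not in the assumed shape
        if expires_at - now <= within:
            results.append((path.name, expires_at))
    return results
```

Something like `expiring_tokens(Path.home() / ".aws" / "sso" / "cache")` could then drive a notification, leaving the actual login to whichever single tool owns that step.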

benkehoe commented 3 years ago

@0xW1sKy I have written aws-sso-lib for programmatic AWS SSO functionality, and aws-sso-util for things like generating profiles for all AWS SSO account+roles. Your script looks like it writes to ~/.aws/credentials as well, which I have aws-export-credentials for (that works with all kinds of AWS credentials)

semanur-prenuvo commented 1 year ago

@benkehoe I had been looking for ways to bring up our local development stacks with our SSO users. aws-sso-lib looks very useful. I wonder how legit this package is if we install it with pip, since it will have access to very critical files with the tokens under sso/cache.

benkehoe commented 1 year ago

All code you use that uses boto3 reads those files. The only unique thing aws-sso-lib is doing is writing to those files, like the AWS CLI does in aws sso login. But if you run any AWS API call, like this:

import boto3
boto3.client('sts').get_caller_identity()

and that's relying on a profile configured to use AWS SSO, that will load the token from ~/.aws/sso/cache.

That's not to dismiss concerns of repository compromise, but I don't think aws-sso-lib is above average in such likelihood. I'm excited about things like package provenance that could help secure more steps in the software supply chain, but there's still a ways to go.

Happy to have a longer discussion over on the repo if you want.

semanur-prenuvo commented 1 year ago

@benkehoe that's exactly how simple it is to compromise the token. Package Provenance sounds promising to increase trust in npm packages.

Public repositories have become a handy instrument for malware distribution, like token grabbers and env variable stealers. My question here is again whether we can trust aws-sso-lib.