synfinatic / aws-sso-cli

A powerful tool for using AWS Identity Center for the CLI and web console.
https://synfinatic.github.io/aws-sso-cli/
GNU General Public License v3.0

Can't change profile as I am told I need to login but I am. #1084

Open DaveQB opened 1 week ago

DaveQB commented 1 week ago

Output of aws-sso version:

AWS SSO CLI Version 2.0.0-beta4 -- Copyright 2021-2024 Aaron Turner
1031acd4a28533e7b662d2387579786c71f04ae4 (v2.0.0-beta4) built at 2024-09-30T02:15:15+0000

Describe the bug: Can't change profile as I am told I need to login but I am.

To Reproduce:

> aws-sso-profile customer1-ManagementAccount:AWSAdministratorAccess
FATAL Must run `aws-sso login` before running `aws-sso eval`

> aws-sso login -S customer1 -u print -L trace
DEBUG loading SSO retries=10 maxBackoff=5
INFO  You are already logged in. :)

> aws-sso list -S customer1
List of AWS roles for SSO Instance: customer1 [Expires in: 52m]

AccountIdPad | AccountAlias       | RoleName               | Profile                                           | Expires
========================================================================================================================
004048383300 | Networking         | AWSAdministratorAccess | customer1-Networking:AWSAdministratorAccess        | Expired
004048383300 | Networking         | AWSReadOnlyAccess      | customer1-Networking:AWSReadOnlyAccess             | Expired
112407617619 | Extranet-Prod      | AWSAdministratorAccess | customer1-Extranet-Prod:AWSAdministratorAccess     | Expired
112407617619 | Extranet-Prod      | AWSReadOnlyAccess      | customer1-Extranet-Prod:AWSReadOnlyAccess          | Expired
174502075582 | Extranet-DR-Test   | AWSAdministratorAccess | customer1-Extranet-DR-Test:AWSAdministratorAccess  | Expired
174502075582 | Extranet-DR-Test   | AWSReadOnlyAccess      | customer1-Extranet-DR-Test:AWSReadOnlyAccess       | Expired
257963854062 | Backup-Vault       | AWSAdministratorAccess | customer1-Backup-Vault:AWSAdministratorAccess      | Expired
257963854062 | Backup-Vault       | AWSReadOnlyAccess      | customer1-Backup-Vault:AWSReadOnlyAccess           | Expired
496163505248 | Logging            | AWSAdministratorAccess | customer1-Logging:AWSAdministratorAccess           | Expired
496163505248 | Logging            | AWSReadOnlyAccess      | customer1-Logging:AWSReadOnlyAccess                | Expired
515841246183 | Extranet-Dev       | AWSAdministratorAccess | customer1-Extranet-Dev:AWSAdministratorAccess      | Expired
515841246183 | Extranet-Dev       | AWSReadOnlyAccess      | customer1-Extranet-Dev:AWSReadOnlyAccess           | Expired
635164003972 | Management Account | AWSAdministratorAccess | customer1-ManagementAccount:AWSAdministratorAccess | Expired
635164003972 | Management Account | AWSReadOnlyAccess      | customer1-ManagementAccount:AWSReadOnlyAccess      | Expired
695568135049 | Extranet-Test      | AWSReadOnlyAccess      | customer1-Extranet-Test:AWSReadOnlyAccess          | Expired
695568135049 | Extranet-Test      | AWSAdministratorAccess | customer1-Extranet-Test:AWSAdministratorAccess     | Expired
695845165018 | Audit              | AWSReadOnlyAccess      | customer1-Audit:AWSReadOnlyAccess                  | Expired
695845165018 | Audit              | AWSAdministratorAccess | customer1-Audit:AWSAdministratorAccess             | Expired
739944527790 | Sandbox            | AWSAdministratorAccess | customer1-Sandbox:AWSAdministratorAccess           | Expired
739944527790 | Sandbox            | AWSReadOnlyAccess      | customer1-Sandbox:AWSReadOnlyAccess                | Expired

> aws-sso logout -S customer1
> aws-sso login -S customer1
> aws-sso list -S customer1 | head -1
List of AWS roles for SSO Instance: customer1 [Expires in: 7h 59m]
> aws-sso-profile customer1-ManagementAccount:AWSAdministratorAccess -L trace
FATAL Must run `aws-sso login` before running `aws-sso eval`

Note: You do not need to redact AWS AccountIDs from outputs or config. Per Amazon, "While account IDs, like any identifying information, should be used and shared carefully, they are not considered secret, sensitive, or confidential information."

Expected behavior: Be able to set profile

Additional context: Running aws-sso logout -S customer1 followed by a fresh login, just to make sure, didn't help.

Contents of ~/.aws-sso/config.yaml:

SSOConfig:
    customer1:
        SSORegion: eu-central-1
        StartUrl: https://customer1.awsapps.com/start
        AuthUrlAction: print
    customer2:
        SSORegion: eu-central-1
        StartUrl: https://customer2.awsapps.com/start
        AuthUrlAction: print
    customer3:
        SSORegion: eu-central-2
        StartUrl: https://customer3.awsapps.com/start
        AuthUrlAction: print
DefaultSSO: customer2
DefaultRegion: eu-central-1
ConsoleDuration: 720
CacheRefresh: 168
Threads: 5
MaxBackoff: 5
MaxRetry: 10
AutoConfigCheck: true
UrlAction: print
ConfigProfilesUrlAction: open
LogLevel: error
HistoryLimit: 10
HistoryMinutes: 1440
ProfileFormat: "{{ .SSO }}-{{ FirstItem .AccountName (.AccountAlias | nospace) }}:{{ .RoleName }}"
AccountPrimaryTag:
    - AccountName
    - AccountAlias
    - Email
PromptColors:
    descriptionbgcolor: Turquoise
    descriptiontextcolor: Black
    inputbgcolor: DefaultColor
    inputtextcolor: DefaultColor
    prefixbackgroundcolor: DefaultColor
    prefixtextcolor: Blue
    previewsuggestionbgcolor: DefaultColor
    previewsuggestiontextcolor: Green
    scrollbarbgcolor: Cyan
    scrollbarthumbcolor: LightGrey
    selecteddescriptionbgcolor: DarkGray
    selecteddescriptiontextcolor: White
    selectedsuggestionbgcolor: DarkGray
    selectedsuggestiontextcolor: White
    suggestionbgcolor: Cyan
    suggestiontextcolor: White
ListFields:
    - AccountIdPad
    - AccountAlias
    - RoleName
    - Profile
    - Expires
FullTextSearch: true
SecureStore: file

corrylc commented 1 week ago

~Seeing this as well when using boto3:~

~Well now I am confused, because the error has vanished while trying to debug...~

botocore.exceptions.CredentialRetrievalError: Error when retrieving credentials from custom-process: level=warning msg="The specified item could not be found in the keyring"
level=fatal msg="Must run `aws-sso login` before running `aws-sso process`"

Ok, with further work I discovered that calling aws-sso process successfully on the command line enables other users (such as boto3) to function elsewhere on the system for some duration. This is why I was getting somewhat random failure/success behavior over the day.

DaveQB commented 1 week ago

aws-cli/2.17.11 Python/3.11.8 Linux/6.8.0-41-generic exe/x86_64.ubuntu.24

In case this is relevant.

DaveQB commented 1 week ago

@corrylc

No cigar here.

FATAL Must run 'aws-sso login' before running 'aws-sso process'

corrylc commented 1 week ago

@DaveQB Yeah, it's an odd little bug. I only see the issue when aioboto3 calls it, and even then only under specific circumstances I don't yet understand (it literally works in one part of the codebase, then fails a moment later). Calling process seems to resolve my specific issue, but it isn't clear why.

I don't think awscli is relevant anymore, as the bug persists when it is entirely deleted from my system.

synfinatic commented 1 week ago

Yeah, I can't seem to reproduce on macOS... Not sure why this would be Linux specific, but I guess I'll have to try and see if I can repro.

corrylc commented 1 week ago

My issue happens on macOS, but only deep inside a large Python app that uses aioboto3. I don't think I can extract a repro from this, unfortunately, though I will continue to debug it as best I can.

synfinatic commented 1 week ago

Interesting. Well the problem has something to do with the message:

Error when retrieving credentials from custom-process: level=warning msg="The specified item could not be found in the keyring"

Any chance the boto3/aioboto3 process is dropping privileges, or doing something similar, that would prevent it from reading ~/.aws-sso/secure/aws-sso-cli-records (note: it could also be under ~/.config/aws-sso/...)? By default, the file is chmod'd 600 for security.
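One way to test the privilege theory is to run a small check from inside the same Python process that calls boto3. The sketch below is only an illustration: `describe_store_access` is a hypothetical helper, not part of aws-sso-cli or boto3, and the path is the one mentioned above (it may live under ~/.config/aws-sso/ instead). It reports the file mode, the owner, the effective UID of the caller, and whether the file can actually be opened for reading.

```python
import os
import stat
from pathlib import Path


def describe_store_access(path: Path) -> dict:
    """Report whether the current process can read `path` and what its mode is.

    Hypothetical diagnostic helper; not part of aws-sso-cli.
    """
    info = {"path": str(path), "exists": path.exists()}
    if not info["exists"]:
        return info

    st = path.stat()
    info["mode"] = oct(stat.S_IMODE(st.st_mode))  # permission bits only
    info["owner_uid"] = st.st_uid
    info["effective_uid"] = os.geteuid()

    # Actually try to open the file: this reflects the effective privileges
    # of the running process, which is what matters after a privilege drop.
    try:
        with open(path, "rb"):
            info["readable"] = True
    except OSError:
        info["readable"] = False
    return info


if __name__ == "__main__":
    # Path taken from the comment above; adjust if your store lives elsewhere.
    store = Path.home() / ".aws-sso" / "secure" / "aws-sso-cli-records"
    print(describe_store_access(store))
```

If `owner_uid` and `effective_uid` differ, or `readable` is false, the calling process probably cannot see the file-based store. This says nothing about the macOS Keychain, which has its own access rules.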

corrylc commented 1 week ago

The secure directory doesn't exist at all, presumably because of the keychain storage setting SecureStore: "keychain"

synfinatic commented 1 week ago

Sorry, yes. The file won't exist on macOS, but it will on Linux for @DaveQB.

But it's the same fundamental issue... except you're trying to use the macOS Keychain which can have issues if you're dropping privileges in your code.

If you want to try v1.17.0, that would be interesting... There is no dedicated aws-sso login command in previous versions, but if the necessary security token from AWS is missing, it will automatically try to open a browser. Note: not very useful on remote systems.

corrylc commented 1 week ago

So my new theory is that this might be caused by "intense async access" in my case.

In my program the AWS credentials are accessed in two sections. One (where it works) is relatively "slow": it iterates through various accounts one at a time. The other (where it usually fails) is some Python async code that asks aws-sso to issue credentials for 30+ accounts at almost exactly the same time.

If I have keychain open at the same time, I see the keychain entry "flickering" madly, as each parallel aws-sso invocation touches it, and I am wondering if there is some corner case where one instance makes it momentarily unreadable by another.

This is being tested against aws-sso 2.0.0-beta4
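A quick way to probe the "intense async access" theory without restructuring the program is to cap how many credential lookups run at once. This is a minimal sketch under assumptions: `fetch_all` and `fake_fetch` are hypothetical names, and `fake_fetch` stands in for whatever call in the real app ends up invoking aws-sso. With `max_parallel=1` the lookups are fully serialized; raising the limit shows at what level of parallelism the failures return.

```python
import asyncio


async def fetch_all(profiles, fetch_one, max_parallel=1):
    """Run fetch_one(profile) for every profile, at most max_parallel at a time."""
    profiles = list(profiles)
    gate = asyncio.Semaphore(max_parallel)

    async def guarded(profile):
        # Only max_parallel coroutines may be inside this block at once.
        async with gate:
            return await fetch_one(profile)

    results = await asyncio.gather(*(guarded(p) for p in profiles))
    return dict(zip(profiles, results))


async def _demo():
    # Stand-in for the real credential lookup (hypothetical; replace it
    # with the call in your app that triggers aws-sso).
    async def fake_fetch(profile):
        await asyncio.sleep(0.01)
        return f"credentials-for-{profile}"

    profiles = [f"profile-{i}" for i in range(5)]
    return await fetch_all(profiles, fake_fetch, max_parallel=1)


if __name__ == "__main__":
    print(asyncio.run(_demo()))
```

If the failures disappear at `max_parallel=1` and come back at higher values, that points at concurrent access to the secure store rather than at privileges or configuration.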

synfinatic commented 1 week ago

hmmm... possible? @corrylc can you test with v1.17? I'd like to at least get a clear understanding if this is a regression or not.

Note that v1.17 will never fail with "need to run aws-sso login"; it will instead start the login workflow, so assuming you have UrlAction: open or similar, you'll see the browser start doing its thing.

corrylc commented 6 days ago

Unfortunately I can't; the setup in my environment is customized to the 2.0 line, and I am not familiar enough with it to flip back and forth, sorry.

corrylc commented 6 days ago

Ok, this may be a separate issue at this point, but I have tested this out, and I think there is a concurrent access bug. I don't have any reason to believe this is a regression. I specifically see this with the macOS keychain, but it could plausibly exist with other secret storage.

If using the macOS keychain and accessing many different AWS profiles/accounts at "the same time", you will get random failures until the keychain cache contains active credentials for every account being accessed. My best guess is that any writing process has a chance to delete other processes' writes, since they aren't locking/coordinating in any way.

Tested by deleting the keychain entry, then running my heavily parallel program. It fails until everything is loaded to the keychain, then works fine until expiry (explaining all the weird failures I was getting). If I delete again and modify my program to access each profile non-concurrently it works fine.

Perhaps the keychain should be split up somewhat, such that there is a separate keychain entry for each profile? Or some other locking mechanism that keeps keychain writes from stomping on each other.
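To make the suspected failure mode concrete: if every process reads the whole store, adds its own entry, and writes the whole store back, two writers that overlap can each overwrite the other's entry. The sketch below illustrates the locking idea only. It uses a plain JSON file as a stand-in for the secure store and an advisory `flock` on a sidecar lock file; `update_entry` is a hypothetical helper, it is Unix-only, and it is not how aws-sso-cli or the macOS Keychain actually work.

```python
import fcntl
import json
import os
from pathlib import Path


def update_entry(store: Path, key: str, value: str) -> None:
    """Add or replace one key in a shared JSON file under an exclusive lock.

    Hypothetical illustration; a JSON file stands in for the secure store.
    """
    lock_path = store.with_name(store.name + ".lock")
    with open(lock_path, "w") as lock_file:
        # Blocks until every other writer holding the lock has released it.
        fcntl.flock(lock_file, fcntl.LOCK_EX)
        try:
            # Read-modify-write happens entirely inside the critical section,
            # so no other writer can slip in between the read and the write.
            data = json.loads(store.read_text()) if store.exists() else {}
            data[key] = value
            tmp = store.with_name(store.name + ".tmp")
            tmp.write_text(json.dumps(data))
            os.replace(tmp, store)  # atomic rename on POSIX filesystems
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)
```

Calling `update_entry(Path("/tmp/demo-store.json"), "profile-a", "creds")` from many processes or threads at once should leave every key in place. Without the `flock` calls, some entries would be expected to go missing intermittently, which matches the "fails until everything is loaded" behaviour described above.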

synfinatic commented 6 days ago

Okay, sounds like @DaveQB your problem is different? From your description, there is no chance of multiple aws-sso processes running at the same time?

@corrylc, I definitely can believe a concurrency bug exists when accessing the secure store from multiple aws-sso processes as there is no locking in any version.

DaveQB commented 6 days ago

I do think mine is different. Mine is just trying to auth one customer's account. Very straightforward, without any chance of a race condition of some sort.

Tried now and basically the same. Just needed to log in.

Rather than this:

> aws-sso-profile customer1-ManagementAccount:AWSAdministratorAccess
FATAL Must run `aws-sso login` before running `aws-sso eval`

> aws-sso login -S customer1 -u print -L trace
DEBUG loading SSO retries=10 maxBackoff=5
INFO  You are already logged in. :)

> aws-sso list -S customer1
List of AWS roles for SSO Instance: customer1 [Expires in: 52m]

I have this workflow, but same outcome:

> aws-sso-profile customer1-ManagementAccount:AWSAdministratorAccess
FATAL Must run `aws-sso login` before running `aws-sso eval`

> aws-sso login -S customer1 -u print
        Verify this code in your browser: REDACTED
Please open the following URL in your browser:

https://device.sso.eu-central-2.amazonaws.com/?user_code=REDACTED

> aws-sso list -S customer1
List of AWS roles for SSO Instance: customer1 [Expires in: 7h 59m]