aws / aws-cli

Universal Command Line Interface for Amazon Web Services

aws sso get-role-credentials fails one hour after authentication #7340

Closed pbrisbin closed 1 year ago

pbrisbin commented 1 year ago

Describe the bug

I'm looking to re-open https://github.com/aws/aws-cli/issues/4987 because I'm experiencing this exact issue.

Over the past two weeks it's been intermittent. Today it is consistently reproducible.

Expected Behavior

It should work.

Current Behavior

An error occurred (UnauthorizedException) when calling the GetRoleCredentials operation: Session token not found or invalid

Reproduction Steps

Perform aws sso login. Wait an hour or so, while still remaining within the expiresAt window.

Verify this:

% jq -r '.expiresAt' ~/.aws/sso/cache/<redacted>.json
2022-10-13T23:15:46Z
% date --utc
Thu Oct 13 06:59:15 PM UTC 2022
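
For anyone scripting this check: a minimal Python sketch (standard library only, using the timestamps from the session above) of comparing the cache file's expiresAt against the current UTC time. The function name is illustrative, not part of any AWS tooling.

```python
from datetime import datetime, timezone

def seconds_remaining(expires_at: str, now: datetime) -> float:
    # expiresAt in the SSO cache JSON is an ISO-8601 UTC timestamp, e.g. "2022-10-13T23:15:46Z"
    expiry = datetime.strptime(expires_at, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
    return (expiry - now).total_seconds()

# Timestamps from the session above: the token still claims ~4h16m of validity
now = datetime(2022, 10, 13, 18, 59, 15, tzinfo=timezone.utc)
print(seconds_remaining("2022-10-13T23:15:46Z", now))  # 15391.0
```

A positive result means the token should, per its own metadata, still be accepted, which is exactly what the error below contradicts.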

Use this token to try and get-role-credentials:

aws sso get-role-credentials --role-name <redacted> --account-id <redacted> --access-token "$(jq -r '.accessToken' ~/.aws/sso/cache/<redacted>.json)"

Observe the error.

Possible Solution

In the original issue it is stated

I would suggest following up with the SSO Service Team

Which makes total sense. So my new question is... How do I do that?

Additional Information/Context

No response

CLI version used

aws-cli/2.8.2 Python/3.9.11 Linux/5.19.9-arch1-1 exe/x86_64.arch prompt/off

Environment details (OS name and version, etc.)

Arch Linux; Linux prince 5.19.9-arch1-1 _1 SMP PREEMPT_DYNAMIC Thu, 15 Sep 2022 16:08:26 +0000 x86_64 GNU/Linux

pbrisbin commented 1 year ago

FWIW, my Googling also found this issue in an unofficial wrapper project in which a workaround was implemented.

Back on Feb 15, someone comments:

... the problem with accessToken 7-day validity reported on AWS API was caused by a confirmed bug in AWS.

According to AWS support:

"code bug that was deployed to some commercial AWS regions starting from 5am EST on 25 January ... The change was reverted at 11am EST on February 9".

The actual validity of obtained token remained 8 hours (and cannot be changed by any means).

But there is no link given as to where they heard that.

tim-finnigan commented 1 year ago

Hi @pbrisbin thanks for reaching out. Which region are you using? I'm not sure if this is related but I saw a recent post reporting SSO issues as well.

To contact the SSO team I suggest creating a technical support case through AWS Support. That is the recommended way to escalate a service-related issue and open a direct line of communication.

I hope that helps - please keep us posted on your findings. If you do think the issue is specific to the CLI then we'd request to view your debug logs for further investigation.

pbrisbin commented 1 year ago

Got it. I'm in us-east-1. I'll open a Support ticket and report back.

rbeede commented 1 year ago

Is this possibly caused by role chaining limits of 1 hour?

https://aws.amazon.com/premiumsupport/knowledge-center/iam-role-chaining-limit/#:~:text=You%20can%20use%20role%20chaining,and%20can't%20be%20increased.

pbrisbin commented 1 year ago

Anything's possible, but I don't believe I'm technically using the AssumeRole API call, nor am I getting the (much clearer!) error shown on that help page.

I'm also pretty sure I can reproduce the error with a simpler aws sso list-accounts command. I have to wait another hour to confirm since I did the aws sso login dance recently. If so, that seems very far away from assume-role to me.

tim-finnigan commented 1 year ago

I was notified of another internal ticket tracking issues with SSO cached tokens. It referenced the Terraform issue mentioned here: https://github.com/aws/aws-cli/issues/7104#issuecomment-1277563486. That investigation is currently ongoing but I can share updates here.

pbrisbin commented 1 year ago

Alright, Support ticket submitted. Also just want to confirm 2 things:

1- It happens at exactly 1 hour:

Just before...

% echo -n "Now: "; date --utc; echo -n "Expires: "; jq -r .expiresAt ~/.aws/sso/cache/133454eea5510d1526a59d852338cd457befecd0.json; echo "Attempt:"; aws sso get-role-credentials --role-name AWSAdministratorAccess --account-id 853032795538 --access-token "$(jq -r '.accessToken' ~/.aws/sso/cache/133454eea5510d1526a59d852338cd457befecd0.json)"
Now: Thu Oct 13 08:27:49 PM UTC 2022
Expires: 2022-10-14T03:30:48Z
Attempt:
{
    "roleCredentials": {
        "accessKeyId": "...",
        "secretAccessKey": "...",
        "sessionToken": "...",
        "expiration": 1665736069000
    }
}

Just after...

% echo -n "Now: "; date --utc; echo -n "Expires: "; jq -r .expiresAt ~/.aws/sso/cache/133454eea5510d1526a59d852338cd457befecd0.json; echo "Attempt:"; aws sso get-role-credentials --role-name AWSAdministratorAccess --account-id 853032795538 --access-token "$(jq -r '.accessToken' ~/.aws/sso/cache/133454eea5510d1526a59d852338cd457befecd0.json)"
Now: Thu Oct 13 08:31:11 PM UTC 2022
Expires: 2022-10-14T03:30:48Z
Attempt:

An error occurred (UnauthorizedException) when calling the GetRoleCredentials operation: Session token not found or invalid
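
Side note on the successful response above: the roleCredentials expiration field is epoch milliseconds, not seconds. A quick standard-library conversion of the value from that output:

```python
from datetime import datetime, timezone

# The "expiration" field in the get-role-credentials response is epoch *milliseconds*
expiration_ms = 1665736069000
expiry = datetime.fromtimestamp(expiration_ms / 1000, tz=timezone.utc)
print(expiry.isoformat())  # 2022-10-14T08:27:49+00:00
```

So the role credentials issued at 20:27 UTC were stamped to last about 12 hours, yet the access token used to fetch them stopped being accepted roughly an hour later.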

2- It happens with commands that aren't (AFAIK) assuming anything:

% aws sso list-accounts --access-token "$(jq -r '.accessToken' ~/.aws/sso/cache/133454eea5510d1526a59d852338cd457befecd0.json)"

An error occurred (UnauthorizedException) when calling the ListAccounts operation: Session token not found or invalid

If you do think the issue is specific to the CLI

I'm pretty confident it's not specific to the CLI. I have a Haskell app, and its SDK (Amazonka) hits the exact same error at the exact same point when manually calling these APIs to do the same operations. At this point, I'm just going to leave a paper trail for the next person (once Support gets back to me). I appreciate the help anyway!

Xeroday commented 1 year ago

Also seeing this in us-east-2. It's not just the AWS CLI; boto3 seems to be affected as well.

ethanlieske commented 1 year ago

I can confirm seeing this in us-west-2. The SSO token shows 8 hours until expiresAt, but the AWS command line throws:

aws --profile someProfile sts get-caller-identity

The SSO session associated with this profile has expired or is otherwise invalid. To refresh this SSO session run aws sso login with the corresponding profile.

tim-finnigan commented 1 year ago

I received an update that the team has identified the issue and is currently working on a fix. I don't have an ETA for that but please check back in here for more updates.

caviliar commented 1 year ago

We have this issue also. Looking forward to the fix. With hundreds of profiles and tooling, it is frustrating to have to re-login every hour.

region: eu-central-1

rcousens commented 1 year ago

I am experiencing this too and it's a little frustrating. We haven't had any policy updates that might affect the SSO session length, yet my CLI workflow is kinda broken now: my token regularly expires after 1 hour of work and I have to do aws sso login again. Can confirm the expiry in the cached JSON SSO file shows 8 hours.

Previously I could just set my profile with the environment variable and get all my work done for the day without logging in regularly.

Commenting so I can get notification when this is fixed!

mendickmorningstar commented 1 year ago

I'm hitting the same issue. AWS is currently holding a Dev Day in China, and we ran into this error:

An error occurred (UnauthorizedException) when calling the ListAccounts operation: Session token not found or invalid

roman-vynar commented 1 year ago

SSO in us-west-2. It was working okay until October ~10th or so. I have an 8-hour duration configured on the permission set. Now it is broken and fails every hour, requiring aws sso login again.

davidindrawesHA commented 1 year ago

Experiencing the same issue on eu-west-2; started noticing it on the 13th of October 2022.

I logged in using aws sso login. The cached token says it expires at 2022-10-14T15:22:58Z; it is now 2022-10-14T09:01:50Z and I'm already getting the error: The SSO session associated with this profile has expired or is otherwise invalid. To refresh this SSO session run aws sso login with the corresponding profile.

My permission set is configured for 8 hours, so the expiresAt value is correct, but the token actually expires after about an hour.

icollar commented 1 year ago

I don't think the expiresAt values in ~/.aws/sso/cache are being honoured (or perhaps they aren't being set correctly):

jq .expiresAt ~/.aws/sso/cache/*
"2022-10-14T16:43:16Z"
"2022-10-19T11:06:21Z"

I've just set a while loop running every 30s with the following, and my session expired at 2022-10-14T10:14:12Z. The loop ran for around 2 hours, and the actual expiry time isn't anywhere near the cached expiry:

while true; do sleep 30; date -u '+%Y-%m-%dT%H:%M:%SZ' && aws sts get-caller-identity; done

Prior to this week, the actual expiry seemed to be relatively close to the cached expiresAt.
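
Quantifying that gap with the timestamps above (a small illustrative sketch; the variable names are mine, not from any AWS tooling):

```python
from datetime import datetime

def iso_utc(ts: str) -> datetime:
    # Parse the "YYYY-MM-DDTHH:MM:SSZ" format used in the SSO cache files
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ")

cached_expiry = iso_utc("2022-10-14T16:43:16Z")    # expiresAt from ~/.aws/sso/cache
observed_expiry = iso_utc("2022-10-14T10:14:12Z")  # when the polling loop started failing
shortfall = cached_expiry - observed_expiry
print(shortfall)  # 6:29:04
```

That is, the session died roughly six and a half hours before the cached expiresAt claimed it would.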

pbrisbin commented 1 year ago

Sounds like this issue is pretty wide-spread, and AWS is working on a fix. I got this response in my own Support ticket:

[SSO] customers who are using the CLI integration have to re-authenticate every hour. We are aware of this issue and are working on a fix.

roman-vynar commented 1 year ago

Surprisingly, it is working today: 3.5 hours so far. Yesterday it didn't stay valid for longer than 1 hour.

fmquaglia commented 1 year ago

It seems it's working as intended today.

skyzyx commented 1 year ago

I've been running into this issue while using AWS Vault as my primary way to retrieve tokens from AWS SSO in us-east-1. My company has over 160 AWS accounts, and I access many of those accounts weekly.

This has mostly been happening over the last week or two. AWS Vault stores the OIDC token used by AWS SSO in the system keychain. I just open Keychain Access.app, delete the OIDC entry, and re-auth through SSO.

We can definitely work around it. It's just super annoying when I'm trying to get work done and auth is unreliable.

arohter commented 1 year ago

I can confirm things are working correctly again (8h ttl) for us in us-west-2 SSO.

tim-finnigan commented 1 year ago

Thanks all for checking in here. The SSO team has informed us that they are deploying the fixes at an accelerated pace which should cover most commercial regions by the end of today. They are hopeful that this will be fixed globally across all regions by Monday. @aBurmeseDev will plan to follow up here next week with any more info.

august-9426 commented 1 year ago

It seems that the accessToken is padded with commas at the end of [hash].json

aBurmeseDev commented 1 year ago

Hi all,

I'm happy to update you that the service team has pushed a fix for this globally and this should now be working as expected.

We appreciate everyone for checking in here and please let us know if you need anything else!

Best, John

pbrisbin commented 1 year ago

Just confirming:

I'd be fine with this issue closing now, but I'll leave it to the AWS team to do so, since there have been so many reports coming in besides mine. Thanks for the quick response!

tim-finnigan commented 1 year ago

I'm going to close this issue since no one has reported issues since the fix was deployed. If anyone is still having issues related to SSO we recommend reaching out through AWS Support for further assistance. Thanks!

github-actions[bot] commented 1 year ago

⚠️COMMENT VISIBILITY WARNING⚠️

Comments on closed issues are hard for our team to see. If you need more assistance, please open a new issue that references this one. If you wish to keep having a conversation with other community members under this issue feel free to do so.