aws / aws-sdk-go-v2

AWS SDK for the Go programming language.
https://aws.github.io/aws-sdk-go-v2/docs/
Apache License 2.0

ExpiredToken: The provided token has expired. When using temporary credentials. #2135

Closed wave2 closed 1 year ago

wave2 commented 1 year ago

Describe the bug

When using a shared credentials file managed by another process (e.g. amazon-ssm-agent), the credentials are not refreshed periodically or in response to an ExpiredToken API response.

I recently observed this while using the Telegraf agent in combination with a credentials file maintained by the SSM agent. The credentials are loaded on start-up but fail to refresh when the SSM agent updates the credentials file with the new aws_session_token.

It looks like the same issue was reported here: https://github.com/aws/aws-sdk-go-v2/issues/1449

Expected Behavior

The credentials should be refreshed periodically or in response to an ExpiredToken api error.

Current Behavior

When another process updates the shared_credential_file, the SDK does not pick up the new credentials on subsequent API calls, which then fail with an ExpiredToken error:

2023-05-11T15:26:09Z E! [outputs.cloudwatch] Unable to write to CloudWatch : operation error CloudWatch: PutMetricData, https response error StatusCode: 403, RequestID: 4d5fbc11-8aba-489a-90b4-0966ab111ae5, api error ExpiredToken: The security token included in the request is expired

Reproduction Steps

Configure the amazon-ssm-agent to write credentials to a shared profile e.g. /var/lib/amazon/ssm/credentials and perform a periodic PutMetricData api request to CloudWatch from another process using the temporary credentials.

Example of the creation code can be seen here:

https://github.com/influxdata/telegraf/blob/master/plugins/outputs/cloudwatch/cloudwatch.go

Possible Solution

No response

Additional Information/Context

No response

AWS Go SDK V2 Module Versions Used

Observed with Go version 1.20.4 and AWS SDK version 1.18.0

Compiler and Version used

v1.20.4

Operating System and version

Red Hat Enterprise Linux Server release 7.9

RanVaknin commented 1 year ago

Hi @wave2 ,

Is there a specific reason why you are using the SSM agent to modify the shared credentials file? Since you are using EC2, the more obvious approach would be to use an IAM role to make those calls. You can configure your EC2 instance with an IAM execution role, and the SDK's credential chain will call the IMDS endpoint for you, assume the role, retrieve credentials and refresh them if needed.

> Configure the amazon-ssm-agent to write credentials to a shared profile e.g. /var/lib/amazon/ssm/credentials and perform a periodic PutMetricData api request to CloudWatch from another process using the temporary credentials.

Are you following some docs that suggest this as the setup? If so, can you add a link to them? Your repro code only has the logic for creating metrics for CloudWatch, but I'm still missing a lot of context about how you set up SSM.

> The credentials should be refreshed periodically or in response to an ExpiredToken api error.

Since this really is the meat of the matter, I created a simple reproduction code to test this:

main.go

package main

import (
    "context"
    "fmt"
    "log"
    "time"

    "github.com/aws/aws-sdk-go-v2/aws"
    "github.com/aws/aws-sdk-go-v2/config"
    "github.com/aws/aws-sdk-go-v2/service/s3"
)

const IntervalInMinutes = 10

func main() {
    cfg, err := config.LoadDefaultConfig(context.TODO(), config.WithRegion("us-east-1"), config.WithClientLogMode(aws.LogRequestWithBody|aws.LogResponseWithBody))
    if err != nil {
        log.Fatal(err)
    }

    client := s3.NewFromConfig(cfg)

    for {
        out, err := client.ListBuckets(context.TODO(), &s3.ListBucketsInput{})
        if err != nil {
            panic(err)
        }
        fmt.Println(len(out.Buckets))
        fmt.Printf("Successfully listed buckets at %s\n", time.Now().String())

        time.Sleep(IntervalInMinutes * time.Minute)
    }
}

credentials file:

[default]
aws_access_key_id = FOO
aws_secret_access_key = BAR

[myrole]
role_arn = arn:aws:iam::REDACTED:role/my_role
duration_seconds = 900
source_profile = default

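In this code I am:

  1. Loading the default config, with the AWS_PROFILE env variable selecting the myrole profile so that the SDK assumes the role with a 900-second session.

  2. Calling ListBuckets every 10 minutes with request and response logging enabled.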

request and response logs:

⚠️ Please note that I added comments prefixed with the "$" symbol to make them stand out; they are not terminal input.

$ the SDK recognizes the role assumption from the env variable and calls the STS endpoint on your behalf.

SDK 2023/05/30 14:56:12 DEBUG Request
POST / HTTP/1.1
Host: sts.us-east-1.amazonaws.com
User-Agent: aws-sdk-go-v2/1.18.0 os/macos lang/go/1.19.1 md/GOOS/darwin md/GOARCH/arm64 api/sts/1.19.0
Content-Length: 163
Amz-Sdk-Invocation-Id: REDACTED
Amz-Sdk-Request: attempt=1; max=3
Authorization: AWS4-HMAC-SHA256 Credential=REDACTED/20230530/us-east-1/sts/aws4_request, SignedHeaders=amz-sdk-invocation-id;amz-sdk-request;content-length;content-type;host;x-amz-date, Signature=REDACTED
Content-Type: application/x-www-form-urlencoded
X-Amz-Date: 20230530T215612Z
Accept-Encoding: gzip

Action=AssumeRole&DurationSeconds=900&RoleArn=arn%3Aaws%3Aiam%3A%3AREDACTED%3Arole%2FREDACTED&RoleSessionName=aws-go-sdk-REDACTED&Version=2011-06-15
SDK 2023/05/30 14:56:13 DEBUG Response
HTTP/1.1 200 OK
Content-Length: 1503
Content-Type: text/xml
Date: Tue, 30 May 2023 21:56:13 GMT
X-Amzn-Requestid: 2578b32c-0042-4b31-a164-28665cb13aae

<AssumeRoleResponse xmlns="https://sts.amazonaws.com/doc/2011-06-15/">
  <AssumeRoleResult>
    <AssumedRoleUser>
      <AssumedRoleId>REDACTED:aws-go-sdk-REDACTED</AssumedRoleId>
      <Arn>arn:aws:sts::REDACTED:assumed-role/REDACTED/aws-go-sdk-REDACTED</Arn>
    </AssumedRoleUser>
    <Credentials>
      <AccessKeyId>REDACTED</AccessKeyId>
      <SecretAccessKey>REDACTED/REDACTED</SecretAccessKey>
      <SessionToken>REDACTED...HOhK4=</SessionToken>
      <Expiration>2023-05-30T22:11:13Z</Expiration>
    </Credentials>
  </AssumeRoleResult>
  <ResponseMetadata>
    <RequestId>REDACTED</RequestId>
  </ResponseMetadata>
</AssumeRoleResponse>

SDK 2023/05/30 14:56:13 DEBUG Request
GET / HTTP/1.1
Host: s3.us-east-1.amazonaws.com
User-Agent: aws-sdk-go-v2/1.18.0 os/macos lang/go/1.19.1 md/GOOS/darwin md/GOARCH/arm64 api/s3/1.33.1
Accept-Encoding: identity
Amz-Sdk-Invocation-Id: REDACTED
Amz-Sdk-Request: attempt=1; max=3
Authorization: AWS4-HMAC-SHA256 Credential=REDACTED/20230530/us-east-1/s3/aws4_request, SignedHeaders=accept-encoding;amz-sdk-invocation-id;amz-sdk-request;host;x-amz-content-sha256;x-amz-date;x-amz-security-token, Signature=REDACTED
X-Amz-Content-Sha256: REDACTED
X-Amz-Date: 20230530T215613Z
X-Amz-Security-Token: REDACTED

SDK 2023/05/30 14:56:14 DEBUG Response
HTTP/1.1 200 OK
Transfer-Encoding: chunked
Content-Type: application/xml
Date: Tue, 30 May 2023 21:56:15 GMT
Server: AmazonS3
X-Amz-Id-2: REDACTED
X-Amz-Request-Id: REDACTED

fc5
<?xml version="1.0" encoding="UTF-8"?>
<ListAllMyBucketsResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/">REDACTED</ListAllMyBucketsResult>
0

33
Successfully listed buckets at 2023-05-30 14:56:14.106145 -0700 PDT m=+1.149734334

$ the SDK successfully retrieves the credentials and calls list buckets successfully.
$ ------------------------------

$ after 10 minutes it calls list buckets again (notice how it doesn't call STS again, because the credentials are still fresh)

SDK 2023/05/30 15:06:14 DEBUG Request
GET / HTTP/1.1
Host: s3.us-east-1.amazonaws.com
User-Agent: aws-sdk-go-v2/1.18.0 os/macos lang/go/1.19.1 md/GOOS/darwin md/GOARCH/arm64 api/s3/1.33.1
Accept-Encoding: identity
Amz-Sdk-Invocation-Id: 356dbd8a-bfbf-4722-8c7e-95dd3151174b
Amz-Sdk-Request: attempt=1; max=3
Authorization: AWS4-HMAC-SHA256 Credential=REDACTED/20230530/us-east-1/s3/aws4_request, SignedHeaders=accept-encoding;amz-sdk-invocation-id;amz-sdk-request;host;x-amz-content-sha256;x-amz-date;x-amz-security-token, Signature=REDACTED
X-Amz-Content-Sha256: REDACTED
X-Amz-Date: 20230530T220614Z
X-Amz-Security-Token: REDACTED

SDK 2023/05/30 15:06:14 DEBUG Response
HTTP/1.1 200 OK
Transfer-Encoding: chunked
Content-Type: application/xml
Date: Tue, 30 May 2023 22:06:15 GMT
Server: AmazonS3
X-Amz-Id-2: REDACTED
X-Amz-Request-Id: REDACTED

fc5
<?xml version="1.0" encoding="UTF-8"?>
<ListAllMyBucketsResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/">REDACTED</ListAllMyBucketsResult>
0

33
Successfully listed buckets at 2023-05-30 15:06:14.668382 -0700 PDT m=+601.705641751

$ the SDK successfully re-uses the credentials and calls list buckets successfully.
$ ------------------------------

$ after an additional 10 minutes (20 minutes from the beginning of execution), the SDK makes another request to S3. The credentials are now stale, so notice that it calls STS again to assume the role.

SDK 2023/05/30 15:16:14 DEBUG Request
POST / HTTP/1.1
Host: sts.us-east-1.amazonaws.com
User-Agent: aws-sdk-go-v2/1.18.0 os/macos lang/go/1.19.1 md/GOOS/darwin md/GOARCH/arm64 api/sts/1.19.0
Content-Length: 163
Amz-Sdk-Invocation-Id: REDACTED
Amz-Sdk-Request: attempt=1; max=3
Authorization: AWS4-HMAC-SHA256 Credential=REDACTED/20230530/us-east-1/sts/aws4_request, SignedHeaders=amz-sdk-invocation-id;amz-sdk-request;content-length;content-type;host;x-amz-date, Signature=REDACTED
Content-Type: application/x-www-form-urlencoded
X-Amz-Date: 20230530T221614Z
Accept-Encoding: gzip

Action=AssumeRole&DurationSeconds=900&RoleArn=arn%3Aaws%3Aiam%3A%3AREDACTED%3Arole%2FREDACTED&RoleSessionName=aws-go-sdk-REDACTED&Version=2011-06-15
SDK 2023/05/30 15:16:15 DEBUG Response
HTTP/1.1 200 OK
Content-Length: 1503
Content-Type: text/xml
Date: Tue, 30 May 2023 22:16:14 GMT
X-Amzn-Requestid: 659fd7c6-785f-4c99-809f-d1b146a566e4

<AssumeRoleResponse xmlns="https://sts.amazonaws.com/doc/2011-06-15/">
  <AssumeRoleResult>
    <AssumedRoleUser>
      <AssumedRoleId>REDACTED:aws-go-sdk-REDACTED</AssumedRoleId>
      <Arn>arn:aws:sts::REDACTED:assumed-role/REDACTED/aws-go-sdk-REDACTED</Arn>
    </AssumedRoleUser>
    <Credentials>
      <AccessKeyId>REDACTED</AccessKeyId>
      <SecretAccessKey>REDACTED/REDACTED/x</SecretAccessKey>
      <SessionToken>REDACTED...7mxMY=</SessionToken>
      <Expiration>2023-05-30T22:31:15Z</Expiration>
    </Credentials>
  </AssumeRoleResult>
  <ResponseMetadata>
    <RequestId>REDACTED</RequestId>
  </ResponseMetadata>
</AssumeRoleResponse>
SDK 2023/05/30 15:16:15 DEBUG Request
GET / HTTP/1.1
Host: s3.us-east-1.amazonaws.com
User-Agent: aws-sdk-go-v2/1.18.0 os/macos lang/go/1.19.1 md/GOOS/darwin md/GOARCH/arm64 api/s3/1.33.1
Accept-Encoding: identity
Amz-Sdk-Invocation-Id: REDACTED
Amz-Sdk-Request: attempt=1; max=3
Authorization: AWS4-HMAC-SHA256 Credential=REDACTED/20230530/us-east-1/s3/aws4_request, SignedHeaders=accept-encoding;amz-sdk-invocation-id;amz-sdk-request;host;x-amz-content-sha256;x-amz-date;x-amz-security-token, Signature=REDACTED
X-Amz-Content-Sha256: REDACTED
X-Amz-Date: 20230530T221615Z
X-Amz-Security-Token: REDACTED

SDK 2023/05/30 15:16:15 DEBUG Response
HTTP/1.1 200 OK
Transfer-Encoding: chunked
Content-Type: application/xml
Date: Tue, 30 May 2023 22:16:16 GMT
Server: AmazonS3
X-Amz-Id-2: REDACTED
X-Amz-Request-Id: REDACTED

fc5
<?xml version="1.0" encoding="UTF-8"?>
<ListAllMyBucketsResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/">...</ListAllMyBucketsResult>
0

33
Successfully listed buckets at 2023-05-30 15:16:15.672573 -0700 PDT m=+1202.706455168

From the logs we can observe that:

  1. When using the AWS_PROFILE env variable to assume role, the SDK will automatically call STS and assume the role for you. This functionality should work the same way when using EC2 IAM role creds from IMDS.

  2. When credentials are stale, the SDK will attempt to retrieve new credentials on your behalf.
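The timeline in the logs above can be checked with a small sketch of an expiry test. This uses only the standard library; the `creds` type is a local stand-in whose `CanExpire` and `Expires` fields mirror the ones on the SDK's `aws.Credentials`, and `expired` / `expiredAt` are hypothetical helper names, not SDK functions. The timestamps are taken from the first AssumeRole response and the two later ListBuckets requests.

```go
package main

import (
	"fmt"
	"time"
)

// creds is a local stand-in for a credentials value. The field names mirror
// aws.Credentials, but the type is defined here purely for illustration.
type creds struct {
	CanExpire bool
	Expires   time.Time
}

// expired reports whether the credentials must be fetched again at time now.
// Credentials that cannot expire are never considered stale.
func expired(c creds, now time.Time) bool {
	if !c.CanExpire {
		return false
	}
	return !now.Before(c.Expires)
}

// expiredAt is a convenience wrapper that takes RFC 3339 timestamps.
func expiredAt(canExpire bool, expires, now string) (bool, error) {
	e, err := time.Parse(time.RFC3339, expires)
	if err != nil {
		return false, err
	}
	n, err := time.Parse(time.RFC3339, now)
	if err != nil {
		return false, err
	}
	return expired(creds{CanExpire: canExpire, Expires: e}, n), nil
}

func main() {
	// <Expiration> from the first AssumeRole response in the logs.
	const expiration = "2023-05-30T22:11:13Z"
	for _, now := range []string{
		"2023-05-30T22:06:14Z", // second ListBuckets call: credentials reused
		"2023-05-30T22:16:14Z", // third ListBuckets call: STS is called again
	} {
		stale, err := expiredAt(true, expiration, now)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%s stale=%t\n", now, stale)
	}
}
```

Note the `CanExpire` branch: a credentials value that is marked as non-expiring is never treated as stale by a check like this, regardless of what the token's real lifetime is.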

My suggestion would be:

  1. Use EC2 IAM role to retrieve credentials

  2. If you are confident in your workflow with SSM writing to a local file, please provide us with complete repro steps and add any documentation that you might have.

  3. For your own benefit enable client logs like I did in my code sample to check whether the SDK attempts to refresh the credentials or not. It's a very simple and useful tool that all developers should know :)

Thanks, Ran~

wave2 commented 1 year ago

Hi @RanVaknin ,

Thanks for getting back to me!

I am not currently using EC2; instead, I have been using SSM in Hybrid mode to manage nodes outside of AWS.

I'm trying to make use of the Instance profile attached to those nodes in order to send metrics to CloudWatch.

The SSM Agent writes a credential file (using aws_session_token) out to a shared profile that can be consumed by other services running on the host, and as such will ensure that those credentials are rotated regularly:

https://docs.aws.amazon.com/systems-manager/latest/userguide/ssm-agent-technical-details.html#credentials-file

I just tried to simulate the same issue using your sample code, and manually rotating the credentials from the CLI. The same issue was encountered.

The code runs for the duration of the aws_session_token but fails to detect that I have refreshed the credentials file with a new token.

Steps to reproduce:

  1. Create a set of temporary credentials (Assume Role) with a lifespan of 900 seconds
  2. Update the profile (credentials file) with the Key, Secret and Token.
  3. Start the app.
  4. After 10 minutes, refresh the credentials from another session and update the profile with the new token.
  5. The app will fail to detect the new credentials and instead throw the error - api error ExpiredToken: The provided token has expired.
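Steps 2 and 4 rewrite the credentials file while another process may be reading it. As a rough sketch of how an updater could do that safely, the code below renders one profile and replaces the file by writing a temporary file and renaming it over the target. The helper names `renderProfile` and `writeFileAtomic` are hypothetical, the key values are placeholders, and the rename is only assumed to be atomic on POSIX-like filesystems.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// renderProfile formats one profile in the shared credentials file layout.
func renderProfile(name, keyID, secret, token string) string {
	return fmt.Sprintf(
		"[%s]\naws_access_key_id = %s\naws_secret_access_key = %s\naws_session_token = %s\n",
		name, keyID, secret, token)
}

// writeFileAtomic writes data to a temporary file in the same directory and
// renames it over path, so a concurrent reader sees either the old file or
// the new one, never a partially written file.
func writeFileAtomic(path string, data []byte) error {
	tmp, err := os.CreateTemp(filepath.Dir(path), ".credentials-*")
	if err != nil {
		return err
	}
	// Clean up the temporary file on failure; after a successful rename the
	// name no longer exists and this call fails harmlessly.
	defer os.Remove(tmp.Name())
	if _, err := tmp.Write(data); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	return os.Rename(tmp.Name(), path)
}

func main() {
	dir, err := os.MkdirTemp("", "creds-demo")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "credentials")
	// Placeholder values; a real updater would use the AssumeRole output.
	body := renderProfile("default", "FOO", "BAR", "TOKEN")
	if err := writeFileAtomic(path, []byte(body)); err != nil {
		panic(err)
	}
	got, err := os.ReadFile(path)
	if err != nil {
		panic(err)
	}
	fmt.Print(string(got))
}
```

This only makes the update itself safe. It does not change what a reader does with the file: a process that parsed the file once at start-up will still hold the old values.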

Sample logs from scenario above:

SDK 2023/05/31 07:58:12 DEBUG Request
GET / HTTP/1.1
Host: s3.us-east-1.amazonaws.com
User-Agent: aws-sdk-go-v2/1.18.0 os/linux lang/go/1.20.4 md/GOOS/linux md/GOARCH/amd64 api/s3/1.33.1
Accept-Encoding: identity
Amz-Sdk-Invocation-Id: REDACTED
Amz-Sdk-Request: attempt=1; max=3
Authorization: REDACTED
X-Amz-Content-Sha256: REDACTED
X-Amz-Date: 20230531T075812Z
X-Amz-Security-Token: REDACTED

SDK 2023/05/31 07:58:12 DEBUG Response
HTTP/1.1 200 OK
Transfer-Encoding: chunked
Content-Type: application/xml
Date: Wed, 31 May 2023 07:58:13 GMT
Server: AmazonS3
X-Amz-Id-2: REDACTED
X-Amz-Request-Id: REDACTED

284
<?xml version="1.0" encoding="UTF-8"?>
<ListAllMyBucketsResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><Owner><ID>REDACTED</ID><DisplayName>REDACTED</DisplayName></Owner><Buckets><Bucket><Name>REDACTED</Name><CreationDate>2023-05-31T07:19:33.000Z</CreationDate></Bucket><Bucket><Name>REDACTED</Name><CreationDate>2023-05-31T07:19:32.000Z</CreationDate></Bucket><Bucket><Name>REDACTED</Name><CreationDate>2023-05-31T07:19:32.000Z</CreationDate></Bucket></Buckets></ListAllMyBucketsResult>
0

3
Successfully listed buckets at 2023-05-31 07:58:12.782932942 +0000 UTC m=+0.074158514
SDK 2023/05/31 08:08:12 DEBUG Request
GET / HTTP/1.1
Host: s3.us-east-1.amazonaws.com
User-Agent: aws-sdk-go-v2/1.18.0 os/linux lang/go/1.20.4 md/GOOS/linux md/GOARCH/amd64 api/s3/1.33.1
Accept-Encoding: identity
Amz-Sdk-Invocation-Id: REDACTED
Amz-Sdk-Request: attempt=1; max=3
Authorization: REDACTED
X-Amz-Date: 20230531T080812Z
X-Amz-Security-Token: REDACTED

SDK 2023/05/31 08:08:12 DEBUG Response
HTTP/1.1 200 OK
Transfer-Encoding: chunked
Content-Type: application/xml
Date: Wed, 31 May 2023 08:08:13 GMT
Server: AmazonS3
X-Amz-Id-2: REDACTED
X-Amz-Request-Id: REDACTED

284
<?xml version="1.0" encoding="UTF-8"?>
<ListAllMyBucketsResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><Owner><ID>REDACTED</ID><DisplayName>REDACTED</DisplayName></Owner><Buckets><Bucket><Name>REDACTED</Name><CreationDate>2023-05-31T07:19:33.000Z</CreationDate></Bucket><Bucket><Name>REDACTED</Name><CreationDate>2023-05-31T07:19:32.000Z</CreationDate></Bucket><Bucket><Name>REDACTED</Name><CreationDate>2023-05-31T07:19:32.000Z</CreationDate></Bucket></Buckets></ListAllMyBucketsResult>
0

3
Successfully listed buckets at 2023-05-31 08:08:12.883722797 +0000 UTC m=+600.174948371
SDK 2023/05/31 08:18:12 DEBUG Request
GET / HTTP/1.1
Host: s3.us-east-1.amazonaws.com
User-Agent: aws-sdk-go-v2/1.18.0 os/linux lang/go/1.20.4 md/GOOS/linux md/GOARCH/amd64 api/s3/1.33.1
Accept-Encoding: identity
Amz-Sdk-Invocation-Id: REDACTED
Amz-Sdk-Request: attempt=1; max=3
Authorization: REDACTED
X-Amz-Content-Sha256: REDACTED
X-Amz-Date: 20230531T081812Z
X-Amz-Security-Token: REDACTED

SDK 2023/05/31 08:18:12 DEBUG Response
HTTP/1.1 400 Bad Request
Connection: close
Transfer-Encoding: chunked
Content-Type: application/xml
Date: Wed, 31 May 2023 08:18:12 GMT
Server: AmazonS3
X-Amz-Id-2: REDACTED
X-Amz-Request-Id: REDACTED

400
<?xml version="1.0" encoding="UTF-8"?>
<Error><Code>ExpiredToken</Code><Message>The provided token has expired.</Message><Token-0>REDACTED</Token-0><RequestId>REDACTED</RequestId><HostId>REDACTED</HostId></Error>
0

panic: operation error S3: ListBuckets, https response error StatusCode: 400, RequestID: REDACTED, HostID: REDACTED, api error ExpiredToken: The provided token has expired.

goroutine 1 [running]:
main.main()
        /home/ec2-user/main.go:27 +0x31a
exit status 2
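The sample panics on any error. A caller that wants to react to this specific failure could instead inspect the error code, roughly as sketched below. This is a standard-library-only illustration: `apiError` is a local stand-in interface that assumes only an `ErrorCode()` method (the SDK's service errors expose one through `smithy.APIError` in `github.com/aws/smithy-go`), and `fakeAPIError` and `isExpiredToken` are hypothetical names introduced for the example.

```go
package main

import (
	"errors"
	"fmt"
)

// apiError is a local stand-in for the interface service errors satisfy.
// Only the ErrorCode method is assumed here.
type apiError interface {
	error
	ErrorCode() string
}

// fakeAPIError is a hypothetical error value used to exercise the helper.
type fakeAPIError struct{ code, msg string }

func (e *fakeAPIError) Error() string     { return "api error " + e.code + ": " + e.msg }
func (e *fakeAPIError) ErrorCode() string { return e.code }

// isExpiredToken reports whether err, or any error it wraps, carries the
// ExpiredToken error code.
func isExpiredToken(err error) bool {
	var ae apiError
	return errors.As(err, &ae) && ae.ErrorCode() == "ExpiredToken"
}

func main() {
	// Wrap the fake error the way an operation error wraps a service error.
	err := fmt.Errorf("operation error S3: ListBuckets: %w",
		&fakeAPIError{code: "ExpiredToken", msg: "The provided token has expired."})

	fmt.Println(isExpiredToken(err))
	fmt.Println(isExpiredToken(errors.New("connection reset")))
}
```

Detecting the code does not refresh anything by itself; it only gives the application a place to rebuild its configuration or client instead of crashing.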

Hope that helps!

RanVaknin commented 1 year ago

Hi @wave2

Thanks for the clarification. The SDK will read the shared credentials file once (on LoadDefaultConfig) and keep a reference to it in memory. This is why updating the credentials file will not make a difference.

The Java SDK is the only SDK (that I'm aware of) that has this functionality of reloading the credentials after the initial load. There is a cross-SDK feature request for this exact use case; however, cross-SDK FRs are community driven and get prioritized based on activity. I encourage you to comment on / upvote that thread so that this eventually gets prioritized.

The alternative is to implement your own credential provider with this custom behavior, or change to a different credential loading strategy that can dynamically refresh credentials on your behalf (this would vary based on your needs and architecture).

Since this is the intended behavior, not a bug, and there's already a feature request for it, I feel confident we can close this.
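A minimal sketch of what such a custom provider could look like is shown below. It is an assumption-laden illustration, not the SDK's implementation: `Credentials` is a local stand-in mirroring the fields of `aws.Credentials`, `FileProvider`, `parseProfile` and the `MaxAge` knob are hypothetical, and the parser handles only simple `key = value` profiles. The idea is to re-read the file on every `Retrieve` call and to mark the result as expiring, so that a caching layer asks for it again.

```go
package main

import (
	"bufio"
	"context"
	"fmt"
	"os"
	"strings"
	"time"
)

// Credentials is a local stand-in mirroring the fields of aws.Credentials.
type Credentials struct {
	AccessKeyID, SecretAccessKey, SessionToken string
	CanExpire                                  bool
	Expires                                    time.Time
}

// parseProfile extracts one profile's keys from credentials file text.
func parseProfile(text, profile string) (Credentials, error) {
	var c Credentials
	inProfile := false
	sc := bufio.NewScanner(strings.NewReader(text))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") || strings.HasPrefix(line, ";") {
			continue // blank line or comment
		}
		if strings.HasPrefix(line, "[") && strings.HasSuffix(line, "]") {
			inProfile = strings.TrimSpace(line[1:len(line)-1]) == profile
			continue
		}
		key, value, ok := strings.Cut(line, "=")
		if !inProfile || !ok {
			continue
		}
		switch strings.TrimSpace(key) {
		case "aws_access_key_id":
			c.AccessKeyID = strings.TrimSpace(value)
		case "aws_secret_access_key":
			c.SecretAccessKey = strings.TrimSpace(value)
		case "aws_session_token":
			c.SessionToken = strings.TrimSpace(value)
		}
	}
	if err := sc.Err(); err != nil {
		return Credentials{}, err
	}
	if c.AccessKeyID == "" || c.SecretAccessKey == "" {
		return Credentials{}, fmt.Errorf("profile %q: missing access key or secret", profile)
	}
	return c, nil
}

// FileProvider re-reads the credentials file on every Retrieve call.
type FileProvider struct {
	Path, Profile string
	MaxAge        time.Duration // hypothetical knob: how long a read stays valid
}

// Retrieve has the same shape as the SDK's provider method, but returns the
// local Credentials type defined above.
func (p FileProvider) Retrieve(ctx context.Context) (Credentials, error) {
	data, err := os.ReadFile(p.Path)
	if err != nil {
		return Credentials{}, err
	}
	c, err := parseProfile(string(data), p.Profile)
	if err != nil {
		return Credentials{}, err
	}
	// The file does not record the token's real expiry, so the value is
	// marked as expiring after MaxAge to force a later re-read.
	c.CanExpire = true
	c.Expires = time.Now().Add(p.MaxAge)
	return c, nil
}

func main() {
	f, err := os.CreateTemp("", "credentials-*")
	if err != nil {
		panic(err)
	}
	defer os.Remove(f.Name())
	f.Close()

	p := FileProvider{Path: f.Name(), Profile: "default", MaxAge: 5 * time.Minute}
	for _, token := range []string{"TOKEN1", "TOKEN2"} {
		// Simulate another process rotating the token between calls.
		body := "[default]\naws_access_key_id = FOO\naws_secret_access_key = BAR\naws_session_token = " + token + "\n"
		if err := os.WriteFile(f.Name(), []byte(body), 0o600); err != nil {
			panic(err)
		}
		c, err := p.Retrieve(context.Background())
		if err != nil {
			panic(err)
		}
		fmt.Println(c.SessionToken)
	}
}
```

In a real program the method would return `aws.Credentials` so that the type satisfies `aws.CredentialsProvider`, and the provider could then be supplied through `config.WithCredentialsProvider`, optionally wrapped in `aws.NewCredentialsCache`. The `MaxAge` value is a trade-off: it should be comfortably shorter than the interval at which the external process rotates the token.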

If you have any additional concerns / questions / suggestions, feel free to open a discussion.

Thanks, Ran~
