aws / aws-sdk-cpp

AWS SDK for C++
Apache License 2.0

IMDSv2 token not being handled properly #2823

Closed: agabhin closed this issue 7 months ago

agabhin commented 9 months ago

Describe the bug

We are intermittently seeing an ExpiredToken error. We use IMDSv2 and cache S3Client instances: we maintain a pool of 100 S3Clients and reuse them. The HTTP status code returned is 400, not a 5xx. The clients are configured as follows:

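Aws::Client::ClientConfiguration client_config;
// client_config sets the following fields:
//   region
//   connectTimeoutMs = 10000
//   requestTimeoutMs = 60000
//   maxConnections = 100
// Assumption: "ALLOC_IAM" is an allocation tag, so each pooled client is
// presumably created via Aws::MakeShared.
auto s3_client = Aws::MakeShared<Aws::S3::S3Client>("ALLOC_IAM", client_config);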

Expected Behavior

The SDK should handle token management internally.

Current Behavior

AWS ERROR: ExpiredToken: Unable to parse ExceptionName: ExpiredToken Message: The provided token has expired.. HTTP status code:  400. AWS Error Type:  100

Reproduction Steps

NA

Possible Solution

Creating new S3Client sometimes seems to help.
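
The workaround above could be automated roughly as follows. This is only a sketch under stated assumptions: `Result`, `callWithRecreate`, the factory, and the operation callback are hypothetical stand-ins, not AWS SDK types. In real code the check would be made against the error name the SDK reports (the log above shows `ExpiredToken`).

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical stand-in for the outcome of an S3 call; not an AWS SDK type.
struct Result {
    bool ok;
    std::string errorName;  // e.g. "ExpiredToken", as seen in the log above
};

// Runs `op` against `client`. If the call fails with ExpiredToken, replaces
// the client with a fresh one from `makeClient` and retries exactly once.
// Any other outcome is returned unchanged.
template <typename Client, typename Factory, typename Op>
Result callWithRecreate(std::shared_ptr<Client>& client,
                        Factory makeClient,
                        Op op) {
    assert(client != nullptr);
    Result first = op(*client);
    if (first.ok || first.errorName != "ExpiredToken") {
        return first;
    }
    client = makeClient();  // drop the client that holds stale credentials
    assert(client != nullptr);
    return op(*client);
}
```

The single retry is deliberate: if a freshly created client also reports ExpiredToken, the problem is likely upstream of the client and looping would only hide it.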

Additional Information/Context

No response

AWS CPP SDK version used

1.8.187

Compiler and Version used

g++ (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0

Operating System and version

Ubuntu 20.04.4 LTS

jmklix commented 9 months ago

Can you update to the current version of this SDK? 1.8 is a very old version, which might be causing the errors you are seeing with IMDSv2 tokens.

agabhin commented 9 months ago

Upgrading the SDK is very difficult because the issue only happens on a customer cluster and we are unable to reproduce it locally. The one mitigation we have found so far is to restart the application, which clears out the SDK state. We have seen the issue persist for 2 days and then resolve immediately upon restart.

jmklix commented 9 months ago

Why exactly is it difficult to upgrade to the latest version of this SDK? If this isn't already fixed in the latest version, any additional fix would be added to the next minor version and would require the customer to update anyway. Is there a specific problem with upgrading that I might be able to help with?

agabhin commented 9 months ago

We are going to upgrade the SDK. We suspect that the issue is due to EC2 returning an expired token. This is the fix: https://github.com/aws/aws-sdk-cpp/commit/c97a7bd4a302fd87a7bf056995bc8910b1329798
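
The linked commit is not reproduced here. As a generic illustration of guarding against a credential source that hands back credentials that are already expired or about to expire, a check might look like the sketch below. The function name `needsRefresh` and the grace window are assumptions made for this example, not SDK names or SDK defaults.

```cpp
#include <cassert>
#include <chrono>

using Clock = std::chrono::system_clock;

// Returns true when credentials that expire at `expiry` should be refetched:
// either they have already expired, or they expire within `grace` of `now`.
inline bool needsRefresh(Clock::time_point expiry,
                         Clock::time_point now,
                         std::chrono::seconds grace) {
    assert(grace.count() >= 0);
    return expiry - now <= grace;
}
```

A caller would run this check on every credentials lookup, so that an expired value returned by the metadata service is retried rather than cached until the process restarts.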

github-actions[bot] commented 9 months ago

Greetings! It looks like this issue hasn’t been active in longer than a week. We encourage you to check if this is still an issue in the latest release. Because it has been longer than a week since the last update on this, and in the absence of more information, we will be closing this issue soon. If you find that this is still a problem, please feel free to provide a comment or add an upvote to prevent automatic closure, or if the issue is already closed, please feel free to open a new one.

agabhin commented 9 months ago

What is the recommended version of the SDK? We tried upgrading to 1.11.253, but we see a regression of around 5-10% in S3Client write performance. Performance seems fine with v1.11.0.

jmklix commented 9 months ago

We generally recommend using the latest version of the SDK, which is 1.11.263. But I am interested in investigating the performance regression that you are seeing. How exactly are you testing this? A minimal code sample would be best.
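
A minimal timing harness for such a sample might look like the sketch below. It is not taken from the thread: `meanMillisPerWrite` is a hypothetical helper, and the `write` callback is a placeholder for whatever single S3 write (for example one PutObject call) the application performs.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>

// Calls `write` `iterations` times and returns the mean wall-clock time per
// call in milliseconds. `write` receives the iteration index and is supplied
// by the caller; it stands in for one S3 write.
template <typename WriteFn>
double meanMillisPerWrite(WriteFn write, std::size_t iterations) {
    assert(iterations > 0);
    const auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < iterations; ++i) {
        write(i);
    }
    const std::chrono::duration<double, std::milli> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count() / static_cast<double>(iterations);
}
```

Running the same harness against builds linked to 1.11.0 and 1.11.253, with identical object sizes and concurrency, would make the reported 5-10% difference directly comparable.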
