aws / aws-sdk-js-v3

Modularized AWS SDK for JavaScript.
Apache License 2.0

SDK clients not assuming the role configured in the credentials file #6189

Open ghassen-chetioui opened 2 weeks ago

ghassen-chetioui commented 2 weeks ago


Describe the bug

Our ECS containers are deployed in account A, and we use the following credentials file to allow the SDK clients to access resources in another account, B.

[crossaccount]
role_arn = ***** (arn of the role in account B)
credential_source = EcsContainer

Everything works fine until, at some point, the SDK clients start using the role of the ECS container instead of assuming the one configured in the credentials file.

All the clients are singletons created at application bootstrap and use the default configuration. This looks like a problem that happens on session expiration/renewal, but that is really hard to prove. We have encountered this issue a few times now with the Lambda client and the EventBridge client.
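One way to confirm which identity a client is actually using when the problem appears is to compare the caller ARN (for example, the `Arn` field returned by STS `GetCallerIdentity`) against the `role_arn` from the credentials file. The following is a minimal sketch of that comparison only. The helper names and the example ARNs are hypothetical, and it assumes that an assumed-role ARN carries the role name without its IAM path.

```typescript
// Hypothetical helpers for illustration; they are not part of the AWS SDK.
interface ParsedIdentity {
  accountId: string;
  roleName: string;
}

// Parse an IAM role ARN: arn:<partition>:iam::<account>:role/<optional/path/>RoleName
function parseRoleArn(arn: string): ParsedIdentity | undefined {
  const m = /^arn:[^:]+:iam::(\d{12}):role\/(.+)$/.exec(arn);
  if (!m) return undefined;
  const [, accountId = "", rolePath = ""] = m;
  // Assumption: the assumed-role ARN omits the IAM path, so keep only the last segment.
  const roleName = rolePath.split("/").pop() ?? "";
  return { accountId, roleName };
}

// Parse an STS assumed-role ARN: arn:<partition>:sts::<account>:assumed-role/RoleName/SessionName
function parseAssumedRoleArn(arn: string): ParsedIdentity | undefined {
  const m = /^arn:[^:]+:sts::(\d{12}):assumed-role\/([^/]+)\/.+$/.exec(arn);
  if (!m) return undefined;
  const [, accountId = "", roleName = ""] = m;
  return { accountId, roleName };
}

// True only when the caller is an assumed-role session of the expected role.
function isExpectedRole(callerArn: string, expectedRoleArn: string): boolean {
  const caller = parseAssumedRoleArn(callerArn);
  const expected = parseRoleArn(expectedRoleArn);
  return (
    caller !== undefined &&
    expected !== undefined &&
    caller.accountId === expected.accountId &&
    caller.roleName === expected.roleName
  );
}
```

Calling `isExpectedRole` periodically with the current caller ARN and logging a warning when it returns `false` would timestamp the moment the clients fall back to the container role.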

SDK version number

@aws-sdk/credential-provider-node@3.565.0

Which JavaScript Runtime is this issue in?

Node.js

Details of the browser/Node.js/ReactNative version

v20.12.2

Reproduction Steps

The issue is not reproducible with code

Observed Behavior

SDK clients assuming the role of the ECS container.

Expected Behavior

SDK clients assuming the role configured in the credentials file.

Possible Solution

No response

Additional Information/Context

No response

aBurmeseDev commented 1 week ago

Hi @ghassen-chetioui - thanks for reaching out.

Based on the error, it sounds like you might be missing the permissions needed for cross-account access through a resource-based policy in Lambda. Here's a docs page on how to grant cross-account permissions, and here's another docs page on working with resource-based policies in Lambda.

Hope it helps, but if the issue persists, please let me know.

stevehouel commented 1 week ago

Hi,

The permissions are set up correctly: in 99% of cases everything works, but after some time (the interval is not fixed) the Lambda function loses those permissions and we start getting permission denied errors. After a while, everything comes back and works correctly again. During this in-between period we are unable to use the underlying services because of those permission denied errors.

Maybe a bad lock during credential renewal between the ECS container credentials and the SDK's AssumeRole?

aBurmeseDev commented 1 week ago

Thanks for getting back to us. It sounds like this occurs intermittently. We would need minimal repro code and error logs to give us more insight into the root cause.
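Since the issue is intermittent, logs of each credential resolution may be the easiest evidence to collect. Below is a rough sketch of a logging wrapper around a credential provider. The `Credentials` and `CredentialProvider` types are local stand-ins that mirror the shape of the SDK's credential identity (an assumption made to keep the sketch self-contained), and `summarize` and `withLogging` are hypothetical names.

```typescript
// Local stand-ins mirroring the SDK's credential shape (assumption, not imported from the SDK).
interface Credentials {
  accessKeyId: string;
  secretAccessKey: string;
  sessionToken?: string;
  expiration?: Date;
}
type CredentialProvider = () => Promise<Credentials>;

// Build a log-safe summary: only the last 4 characters of the key id and the
// remaining lifetime are included; the secret key and session token never are.
function summarize(creds: Credentials, now: Date = new Date()): string {
  const keySuffix = creds.accessKeyId.slice(-4);
  const expiresIn = creds.expiration
    ? `${Math.round((creds.expiration.getTime() - now.getTime()) / 1000)}s`
    : "none";
  return `accessKeyId=****${keySuffix} expiresIn=${expiresIn}`;
}

// Wrap a provider so every resolution (initial or refresh) and every failure is logged.
function withLogging(
  provider: CredentialProvider,
  log: (line: string) => void = console.log
): CredentialProvider {
  return async () => {
    try {
      const creds = await provider();
      log(`[credentials] resolved ${summarize(creds)}`);
      return creds;
    } catch (err) {
      log(`[credentials] resolution failed: ${String(err)}`);
      throw err; // Rethrow so the caller still sees the original failure.
    }
  };
}
```

A wrapped provider could then be passed as the `credentials` option of a client, for instance wrapping `fromIni({ profile: "crossaccount" })` from `@aws-sdk/credential-providers`. A change in the logged key suffix around the time the errors start would indicate that a different credential source was picked up.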