aws / aws-sdk-js-v3

Modularized AWS SDK for JavaScript.
Apache License 2.0
3.12k stars 579 forks source link

DefaultCredentialsProvider not using maxAttempts configuration #4328

Open GovernmentHack opened 1 year ago

GovernmentHack commented 1 year ago

Checkboxes for prior research

Describe the bug

The AWS SQS Client does not retry to get credentials when maxAttempts is set in the initial client configuration.

I believe the DefaultProvider credentials provider from the credential-provider-node which is used in the SQS Client does not use the maxAttempts configuration parameter, and therefore defaults to 0 retries always.

From digging through the source code, I believe that when the defaultCredentialsProvider (initially set here) is accessed (used here), and given the initial configuration (see the init value passed through the chain), the configured maxAttempts as a parameter for how many times to retry (defined as a type here) is not used.

Instead, the parameter maxRetries from the providerConfigFromInit() function (here) is used for retries in the credentials provider. Since this parameter is not set, nor part of the type definition, it is then defaulted to the default value of 0 (default and fallback logic here), therefore the credentials provider is never retried.

SDK version number

@aws-sdk/client-sqs@3.231.0

Which JavaScript Runtime is this issue in?

Node.js

Details of the browser/Node.js/ReactNative version

v18.12.1

Reproduction Steps

* Reproduction is tricky, as it requires a cloud environment (node server hosted on ECS, with credentials provided by IAM), and the credentials provider to be "flakey", (e.g. it fails a request for credentials once every so often)

Simply make an SQS client request like such:

const sqsClient = new SQSClient({
  endpoint: MY_ENDPOINT,
  region: MY_REGION,
  retryMode: "adaptive",
  maxAttempts: 5,
  requestHandler: getRequestHandler(),
  credentialDefaultProvider: (input: any) => fromNodeProviderChain({ ...input, maxRetries: 3 }),
});

const receiveMessageRequestPayload: ReceiveMessageCommandInput = {
  QueueUrl: MY_QUEUE_URL,
  MaxNumberOfMessages: 1,
};
const receiveMessageRequest = new ReceiveMessageCommand(receiveMessageRequestPayload);
const sqsMessageResponse = await sqsClient.send(receiveMessageRequest);

Observed Behavior

When we get the transient errors, they are as follows:

error: {
     $metadata: {
       attempts: 1
       totalRetryDelay: 0
     }
     message: Could not load credentials from any providers
     name: CredentialsProviderError
     stack: CredentialsProviderError: Could not load credentials from any providers
    at /usr/src/service/node_modules/@aws-sdk/credential-provider-node/dist-cjs/defaultProvider.js:13:11
    at /usr/src/service/node_modules/@aws-sdk/property-provider/dist-cjs/chain.js:11:28
    at processTicksAndRejections (node:internal/process/task_queues:95:5)
    at coalesceProvider (/usr/src/service/node_modules/@aws-sdk/property-provider/dist-cjs/memoize.js:14:24)
    at SignatureV4.credentialProvider (/usr/src/service/node_modules/@aws-sdk/property-provider/dist-cjs/memoize.js:33:24)
    at SignatureV4.signRequest (/usr/src/service/node_modules/@aws-sdk/signature-v4/dist-cjs/SignatureV4.js:86:29)
    at /usr/src/service/node_modules/@aws-sdk/middleware-signing/dist-cjs/middleware.js:16:18
    at /usr/src/service/node_modules/@aws-sdk/middleware-retry/dist-cjs/retryMiddleware.js:27:46
    at /usr/src/service/node_modules/@aws-sdk/middleware-sdk-sqs/dist-cjs/receive-message.js:7:22
    at /usr/src/service/node_modules/@aws-sdk/middleware-logger/dist-cjs/loggerMiddleware.js:5:22
     tryNextLink: false
   }

Expected Behavior

We expect more than 1 attempt to be made in trying to get credentials.

Ideally an attempt would be successful. But if none were, we'd still see the error.$metadata.attempts value to be 5 given our maxAttempts configuration.

Possible Solution

We have been able to overcome this issue via 2 work-arounds:

  1. We can force a retry of the request based on that error like such:
    } catch (error) {
    if (error.name === CREDENTIALS_PROVIDER_ERROR) {
    // Try the request again
  2. We can manually define the credentials provider with a wrapper that injects the correct configuration parameter:
    
    import { fromNodeProviderChain } from "@aws-sdk/credential-providers";

const sqsClient = new SQSClient({ endpoint: MY_ENDPOINT, region: MY_REGION, maxAttempts: 5, credentialDefaultProvider: (input: any) => fromNodeProviderChain({ ...input, maxRetries: 3 }), });



### Additional Information/Context

Workaround was inspired by [issue#3776](https://github.com/aws/aws-sdk-js-v3/issues/3776)
RanVaknin commented 1 year ago

Hi @GovernmentHack ,

The client will only attempt a retry if the error that is being thrown is a Retryable Error. Client errors are not retryable, so it doesn't matter which number of retries you have there. In your first workaround, you are effectively forcing the retryer to treat CREDENTIALS_PROVIDER_ERROR as a retryable error.

Is there a specific reason of why you are setting your credential provider this way? Why not just configure your ~/.aws/credentials file and let the SDK credential chain resolve those automatically?

Thanks, Ran~

GovernmentHack commented 1 year ago

Why not just configure your ~/.aws/credentials file and let the SDK credential chain resolve those automatically?

Our production environment provides the credentials from an IAM policy, and by using the fromNodeProviderChain we're still essentially using the provider chain, while "injecting" the missing parameter which typescript otherwise blocks us from setting.

The client will only attempt a retry if the error that is being thrown is a Retryable Error

I've seen that logic with the retry middleware, but it seems the credentials provider doesn't use that. I believe ultimately the remoteProvider() is called from the provider chain based on our usage of IAM policies, which will call the fromInstanceMetadata() provider, right? That provider doesn't seem to use the same retryable logic used by the "@aws-sdk/service-error-classification" package, which is what the middleware-retry package seems to use. It instead uses this retry function code snippet from the getInstanceImdsProvider() used by fromInstanceMetadata() code:

import { retry } from "./remoteProvider/retry";

// ...

  const getCredentials = async (maxRetries: number, options: RequestOptions) => {
    const profile = (
      await retry<string>(async () => {
        let profile: string;
        try {
          profile = await getProfile(options);
        } catch (err) {
          if (err.statusCode === 401) {
            disableFetchToken = false;
          }
          throw err;
        }
        return profile;
      }, maxRetries)
    ).trim();

    return retry(async () => {
      let creds: AwsCredentialIdentity;
      try {
        creds = await getCredentialsFromProfile(profile, options);
      } catch (err) {
        if (err.statusCode === 401) {
          disableFetchToken = false;
        }
        throw err;
      }
      return creds;
    }, maxRetries);
  };
sam-super commented 1 year ago

Just had the same issue with "@aws-sdk/client-sqs": "^3.245.0" (thanks to @RanVaknin for the code to manually setup the node provider - worked a treat!).

For more context: we are using ECS, and i assume the credentials provider chain needs to be able to retry because the container can't initially access the aws endpoint it needs to get it's auth tokens. Could be considered a bug in ECS for that reason, but this is working for us in other v3 SDKs (and v2 in general).

arash-cid commented 9 months ago

Same issue with

"@aws-sdk/client-secrets-manager": "^3.496.0"