aws / amazon-ecs-agent

Amazon Elastic Container Service Agent
http://aws.amazon.com/ecs/
Apache License 2.0
2.08k stars 616 forks

Implement credentials chain for aws-sdk-go-v2 #4424

Closed tinnywang closed 1 week ago

tinnywang commented 2 weeks ago

Summary

This PR implements a credentials chain that is compatible with aws-sdk-go-v2. It is a prerequisite for migrating our clients to aws-sdk-go-v2 because the credentials interface has changed between v1 and v2 of the SDK, and v1 clients/credentials are not compatible with v2 clients/credentials and vice versa.

For now, we only use the v2 credential providers when fetching preflight creds during container instance registration.

Note: There may be changes in Agent behavior caused by small changes between v1 and v2 of the SDK, such as changing the order of precedence of env vars that determine config values. Unless these changes are known to break existing functionality in Agent, we will consume them as-is.

Implementation details

RotatingSharedCredentialsProviderV2

https://github.com/aws/amazon-ecs-agent/blob/ea4ffcdd5a69b36c5a7216938949fae0b26d2b0f/ecs-agent/credentials/providers/rotating_shared_credentials_provider.go#L56-L58

The SharedCredentialsProvider from v1 does not exist as a standalone credentials provider in v2. To load shared credentials in aws-sdk-go-v2, use config.LoadSharedConfigProfile.

The docs use

config.LoadDefaultConfig(
    context.TODO(),
    config.WithSharedCredentialsFiles(...),
    config.WithSharedConfigFiles(...),
)

to load shared creds and configs from non-default locations, but config.LoadDefaultConfig checks env vars before shared config and credentials files, which we do not want in this case.

InstanceCredentialsProvider

Linux and non-ECS-A Windows

This is a credentials chain that consists of the default credentials chain plus RotatingSharedCredentialsProviderV2.

https://github.com/aws/amazon-ecs-agent/blob/ea4ffcdd5a69b36c5a7216938949fae0b26d2b0f/ecs-agent/credentials/instancecreds/instancecreds_linux.go#L38-L43

The default credentials chain was accessible in v1 via default.CredProviders, but it is not directly accessible in v2. To access it in v2, call config.LoadDefaultConfig and use the Config.Credentials field. Per the docs,

When you initialize an aws.Config instance using config.LoadDefaultConfig, the SDK uses its default credential chain to find AWS credentials. This default credential chain looks for credentials in the following order:

  1. Environment variables.
    1. Static Credentials (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_SESSION_TOKEN)
    2. Web Identity Token (AWS_WEB_IDENTITY_TOKEN_FILE)
  2. Shared configuration files.
    1. SDK defaults to credentials file under .aws folder that is placed in the home folder on your computer.
    2. SDK defaults to config file under .aws folder that is placed in the home folder on your computer.
  3. If your application uses an ECS task definition or RunTask API operation, IAM role for tasks.
  4. If your application is running on an Amazon EC2 instance, IAM role for Amazon EC2.

credentials.ChainProvider no longer exists in v2, so we reimplement it in InstanceCredentialsProvider.Retrieve.
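To illustrate what reimplementing a chain provider involves, here is a minimal, stdlib-only sketch. It is not the code from this PR: Credentials, Provider, staticProvider, chainProvider, and demoChain are hypothetical names, with Credentials and Provider only mirroring the shape of aws.Credentials and aws.CredentialsProvider from aws-sdk-go-v2. The assumed behavior is "try each provider in order and return the first success".

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// Credentials and Provider are local stand-ins (assumptions) that mirror the
// shape of aws.Credentials and aws.CredentialsProvider in aws-sdk-go-v2.
type Credentials struct {
	AccessKeyID     string
	SecretAccessKey string
	Source          string
}

type Provider interface {
	Retrieve(ctx context.Context) (Credentials, error)
}

// staticProvider is a hypothetical stub that returns fixed credentials or a
// fixed error, standing in for a real credentials source.
type staticProvider struct {
	creds Credentials
	err   error
}

func (p staticProvider) Retrieve(context.Context) (Credentials, error) {
	return p.creds, p.err
}

// chainProvider tries each provider in order and returns the first success.
type chainProvider struct {
	providers []Provider
}

func (c chainProvider) Retrieve(ctx context.Context) (Credentials, error) {
	err := errors.New("credentials chain is empty")
	for _, p := range c.providers {
		creds, e := p.Retrieve(ctx)
		if e == nil {
			return creds, nil
		}
		err = e // remember the most recent failure for the final error
	}
	return Credentials{}, fmt.Errorf("no provider in chain returned credentials: %w", err)
}

// demoChain builds a two-link chain whose first link fails, so the result
// comes from the second link.
func demoChain() (Credentials, error) {
	chain := chainProvider{providers: []Provider{
		staticProvider{err: errors.New("default chain: no credentials found")},
		staticProvider{creds: Credentials{
			AccessKeyID:     "AKIDEXAMPLE",
			SecretAccessKey: "example-secret",
			Source:          "RotatingSharedCredentialsProviderV2",
		}},
	}}
	return chain.Retrieve(context.Background())
}

func main() {
	creds, err := demoChain()
	fmt.Println(creds.Source, err)
}
```

The sketch keeps only the last error for brevity; a fuller implementation might collect every provider's error to make failures easier to debug.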

ECS-A Windows

The credentials chain tries RotatingSharedCredentialsProviderV2 before the shared credentials file. Our existing implementation accomplishes this by reordering the default credentials chain, inserting the rotating shared creds before the default shared creds.

As previously mentioned, we can't directly access the default credentials chain in v2, so we have to manually load credentials from various sources in the desired order:

  1. EnvConfig.Credentials
  2. RotatingSharedCredentialsProviderV2
  3. SharedConfig.Credentials - We load EnvConfig again (separately from step 1) before SharedConfig in case AWS_PROFILE and AWS_SHARED_CREDENTIALS_FILE are set, because LoadSharedConfigProfile does not check these env vars automatically. It loads only from the files explicitly passed as args, or from ~/.aws if none are passed.
  4. ec2rolecreds.Provider
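The env var handling in step 3 can be sketched as follows. This is a stdlib-only illustration, not the PR's code: the PR reads these values through the SDK's EnvConfig, whereas resolveSharedConfigInputs, sharedConfigInputs, and demoResolve are hypothetical names, and the profile name and file path in the demo are made up. The sketch only models the idea of resolving the profile and credentials file from the environment before loading the shared config.

```go
package main

import "fmt"

// sharedConfigInputs holds the values that would then be handed to the
// shared config loader (hypothetical type for this sketch).
type sharedConfigInputs struct {
	Profile          string
	CredentialsFiles []string // empty means "fall back to the default ~/.aws location"
}

// resolveSharedConfigInputs is a hypothetical helper. getenv is injected so
// the logic can be exercised without touching the process environment.
func resolveSharedConfigInputs(getenv func(string) string) sharedConfigInputs {
	in := sharedConfigInputs{Profile: "default"}
	if p := getenv("AWS_PROFILE"); p != "" {
		in.Profile = p
	}
	if f := getenv("AWS_SHARED_CREDENTIALS_FILE"); f != "" {
		in.CredentialsFiles = []string{f}
	}
	return in
}

// demoResolve runs the helper against a fixed, made-up environment.
func demoResolve() sharedConfigInputs {
	env := map[string]string{
		"AWS_PROFILE":                 "ecs-anywhere",
		"AWS_SHARED_CREDENTIALS_FILE": "/tmp/example-credentials",
	}
	return resolveSharedConfigInputs(func(k string) string { return env[k] })
}

func main() {
	fmt.Printf("%+v\n", demoResolve())
	// With nothing set, the defaults are kept.
	fmt.Printf("%+v\n", resolveSharedConfigInputs(func(string) string { return "" }))
}
```

In real code, os.Getenv could be passed as the getenv argument.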

Testing

New tests cover the changes: yes

Description for the changelog

Implement credentials chain for aws-sdk-go-v2.

Licensing

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

prateekchaudhry commented 2 weeks ago

Thank you for updating these. I have some general questions for my own clarity:

tinnywang commented 2 weeks ago

When are rotating creds used?

Rotating creds are used with ECS-A. /root/.aws on the host is volume-mounted to /rotatingcreds on the container, and (I believe) SSM periodically rotates the creds at /root/.aws/credentials.

https://github.com/aws/amazon-ecs-agent/blob/ea4ffcdd5a69b36c5a7216938949fae0b26d2b0f/ecs-init/docker/docker.go#L463-L466


What happens when credentials have expired? How do the credentials get renewed? (Is Retrieve called again?)

If the credentials provider is wrapped in a credentials cache and the creds have expired, the cache will retrieve the creds again.

Per the docs, "If the credentials have already been retrieved, and not expired the cached credentials will be returned. If the credentials have not been retrieved yet, or expired the provider's Retrieve method will be called."

Without a credentials cache, "the SDK will attempt to retrieve the credentials for every request".

I forgot to wrap the credentials provider that fetches preflight creds in a cache. I've updated it in https://github.com/aws/amazon-ecs-agent/pull/4424/commits/039502620530de193b23d7288a039bd377b31173.
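The caching behavior described in this answer can be sketched with the standard library alone. This is not the SDK's implementation (in aws-sdk-go-v2 the wrapper is created with aws.NewCredentialsCache): Credentials, Provider, countingProvider, cache, and demoCache are hypothetical stand-ins, and a fake clock is used so that expiry can be simulated. Unlike the real cache, this sketch is not safe for concurrent use.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// Credentials and Provider are local stand-ins (assumptions) mirroring the
// shape of the SDK's credentials types.
type Credentials struct {
	AccessKeyID string
	CanExpire   bool
	Expires     time.Time
}

type Provider interface {
	Retrieve(ctx context.Context) (Credentials, error)
}

// countingProvider is a hypothetical stub that issues credentials valid for
// ttl and counts how often it is called.
type countingProvider struct {
	calls int
	ttl   time.Duration
	now   func() time.Time
}

func (p *countingProvider) Retrieve(context.Context) (Credentials, error) {
	p.calls++
	return Credentials{
		AccessKeyID: fmt.Sprintf("AKID-%d", p.calls),
		CanExpire:   true,
		Expires:     p.now().Add(p.ttl),
	}, nil
}

// cache returns the stored credentials until they expire, then calls the
// wrapped provider again. Not concurrency-safe; illustration only.
type cache struct {
	provider Provider
	now      func() time.Time
	creds    Credentials
	loaded   bool
}

func (c *cache) Retrieve(ctx context.Context) (Credentials, error) {
	expired := c.creds.CanExpire && !c.now().Before(c.creds.Expires)
	if c.loaded && !expired {
		return c.creds, nil
	}
	creds, err := c.provider.Retrieve(ctx)
	if err != nil {
		return Credentials{}, err
	}
	c.creds, c.loaded = creds, true
	return creds, nil
}

// demoCache retrieves twice before expiry and once after it, returning the
// number of provider calls and the access key IDs that were handed out.
func demoCache() (calls int, ids []string) {
	clock := time.Date(2024, 1, 1, 0, 0, 0, 0, time.UTC)
	now := func() time.Time { return clock }
	p := &countingProvider{ttl: time.Hour, now: now}
	c := &cache{provider: p, now: now}

	for i := 0; i < 2; i++ { // both calls fall inside the one-hour validity window
		creds, _ := c.Retrieve(context.Background())
		ids = append(ids, creds.AccessKeyID)
	}
	clock = clock.Add(2 * time.Hour) // move the fake clock past expiry
	creds, _ := c.Retrieve(context.Background())
	ids = append(ids, creds.AccessKeyID)
	return p.calls, ids
}

func main() {
	calls, ids := demoCache()
	fmt.Println(calls, ids)
}
```

Without the cache wrapper, every one of the three Retrieve calls in demoCache would reach the underlying provider, which corresponds to the "every request" behavior quoted above.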


There are other places where V1 creds are being used, right? Is it okay to use both V1 and V2 creds provider simultaneously, especially because calls to both V1 and V2 Retrieve may set expiry differently?

Yes, wherever we're using an SDKv1 client, we need to use the v1 credentials provider because there's no inter-compatibility between v1 and v2 interfaces. But it's ok to mix v1 and v2 creds because they're ultimately being fetched from the same underlying source (IAM role, shared creds file, env vars, etc.). The underlying source is responsible for rotating creds when they expire, if that's something it supports. The credentials provider is responsible for retrieving creds from the underlying source when the cached creds are expired. But if the creds from the underlying source expire and are never refreshed, the provider will end up returning invalid creds.