aws / aws-sdk-net

The official AWS SDK for .NET. For more information on the AWS SDK for .NET, see our web site:
http://aws.amazon.com/sdkfornet/
Apache License 2.0

SDK does not seem to support EKS IAM for service accounts #1413

Closed JonCubed closed 4 years ago

JonCubed commented 4 years ago

I'm trying to get a .NET Core app to work with EKS's new support for IAM Roles for Service Accounts. I've followed these instructions.

This app reads from an SQS queue and was previously working with kiam. AWSSDK has been updated to the latest stable release, which is newer than the minimum supported version specified here. We have a Java app that is working with IAM for Service Accounts, so I don't think this is an issue with the setup.

My understanding is that a token, which is a Kubernetes secret, is mounted into the pod and its path is stored in the environment variable AWS_WEB_IDENTITY_TOKEN_FILE. I can confirm that both the environment variable and the mount exist when I describe the Kubernetes pod. According to the docs, the credential chain is meant to check whether this token exists first. However, I don't think that is happening: from the logs, it looks like the SDK is trying to hit the metadata endpoint, which doesn't exist.
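The pod-side prerequisites described above can be checked from inside the container. The following is a minimal sketch in Python, used here only because it needs no extra packages. It assumes the pod is given two environment variables, AWS_WEB_IDENTITY_TOKEN_FILE (named in this issue) and AWS_ROLE_ARN (not mentioned in this thread; an assumption based on how EKS configures IAM Roles for Service Accounts). The function name check_web_identity_env is hypothetical and is not part of any SDK.

```python
import os


def check_web_identity_env(environ=None):
    """Return a list of problems that would prevent web identity credentials from being used.

    An empty list means the environment looks ready. This only inspects the
    environment and the token file; it does not contact STS.
    """
    env = os.environ if environ is None else environ
    problems = []

    token_file = env.get("AWS_WEB_IDENTITY_TOKEN_FILE")
    role_arn = env.get("AWS_ROLE_ARN")  # assumed variable name, see lead-in

    if not token_file:
        problems.append("AWS_WEB_IDENTITY_TOKEN_FILE is not set")
    elif not os.path.isfile(token_file):
        problems.append("token file does not exist: " + token_file)
    elif not os.access(token_file, os.R_OK):
        problems.append("token file is not readable: " + token_file)
    elif os.path.getsize(token_file) == 0:
        problems.append("token file is empty: " + token_file)

    if not role_arn:
        problems.append("AWS_ROLE_ARN is not set")

    return problems


if __name__ == "__main__":
    issues = check_web_identity_env()
    if issues:
        for issue in issues:
            print("PROBLEM:", issue)
    else:
        print("Environment looks ready for web identity credentials")
```

If this check passes but the SDK still calls the instance metadata endpoint, the problem is more likely in the SDK's credential resolution than in the pod setup, which is what the rest of this thread goes on to establish.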

Expected Behavior

AWSSDK should be able to use the IAM role for a service account within a pod.

Current Behavior

My application logs contain the following error:

{
    "@t": "2019-09-26T07:29:04.2524656Z",
    "@m": "Error when receiving messages",
    "@i": "2d7a14a6",
    "@l": "Warning",
    "@x": "System.Net.Http.HttpRequestException: Response status code does not indicate success: 404 (Not Found).\n   at HttpResponseMessage System.Net.Http.HttpResponseMessage.EnsureSuccessStatusCode()\n   at async Task<string> System.Net.Http.HttpClient.GetStringAsyncCore(Task<HttpResponseMessage> getTask)\n   at void Amazon.Runtime.Internal.Util.AsyncHelpers+<>c__DisplayClass1_1<T>+<<RunSync>b__0>d.MoveNext()\n   at void Amazon.Runtime.Internal.Util.AsyncHelpers+ExclusiveSynchronizationContext.BeginMessageLoop()\n   at T Amazon.Runtime.Internal.Util.AsyncHelpers.RunSync<T>(Func<Task<T>> task)\n   at string Amazon.Util.AWSSDKUtils.DownloadStringContent(Uri uri, TimeSpan timeout, IWebProxy proxy)\n   at List<string> Amazon.Util.EC2InstanceMetadata.GetItems(string relativeOrAbsolutePath, int tries, bool slurp)\n   at IDictionary<string, IAMSecurityCredentialMetadata> Amazon.Util.EC2InstanceMetadata.get_IAMSecurityCredentials()\n   at ImmutableCredentials Amazon.Runtime.DefaultInstanceProfileAWSCredentials.FetchCredentials()\n   at ImmutableCredentials Amazon.Runtime.DefaultInstanceProfileAWSCredentials.GetCredentials()\n   at Task<ImmutableCredentials> Amazon.Runtime.DefaultInstanceProfileAWSCredentials.GetCredentialsAsync()\n   at async Task<T> Amazon.Runtime.Internal.CredentialsRetriever.InvokeAsync<T>(IExecutionContext executionContext)\n   at async Task<T> Amazon.Runtime.Internal.RetryHandler.InvokeAsync<T>(IExecutionContext executionContext)\n   at async Task<T> Amazon.Runtime.Internal.RetryHandler.InvokeAsync<T>(IExecutionContext executionContext)\n   at async Task<T> Amazon.Runtime.Internal.CallbackHandler.InvokeAsync<T>(IExecutionContext executionContext)\n   at async Task<T> Amazon.Runtime.Internal.CallbackHandler.InvokeAsync<T>(IExecutionContext executionContext)\n   at async Task<T> Amazon.Runtime.Internal.ErrorCallbackHandler.InvokeAsync<T>(IExecutionContext executionContext)\n   at async Task<T> 
Amazon.Runtime.Internal.MetricsHandler.InvokeAsync<T>(IExecutionContext executionContext)\n   at void Polly.CircuitBreaker.AsyncCircuitBreakerPolicy+<>c__DisplayClass8_0<TResult>+<<ImplementationAsync>b__0>d.MoveNext()\n   at async Task<TResult> Polly.CircuitBreaker.AsyncCircuitBreakerEngine.ImplementationAsync<TResult>(Func<Context, CancellationToken, Task<TResult>> action, Context context, CancellationToken cancellationToken, bool continueOnCapturedContext, ExceptionPredicates shouldHandleExceptionPredicates, ResultPredicates<TResult> shouldHandleResultPredicates, ICircuitController<TResult> breakerController)\n   at async Task<TResult> Polly.CircuitBreaker.AsyncCircuitBreakerPolicy.ImplementationAsync<TResult>(Func<Context, CancellationToken, Task<TResult>> action, Context context, CancellationToken cancellationToken, bool continueOnCapturedContext)\n   at async Task<TResult> Polly.AsyncPolicy.ExecuteAsync<TResult>(Func<Context, CancellationToken, Task<TResult>> action, Context context, CancellationToken cancellationToken, bool continueOnCapturedContext)\n   at void Polly.Wrap.AsyncPolicyWrapEngine+<>c__DisplayClass3_0<TResult>+<<ImplementationAsync>b__0>d.MoveNext()\n   at async Task<TResult> Polly.Retry.AsyncRetryEngine.ImplementationAsync<TResult>(Func<Context, CancellationToken, Task<TResult>> action, Context context, CancellationToken cancellationToken, ExceptionPredicates shouldRetryExceptionPredicates, ResultPredicates<TResult> shouldRetryResultPredicates, Func<DelegateResult<TResult>, TimeSpan, int, Context, Task> onRetryAsync, int permittedRetryCount, IEnumerable<TimeSpan> sleepDurationsEnumerable, Func<int, DelegateResult<TResult>, Context, TimeSpan> sleepDurationProvider, bool continueOnCapturedContext)\n   at async Task<TResult> Polly.AsyncPolicy.ExecuteAsync<TResult>(Func<Context, CancellationToken, Task<TResult>> action, Context context, CancellationToken cancellationToken, bool continueOnCapturedContext)\n   at async Task<TResult> 
Polly.Wrap.AsyncPolicyWrapEngine.ImplementationAsync<TResult>(Func<Context, CancellationToken, Task<TResult>> func, Context context, CancellationToken cancellationToken, bool continueOnCapturedContext, IAsyncPolicy outerPolicy, IAsyncPolicy innerPolicy)\n   at async Task<TResult> Polly.AsyncPolicy.ExecuteAsync<TResult>(Func<Context, CancellationToken, Task<TResult>> action, Context context, CancellationToken cancellationToken, bool continueOnCapturedContext)\n   at async Task MyApp.SqsReceiverService.ReceiveMessagesAsync(Uri queueUri, CancellationToken stoppingToken) in /workspace/src/MyApp/SqsReceiverService.cs:line 94",
    "EventId": {
        "Id": 6,
        "Name": "ErrorReceivingMessages"
    },
    "SourceContext": "MyApp.SqsReceiverService",
    "QueueUrl": "https://sqs.us-west-2.amazonaws.com/209652386498/build-nurture-myapp-Queue"
} 

From my Istio proxy access logs, I can see it try the metadata endpoint but never STS:

[2019-09-26T07:27:34.180Z] "GET /latest/meta-data/iam/security-credentials HTTP/1.1" 301 - "-" "-" 0 78 0 0 "-" "aws-sdk-dotnet-coreclr/ aws-sdk-dotnet-core/3.3.103.43 .NET_Core/4.6.28008.02 OS/Linux_4.14.138-114.102.amzn2.x86_64_#1_SMP_Thu_Aug_15_15:29:58_UTC_2019" "8c07624f-7f18-4573-bc02-2924fb0c9f87" "169.254.169.254" "169.254.169.254:80" outbound|80||metadata.local - 169.254.169.254:80 10.18.71.48:40337 -          
[2019-09-26T07:27:34.181Z] "GET /latest/meta-data/iam/security-credentials/ HTTP/1.1" 404 - "-" "-" 0 11 1 0 "-" "aws-sdk-dotnet-coreclr/ aws-sdk-dotnet-core/3.3.103.43 .NET_Core/4.6.28008.02 OS/Linux_4.14.138-114.102.amzn2.x86_64_#1_SMP_Thu_Aug_15_15:29:58_UTC_2019" "98e80d0a-80fa-447d-85ef-5f7fd54963aa" "169.254.169.254" "169.254.169.254:80" outbound|80||metadata.local - 169.254.169.254:80 10.18.71.48:40337 -

Your Environment

AWSSDK.SecretsManager: 3.3.101.29
AWSSDK.SQS: 3.3.102.11
AWSSDK.SecurityToken: 3.3.102.28

Running in mcr.microsoft.com/dotnet/core/runtime:2.2-alpine3.9 on EKS 1.4.

JonCubed commented 4 years ago

@klaytaybai I think this is actually a bug, as the documentation says this is supported from release 3.3.580.0.

klaytaybai commented 4 years ago

Hi @JonCubed, can you please clarify what you mean when you say that a document is saying that it is supported? Relevant generated code was added then based on the model from the EKS service, but to get this added into the credentials chain is going to require some extra custom work. We're planning on adding this fairly soon. Based on the reactions to your post so far, it seems that this may be a rather high priority for many customers. If anybody wishes to add more support for prioritizing this work, please add a comment or an emoji reaction to the post.

JonCubed commented 4 years ago

@klaytaybai this page details the minimum SDK versions that are meant to support this feature.

BEvgeniyS commented 4 years ago

He's referring to the EKS documentation: https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts-technical-overview.html

"Supported versions of the AWS SDK look for these environment variables first in the credential chain provider. The role credentials are used for pods that meet this criteria."

The "Supported SDKs" page (https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts-minimum-sdk.html) shows that .NET v3.3.580.0+ is indeed supported.

This statement appears to be false then?

klaytaybai commented 4 years ago

Technically that page is correct. The service functionality is now in the SDK, but the logic to support it as part of the Credentials Profile Chain is not. The absence of the latter is the cause of the error above, and adding it is a feature request that we plan to develop.
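The distinction drawn above, between having the functionality available and having it wired into the default chain, can be illustrated with a small model. This is a minimal sketch in Python and is not the SDK's actual implementation: the names resolve_credentials, make_web_identity_provider, and instance_metadata_provider are hypothetical, and AWS_ROLE_ARN is an assumed variable name not mentioned in this thread. It shows a "first provider that succeeds wins" chain, and how a chain that lacks the web identity provider ends up at the instance metadata lookup even when the environment variables are set.

```python
from typing import Callable, Dict, List, Mapping, Optional

Credentials = Dict[str, str]
Provider = Callable[[], Optional[Credentials]]


class NoCredentialsError(Exception):
    """Raised when no provider can supply credentials."""


def resolve_credentials(chain: List[Provider]) -> Credentials:
    """Return credentials from the first provider in the chain that yields any."""
    for provider in chain:
        creds = provider()
        if creds is not None:
            return creds
    raise NoCredentialsError("no provider in the chain returned credentials")


def make_web_identity_provider(env: Mapping[str, str]) -> Provider:
    """Hypothetical provider that applies only when both variables are present."""
    def provider() -> Optional[Credentials]:
        if env.get("AWS_WEB_IDENTITY_TOKEN_FILE") and env.get("AWS_ROLE_ARN"):
            return {"source": "web-identity"}
        return None  # not applicable, let the next provider try
    return provider


def instance_metadata_provider() -> Optional[Credentials]:
    """Stand-in for the metadata lookup that returned 404 in the logs above."""
    raise NoCredentialsError("instance metadata endpoint returned 404")


if __name__ == "__main__":
    env = {
        "AWS_WEB_IDENTITY_TOKEN_FILE": "/path/to/token",
        "AWS_ROLE_ARN": "arn:aws:iam::123456789012:role/demo",
    }
    # Chain with the web identity provider registered first.
    print(resolve_credentials([make_web_identity_provider(env), instance_metadata_provider]))
    # Chain without it: the same environment falls through to instance metadata.
    try:
        resolve_credentials([instance_metadata_provider])
    except NoCredentialsError as error:
        print("failed:", error)
```

In this model the environment is identical in both runs; only the contents of the chain differ, which matches the behaviour reported in this issue.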

JonCubed commented 4 years ago

@klaytaybai thanks for clarifying that the Credentials Profile Chain is not currently supported. It is an important feature for my team.

I do think that this points to an issue with the documentation, then, as @BEvgeniyS rightly points out. The IAM Roles for Service Accounts Technical Overview page says that supported versions of the AWS SDK look for the token in the credential chain provider, while the Using a Supported AWS SDK page states that .NET is supported from version 3.3.580.0. I would expect .NET not to be listed as supported until the credential profile chain support is completed.

klaytaybai commented 4 years ago

Thanks. I agree that those docs need to be clarified until it is fully supported.

bliles commented 4 years ago

Glad to see this is on the roadmap. We can work around this issue using assume-role, but it definitely makes mapping an EKS service account to IAM more complex than the "it-just-works" experience we are hoping for.

jqmichael commented 4 years ago

Please thumbs up this issue if this is needed for your application. This will help the AWS SDK team prioritize it.

kingwill27 commented 4 years ago

@bliles could you expand on how you're working around this with assume-role? I'm particularly wondering if and how you're handling refreshing of the tokens for long-lived clients.

bliles commented 4 years ago

@kingwill27 I initially thought we would work around this limitation in code, but when we discussed it as a team we decided to roll back to kube2iam until this is resolved.
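For readers who do need an interim workaround, the refresh question raised above can be sketched as follows. This is a minimal sketch in Python under stated assumptions, not the approach anyone in this thread used and not the SDK's implementation. The class RefreshingCredentials, the function build_assume_role_url, and the "expires_at" field are hypothetical names. The only AWS-specific part is the STS AssumeRoleWithWebIdentity action, which takes RoleArn, RoleSessionName, and WebIdentityToken parameters and does not need to be signed with existing AWS credentials. The sketch builds the request URL but deliberately does not send it or parse a response.

```python
import time
import urllib.parse
from typing import Callable, Dict, Optional

STS_ENDPOINT = "https://sts.amazonaws.com/"


def build_assume_role_url(role_arn: str, token: str, session_name: str) -> str:
    """Build the query URL for the STS AssumeRoleWithWebIdentity action."""
    query = urllib.parse.urlencode({
        "Action": "AssumeRoleWithWebIdentity",
        "Version": "2011-06-15",
        "RoleArn": role_arn,
        "RoleSessionName": session_name,
        "WebIdentityToken": token,
    })
    return STS_ENDPOINT + "?" + query


class RefreshingCredentials:
    """Cache credentials and fetch new ones shortly before they expire.

    `fetch` is a caller-supplied function (hypothetical) that returns a dict
    containing at least "expires_at" as epoch seconds. A real fetch would
    re-read the token file on every call, since the mounted token may have
    been rotated, and then call STS.
    """

    def __init__(
        self,
        fetch: Callable[[], Dict[str, object]],
        refresh_margin_seconds: float = 300.0,
        clock: Callable[[], float] = time.time,
    ) -> None:
        self._fetch = fetch
        self._margin = refresh_margin_seconds
        self._clock = clock
        self._cached: Optional[Dict[str, object]] = None

    def get(self) -> Dict[str, object]:
        """Return cached credentials, refreshing if missing or close to expiry."""
        now = self._clock()
        if self._cached is None or float(self._cached["expires_at"]) - now <= self._margin:
            self._cached = self._fetch()
        return self._cached
```

A long-lived client would call get() before each request rather than holding on to one set of credentials for its whole lifetime. Thread safety and error handling are left out to keep the sketch short.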

munnja001 commented 4 years ago

Any update on when full support for AWS_WEB_IDENTITY_TOKEN_FILE in the credential provider might be added? If not, could you provide more details on which services need to be used to support this in the interim?

ghost commented 4 years ago

I just tried the latest pre-release. I'm getting a null argument exception now instead of the 500 HTTP error. Does that mean work is being done on this?

    <PackageReference Include="AWSSDK.Core" Version="3.3.103.66" />
    <PackageReference Include="AWSSDK.S3" Version="3.3.107.2" />

coryflucas commented 4 years ago

https://github.com/aws/aws-sdk-net/commit/bb5f9d3f2304a625718b3bd2a97912a9f08df518 adds support for this. Unfortunately, in our experience it's currently broken due to https://github.com/aws/aws-sdk-net/issues/1493.

klaytaybai commented 4 years ago

Now that #1493 is fixed, we'll close this issue in a few days unless anybody has more issues with it.

markrendle commented 4 years ago

Where is the documentation on using EKS IAM in .NET Core applications?

cc: @klaytaybai @normj

coryflucas commented 4 years ago

@markrendle the documentation for the Kubernetes setup is here: https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts.html. The documentation here: https://docs.aws.amazon.com/sdk-for-net/v2/developer-guide/net-dg-config-creds.html#creds-assign needs an update now that the new credentials provider is in the chain. You should be able to just create a new SDK client and have it work without specifying credentials.

JonCubed commented 4 years ago

Thanks @klaytaybai, I can confirm this is working for us now.