aws / aws-sdk-net

The official AWS SDK for .NET. For more information on the AWS SDK for .NET, see our web site:
http://aws.amazon.com/sdkfornet/
Apache License 2.0

Sporadic IndexOutOfRangeException when publishing to an AWS queue after updating to .NET 6 from .NET Core 3.1. #2075

Open jantranikianTAG opened 2 years ago

jantranikianTAG commented 2 years ago

Describe the bug

After migrating to .NET 6, we started seeing this error from time to time when publishing a message to our AWS queue using the AmazonSimpleNotificationServiceClient SDK. Could it be because we are calling this in a loop? We didn't see any issues like this before the .NET update.

The request makes it to our queue but it fails trying to read the response.

Expected Behavior

Should be able to publish and get the response back successfully

Current Behavior

System.IndexOutOfRangeException: Index was outside the bounds of the array.
     at System.Net.Http.Headers.HttpHeaders.ReadStoreValues[T](Span`1 values, Object storeValue, HttpHeaderParser parser, Int32& currentIndex)
     at System.Net.Http.Headers.HttpHeaders.GetStoreValuesAsStringOrStringArray(HeaderDescriptor descriptor, Object sourceValues, String& singleValue, String[]& multiValue)
     at System.Net.Http.Headers.HttpHeaders.GetEnumeratorCore()+MoveNext()
     at Amazon.Runtime.Internal.Transform.HttpClientResponseData.CopyHeaderValues(HttpResponseMessage response)
     at Amazon.Runtime.Internal.Transform.HttpClientResponseData..ctor(HttpResponseMessage response, HttpClient httpClient, Boolean disposeClient)
     at Amazon.Runtime.HttpWebRequestMessage.GetResponseAsync(CancellationToken cancellationToken)
     at Amazon.Runtime.Internal.HttpHandler`1.InvokeAsync[T](IExecutionContext executionContext)
     at Amazon.Runtime.Internal.Unmarshaller.InvokeAsync[T](IExecutionContext executionContext)
     at Amazon.Runtime.Internal.ErrorHandler.InvokeAsync[T](IExecutionContext executionContext)
     at Amazon.Runtime.Internal.ErrorHandler.InvokeAsync[T](IExecutionContext executionContext)
     at Amazon.Runtime.Internal.CallbackHandler.InvokeAsync[T](IExecutionContext executionContext)
     at Amazon.Runtime.Internal.EndpointDiscoveryHandler.InvokeAsync[T](IExecutionContext executionContext)
     at Amazon.Runtime.Internal.EndpointDiscoveryHandler.InvokeAsync[T](IExecutionContext executionContext)
     at Amazon.Runtime.Internal.CredentialsRetriever.InvokeAsync[T](IExecutionContext executionContext)
     at Amazon.Runtime.Internal.RetryHandler.InvokeAsync[T](IExecutionContext executionContext)
     at Amazon.Runtime.Internal.RetryHandler.InvokeAsync[T](IExecutionContext executionContext)
     at Amazon.Runtime.Internal.CallbackHandler.InvokeAsync[T](IExecutionContext executionContext)
     at Amazon.Runtime.Internal.CallbackHandler.InvokeAsync[T](IExecutionContext executionContext)
     at Amazon.Runtime.Internal.ErrorCallbackHandler.InvokeAsync[T](IExecutionContext executionContext)
     at Amazon.Runtime.Internal.MetricsHandler.InvokeAsync[T](IExecutionContext executionContext)

Reproduction Steps

foreach (var request in requests)
{
      var response = await _snsClient.PublishAsync(request);
}

Possible Solution

No response

Additional Information/Context

No response

AWS .NET SDK and/or Package version used

<PackageReference Include="AWSSDK.Core" Version="3.7.12.5" />
<PackageReference Include="AWSSDK.Extensions.NETCore.Setup" Version="3.7.2" />
<PackageReference Include="AWSSDK.SimpleNotificationService" Version="3.7.3.77" />
<PackageReference Include="AWSSDK.SQS" Version="3.7.2.74" />

Targeted .NET Platform

.NET 6

Operating System and version

Windows 10

ashishdhingra commented 2 years ago

This could be related to https://github.com/aws/aws-sdk-net/issues/1940, a known issue in .NET 6 where the System.Net.Http.Headers.HttpHeaders collection is no longer thread safe. It was tracked in https://github.com/dotnet/runtime/issues/61798. PR https://github.com/dotnet/runtime/pull/68115 was merged to fix it, but the PR says the fix would ship in preview 5 of .NET 7 and mentions no plans to back-port it to .NET 6.

Maybe something could be implemented in the SDK to make reading HttpHeaders thread safe if the fix has not yet been ported to .NET 6.

@jantranikianTAG What is the .NET 6 version you are using? Could you try upgrading to latest .NET 6 version and see if it resolves the issue?
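
The exact patch level matters here, and it can be read from inside the running process. A minimal sketch, assuming it is run in the affected application or in a console app on the same host; it uses only the standard Environment.Version and RuntimeInformation.FrameworkDescription properties:

```csharp
using System;
using System.Runtime.InteropServices;

// Prints the runtime this process is actually running on (for example a 6.0.x patch level),
// which can differ from the SDK version installed on the build machine.
Console.WriteLine($"Environment.Version:  {Environment.Version}");
Console.WriteLine($"FrameworkDescription: {RuntimeInformation.FrameworkDescription}");
```

Running dotnet --list-runtimes on the host lists the installed runtimes, which helps confirm whether a newer .NET 6 patch is available to the application.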

jantranikianTAG commented 2 years ago

@ashishdhingra thanks for your response. Looked at the PR you mentioned and it seems like it should fix the issue. I'll make sure we are on the latest version of .NET 6, but from what I've seen it looks like .NET 7 should fix this. So we'll probably have to wait for that unless there is a patch release for .NET 6.

In the meantime we can look for and catch this specific error since I know the stack trace always ends with:

System.IndexOutOfRangeException: Index was outside the bounds of the array. at System.Net.Http.Headers.HttpHeaders.ReadStoreValues

Not ideal, but this error occurs when trying to read the HTTP response, so I know the request made it to the queue. In my case I don't really need any information from the response. But it might be worth exploring whether there is a way to handle this on the aws-sdk side. Other developers might experience similar issues. Just a thought. Again, I appreciate your response.

ashishdhingra commented 2 years ago

@jantranikianTAG Thanks for your response. This needs to be reviewed with the team for the workaround.

jantranikianTAG commented 2 years ago

@ashishdhingra

Just saw another error related to ReadStoreValues. This time, however, it was a NullReferenceException instead of an IndexOutOfRangeException. In the earlier example I didn't need the response for anything, but here I need to check the status code on the response. Any ideas on how I should proceed? We are pretty blocked on this right now. Thanks.

We are calling the GetTopicAttributesAsync method on the AmazonSimpleNotificationServiceClient sdk and we see the error below.

var response = await _sns.GetTopicAttributesAsync(_snsTopicArn);
var isSuccess = response.HttpStatusCode == HttpStatusCode.OK;

Object reference not set to an instance of an object.
Stack Trace:
   at System.Net.Http.Headers.HttpHeaders.ReadStoreValues[T](Span`1 values, Object storeValue, HttpHeaderParser parser, Int32& currentIndex)
   at System.Net.Http.Headers.HttpHeaders.GetStoreValuesAsStringOrStringArray(HeaderDescriptor descriptor, Object sourceValues, String& singleValue, String[]& multiValue)
   at Amazon.Runtime.Internal.Transform.HttpClientResponseData.CopyHeaderValues(HttpResponseMessage response)
   at Amazon.Runtime.Internal.Transform.HttpClientResponseData..ctor(HttpResponseMessage response, HttpClient httpClient, Boolean disposeClient)
   at Amazon.Runtime.HttpWebRequestMessage.GetResponseAsync(CancellationToken cancellationToken)
   at Amazon.Runtime.Internal.HttpHandler`1.InvokeAsync[T](IExecutionContext executionContext)
   at Amazon.Runtime.Internal.Unmarshaller.InvokeAsync[T](IExecutionContext executionContext)
   at Amazon.Runtime.Internal.ErrorHandler.InvokeAsync[T](IExecutionContext executionContext)
   at Amazon.Runtime.Internal.ErrorHandler.InvokeAsync[T](IExecutionContext executionContext)
   at Amazon.Runtime.Internal.CallbackHandler.InvokeAsync[T](IExecutionContext executionContext)
   at Amazon.Runtime.Internal.EndpointDiscoveryHandler.InvokeAsync[T](IExecutionContext executionContext)
   at Amazon.Runtime.Internal.EndpointDiscoveryHandler.InvokeAsync[T](IExecutionContext executionContext)
   at Amazon.Runtime.Internal.CredentialsRetriever.InvokeAsync[T](IExecutionContext executionContext)
   at Amazon.Runtime.Internal.RetryHandler.InvokeAsync[T](IExecutionContext executionContext)
   at Amazon.Runtime.Internal.RetryHandler.InvokeAsync[T](IExecutionContext executionContext)
   at Amazon.Runtime.Internal.CallbackHandler.InvokeAsync[T](IExecutionContext executionContext)
   at Amazon.Runtime.Internal.CallbackHandler.InvokeAsync[T](IExecutionContext executionContext)
   at Amazon.Runtime.Internal.ErrorCallbackHandler.InvokeAsync[T](IExecutionContext executionContext)
   at Amazon.Runtime.Internal.MetricsHandler.InvokeAsync[T](IExecutionContext executionContext)
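
Because GetTopicAttributesAsync only reads data, one possible stopgap is to retry the call when the exception comes from ReadStoreValues. This is an assumption on my part, not something the SDK provides or that was confirmed in this thread. A minimal sketch; HeaderRaceRetry and ExecuteAsync are hypothetical names:

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper, not part of the AWS SDK.
public static class HeaderRaceRetry
{
    // True when the exception was thrown while reading HTTP header values.
    private static bool IsReadStoreValuesError(Exception e) =>
        e.StackTrace?.Contains("System.Net.Http.Headers.HttpHeaders.ReadStoreValues") ?? false;

    // Runs the operation, retrying only on the two exception types reported in this thread
    // and only when they originate in ReadStoreValues. The last attempt lets the exception propagate.
    public static async Task<T> ExecuteAsync<T>(Func<Task<T>> operation, int maxAttempts = 3)
    {
        for (var attempt = 1; ; attempt++)
        {
            try
            {
                return await operation();
            }
            catch (Exception e) when (
                attempt < maxAttempts &&
                (e is IndexOutOfRangeException or NullReferenceException) &&
                IsReadStoreValuesError(e))
            {
                // Fall through to the next loop iteration and try again.
            }
        }
    }
}
```

With this in place the call above would become var response = await HeaderRaceRetry.ExecuteAsync(() => _sns.GetTopicAttributesAsync(_snsTopicArn));. It is probably not suitable for PublishAsync, since the failure happens after the request has been sent and a retry would publish the message a second time.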

jantranikianTAG commented 2 years ago

@ashishdhingra Any update on this? We have a workaround in place, so we aren't blocked by anything now. But it would be nice to remove the workaround, since it is only a temporary fix.

RandyCui commented 2 years ago

@jantranikianTAG Could you tell me your workaround? thanks!

jantranikianTAG commented 2 years ago

@RandyCui Here is my workaround. I basically check if the error originated in the ReadStoreValues method.

// Returns true when the exception was thrown from HttpHeaders.ReadStoreValues,
// i.e. while the response headers were being read.
private static bool IsReadStoreValuesError(Exception e)
{
    return e.StackTrace?.Contains("at System.Net.Http.Headers.HttpHeaders.ReadStoreValues") ?? false;
}

public async Task PushToSnsTopic(object payload)
{
    var request = new PublishRequest
    {
        Message = JsonConvert.SerializeObject(payload),
        TopicArn = _snsTopicArn
    };

    try
    {
        await _sns.PublishAsync(request);
    }
    catch (Exception e) when (
        (e is IndexOutOfRangeException or NullReferenceException) && IsReadStoreValuesError(e))
    {
        // The failure happened while reading the response, after the request was sent,
        // so it is swallowed here. Any other exception does not match the filter and propagates.
    }
}

ashishdhingra commented 2 years ago

@jantranikianTAG Somehow I'm unable to reproduce the issue using the below minimal code:

using Amazon.SimpleNotificationService;
using Amazon.SimpleNotificationService.Model;

string topicArn = "<<set Topic ARN here>>";
string message = "Testing SNS #";
var _snsClient = new AmazonSimpleNotificationServiceClient();

// Nine requests with messages "Testing SNS #1" through "Testing SNS #9".
List<PublishRequest> requests = Enumerable.Range(1, 9)
    .Select(i => new PublishRequest { TopicArn = topicArn, Message = message + i })
    .ToList();

// Start every publish without awaiting it, so the calls run concurrently on one client.
List<Task<PublishResponse>> tasks = new List<Task<PublishResponse>>();
foreach (var request in requests)
{
    tasks.Add(_snsClient.PublishAsync(request));
}

// Block until all publishes have completed.
Task.WaitAll(tasks.ToArray());

Console.ReadLine();

Is this still an issue? Does it occur consistently?

Thanks, Ashish

github-actions[bot] commented 2 years ago

This issue has not received a response in 5 days. If you want to keep this issue open, please just leave a comment below and auto-close will be canceled.

jantranikianTAG commented 2 years ago

Hmm. Yeah we are still seeing it. By nature it is not 100% reproducible. But we get a handful of these a day. I think the test you have looks okay.

Since ours is a web application we have multiple users potentially hitting the same queue at once. Maybe take what you have and wrap it in a controller and call that controller a few different times.
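
The suggestion above can be approximated without a web project by letting several tasks share one client, each publishing in its own loop the way concurrent web requests would. A rough sketch only: the user and message counts are arbitrary assumptions, and nothing in this thread establishes that it actually triggers the exception.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;
using Amazon.SimpleNotificationService;
using Amazon.SimpleNotificationService.Model;

string topicArn = "<<set Topic ARN here>>";
var snsClient = new AmazonSimpleNotificationServiceClient();

// Each task stands in for one web request that publishes in a loop.
// All tasks share the same client instance, as a singleton-registered client would be.
var simulatedUsers = Enumerable.Range(1, 8)
    .Select(user => Task.Run(async () =>
    {
        for (var i = 1; i <= 50; i++)
        {
            try
            {
                await snsClient.PublishAsync(new PublishRequest
                {
                    TopicArn = topicArn,
                    Message = $"user {user}, message {i}"
                });
            }
            catch (Exception e)
            {
                // Log and keep going so that sporadic failures are counted rather than fatal.
                Console.WriteLine($"user {user}, message {i}: {e.GetType().Name}: {e.Message}");
            }
        }
    }))
    .ToList();

await Task.WhenAll(simulatedUsers);
```

Since we only see a handful of these errors a day, a run like this may well finish without a single failure.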

ik130 commented 2 years ago

Can you specify which version of .NET 6 you are using?

luiscnsousa commented 2 years ago

For anyone suddenly experiencing new errors on AWS SDK when retargeting an application from .NET Core 3.1 to .NET 6, check if you're using an older version of the NewRelic agent on your application.

I had a similar issue with a container based .NET 6 Web API that started throwing the following exceptions as soon as it was retargeted to the newer version of the framework:

In my case the application was throwing these exceptions when calling PutObjectAsync of IAmazonS3, but I've seen similar reports for other AWS services as well (SQS, DynamoDB, etc.). The stack trace is always identical.

The docker image for my application was changed into this:

ARG newrelicversion=10.2.0

# Install New Relic .NET agent
RUN apt-get update && apt-get install -y wget ca-certificates gnupg \
    && echo 'deb http://apt.newrelic.com/debian/ newrelic non-free' | tee /etc/apt/sources.list.d/newrelic.list \
    && wget https://download.newrelic.com/548C16BF.gpg \
    && apt-key add 548C16BF.gpg \
    && apt-get update \
    && apt-get install -y newrelic-dotnet-agent=$newrelicversion \
    && rm -rf /var/lib/apt/lists/*

# Enable New Relic .NET agent
ENV CORECLR_ENABLE_PROFILING=1 \
    CORECLR_PROFILER={36032161-FFC0-4B61-B559-F6C5D41BAE5A} \
    CORECLR_NEWRELIC_HOME=/usr/local/newrelic-dotnet-agent \
    CORECLR_PROFILER_PATH=/usr/local/newrelic-dotnet-agent/libNewRelicProfiler.so \
    NEW_RELIC_LICENSE_KEY=FEED_ME_FROM_CD \
    NEW_RELIC_APP_NAME=FEED_ME_FROM_CDa

Previously it was using version 8.37.0 instead of 10.2.0 (the latest at the time), and the package was the older newrelic-netcore20-agent instead of the more recent newrelic-dotnet-agent. I imagine that this issue was fixed somewhere between versions 8.37.0 and 10.2.0.

It is particularly tricky to pinpoint the New Relic agent as the root cause of the problem because the integration with New Relic is not affected in any way. The impact of running a .NET 6 application with an older New Relic agent could only be seen on HTTP calls to AWS in the form of exceptions, and even those point to the dotnet runtime, so there's a lot of indirection here.
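
Since the agent is easy to overlook, a quick way to see whether a profiler is attached to a given process is to print the variables set in the Dockerfile above. A minimal sketch, assuming the agent is enabled through these environment variables; it does not report the agent version, only whether and from where a profiler is loaded:

```csharp
using System;

// Variable names taken from the Dockerfile above.
string[] names =
{
    "CORECLR_ENABLE_PROFILING",
    "CORECLR_PROFILER",
    "CORECLR_PROFILER_PATH",
    "CORECLR_NEWRELIC_HOME"
};

foreach (var name in names)
{
    // GetEnvironmentVariable returns null when the variable is not set.
    var value = Environment.GetEnvironmentVariable(name);
    var shown = value ?? "(not set)";
    Console.WriteLine($"{name} = {shown}");
}
```

If CORECLR_ENABLE_PROFILING is 1, the installed agent package and its version are worth checking before digging further into the SDK.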

jantranikianTAG commented 2 years ago

Can you specify which version of .NET 6 you are using?

6.0.6

github-actions[bot] commented 2 years ago

⚠️COMMENT VISIBILITY WARNING⚠️

Comments on closed issues are hard for our team to see. If you need more assistance, please either tag a team member or open a new issue that references this one. If you wish to keep having a conversation with other community members under this issue feel free to do so.

nguyentoantuit commented 1 year ago

@luiscnsousa We have the same issue. Do you have a workaround? Or is it resolved automatically if we upgrade New Relic to the latest version?

Xriuk commented 9 months ago

Just adding that we are seeing a similar exception coming from AmazonSecurityTokenServiceClient, but this time the exception is:

System.InvalidOperationException: Collection was modified; enumeration operation may not execute.
   at System.Collections.Generic.List`1.Enumerator.MoveNext()
   at System.Net.Http.Headers.HttpHeaders.ReadStoreValues[T](Span`1 values, Object storeValue, HttpHeaderParser parser, Int32& currentIndex)
   at System.Net.Http.Headers.HttpHeaders.GetStoreValuesAsStringOrStringArray(HeaderDescriptor descriptor, Object sourceValues, String& singleValue, String[]& multiValue)
   at System.Net.Http.Headers.HttpHeaders.GetStoreValuesAsStringArray(HeaderDescriptor descriptor, HeaderStoreItemInfo info)
   at System.Net.Http.Headers.HttpHeaders.TryGetValues(HeaderDescriptor descriptor, IEnumerable`1& values)
   at System.Net.Http.HttpConnectionPool.EstablishProxyTunnelAsync(Boolean async, HttpRequestHeaders headers, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.ConnectAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.CreateHttp11ConnectionAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.AddHttp11ConnectionAsync(HttpRequestMessage request)
   at System.Threading.Tasks.TaskCompletionSourceWithCancellation`1.WaitWithCancellationAsync(CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.GetHttp11ConnectionAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.SendWithVersionDetectionAndRetryAsync(HttpRequestMessage request, Boolean async, Boolean doRequestAuth, CancellationToken cancellationToken)
   at System.Net.Http.RedirectHandler.SendAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
   at System.Net.Http.HttpClient.<SendAsync>g__Core|83_0(HttpRequestMessage request, HttpCompletionOption completionOption, CancellationTokenSource cts, Boolean disposeCts, CancellationTokenSource pendingRequestsCts, CancellationToken originalCancellationToken)
   at Amazon.Runtime.HttpWebRequestMessage.GetResponseAsync(CancellationToken cancellationToken)
   at Amazon.Runtime.Internal.HttpHandler`1.InvokeAsync[T](IExecutionContext executionContext)
   at Amazon.Runtime.Internal.Unmarshaller.InvokeAsync[T](IExecutionContext executionContext)
   at Amazon.Runtime.Internal.ErrorHandler.InvokeAsync[T](IExecutionContext executionContext)
   at Amazon.Runtime.Internal.ErrorHandler.InvokeAsync[T](IExecutionContext executionContext)
   at Amazon.Runtime.Internal.CallbackHandler.InvokeAsync[T](IExecutionContext executionContext)
   at Amazon.Runtime.Internal.Signer.InvokeAsync[T](IExecutionContext executionContext)
   at Amazon.Runtime.Internal.EndpointDiscoveryHandler.InvokeAsync[T](IExecutionContext executionContext)
   at Amazon.Runtime.Internal.EndpointDiscoveryHandler.InvokeAsync[T](IExecutionContext executionContext)
   at Amazon.Runtime.Internal.CredentialsRetriever.InvokeAsync[T](IExecutionContext executionContext)
   at Amazon.Runtime.Internal.RetryHandler.InvokeAsync[T](IExecutionContext executionContext)
   at Amazon.Runtime.Internal.RetryHandler.InvokeAsync[T](IExecutionContext executionContext)
   at Amazon.Runtime.Internal.CallbackHandler.InvokeAsync[T](IExecutionContext executionContext)
   at Amazon.Runtime.Internal.CallbackHandler.InvokeAsync[T](IExecutionContext executionContext)
   at Amazon.Runtime.Internal.ErrorCallbackHandler.InvokeAsync[T](IExecutionContext executionContext)
   at Amazon.Runtime.Internal.MetricsHandler.InvokeAsync[T](IExecutionContext executionContext)

ashishdhingra commented 3 weeks ago

@nguyentoantuit Good afternoon. Please advise if you are still getting the IndexOutOfRangeException with the latest AWS SDK and .NET version. As mentioned in https://github.com/aws/aws-sdk-net/issues/2075#issuecomment-1182595473, PR https://github.com/dotnet/runtime/pull/68115 was merged to fix this for .NET 7, though I'm unsure whether it was back-ported to .NET 6.