AzureAD / microsoft-identity-web

Helps creating protected web apps and web APIs with Microsoft identity platform and Azure AD B2C
MIT License
676 stars 209 forks source link

Critical: InvalidOperationException: Operations that change non-concurrent collections must have exclusive access. A concurrent update was performed on this collection and corrupted its state. The collection's state is no longer correct #2485

Open AndreErb opened 1 year ago

AndreErb commented 1 year ago

Microsoft.Identity.Web Library

Microsoft.Identity.Web

Microsoft.Identity.Web version

2.13.4

Web app

Sign-in users and call web APIs

Web API

Protected web APIs call downstream web APIs

Token cache serialization

In-memory caches

Description

The DI setup of "Microsoft.Identity.Web" seems to have multiple / different race-condition problems, which show up in different flavours (different kind of Exceptions).

This is critical as it occurs on a regular basis, and the APIs are non-functional until being restarted actively (or by sleep/awake).

There are several issues on Github which are or seem to be related to this problem in general (APIs and Web Apps), however they are closed (as fixed) or with no reaction on further questions/comments:

Open: https://github.com/AzureAD/microsoft-identity-web/issues/2328 https://github.com/AzureAD/microsoft-identity-web/issues/2456

Closed: https://github.com/AzureAD/microsoft-identity-web/issues/1215 https://github.com/AzureAD/microsoft-identity-web/issues/1995 https://github.com/Azure-Samples/active-directory-aspnetcore-webapp-openidconnect-v2/issues/664 https://github.com/AzureAD/microsoft-identity-web/issues/1957 https://github.com/AzureAD/microsoft-identity-web/issues/2456 https://github.com/microsoftgraph/msgraph-sdk-java/issues/977

Sometimes, the bug is referred as a regression in .NET 7. It's also being said that "..I believe the next runtime patch of .NET 7 should fix it.". It remains unclear if it should be fixed by the "Runtime Installation of .NET 7" or by a "NuGet Package" (e.g. Microsoft.Extensions.Options 7.0.1 ???) as it seems like "Microsoft.Identity.Web V2.13.4" is using "Microsoft.Extensions.Options V7.0.0", under the hood. Additionally, there seems to be a Fix for Identity.Web, introducing the usage of a "ConcurrentDictionary".

However, the log shows, that on the "PostConfigure"-Event "Microsoft.Identity.Web.MergedOptions.UpdateMergedOptionsFromMicrosoftIdentityOptions" calls Add() on some ordinary "Hashset" (no concurrency here?).

System.InvalidOperationException: Operations that change non-concurrent collections must have exclusive access. A concurrent update was performed on this collection and corrupted its state. The collection's state is no longer correct.
   at System.Collections.Generic.HashSet`1.AddIfNotPresent(T value, Int32& location)
   at System.Collections.Generic.HashSet`1.System.Collections.Generic.ICollection<T>.Add(T item)
   at Microsoft.Identity.Web.MergedOptions.UpdateMergedOptionsFromMicrosoftIdentityOptions(MicrosoftIdentityOptions microsoftIdentityOptions, MergedOptions mergedOptions)
   at Microsoft.Identity.Web.MicrosoftIdentityOptionsMerger.PostConfigure(String name, MicrosoftIdentityOptions options)

I am also wondering, if it is really enough to make the Getting/Adding of the MergedOptions thread-safe. Can't there be a race-condition when subscribing to the events, too? Several other issues, seem to see multiple subscriptions, which lead to errors like "The authorization code has already reedemed..". Anyway, I am not sure on this.

Reproduction steps

.NET 6 API

Our APIs are hosted in IIS and will sleep after a time of no action. If multiple calls arrive simultaneously at the API, it is being started and usual setup (DI) takes place.

Error message

2023-09-28T08:14:46.6515467+02:00 [Thread:] [ERR] (Microsoft.AspNetCore.Server.IIS.Core.IISHttpServer) Connection ID ""15420333921820215259"", Request ID ""800007dc-0800-d600-b63f-84710c7967bb"": An unhandled exception was thrown by the application.
System.InvalidOperationException: Operations that change non-concurrent collections must have exclusive access. A concurrent update was performed on this collection and corrupted its state. The collection's state is no longer correct.
   at System.Collections.Generic.HashSet`1.AddIfNotPresent(T value, Int32& location)
   at System.Collections.Generic.HashSet`1.System.Collections.Generic.ICollection<T>.Add(T item)
   at Microsoft.Identity.Web.MergedOptions.UpdateMergedOptionsFromMicrosoftIdentityOptions(MicrosoftIdentityOptions microsoftIdentityOptions, MergedOptions mergedOptions)
   at Microsoft.Identity.Web.MicrosoftIdentityOptionsMerger.PostConfigure(String name, MicrosoftIdentityOptions options)
   at Microsoft.Extensions.Options.OptionsFactory`1.Create(String name)
   at Microsoft.Extensions.Options.OptionsMonitor`1.<>c.<Get>b__10_0(String name, IOptionsFactory`1 factory)
   at Microsoft.Extensions.Options.OptionsCache`1.<>c__3`1.<GetOrAdd>b__3_0(String name, ValueTuple`2 arg)
   at System.Collections.Concurrent.ConcurrentDictionary`2.GetOrAdd[TArg](TKey key, Func`3 valueFactory, TArg factoryArgument)
   at Microsoft.Extensions.Options.OptionsCache`1.GetOrAdd[TArg](String name, Func`3 createOptions, TArg factoryArgument)
   at Microsoft.Extensions.Options.OptionsMonitor`1.Get(String name)
   at Microsoft.Identity.Web.MicrosoftIdentityWebApiAuthenticationBuilderExtensions.<>c__DisplayClass3_0.<AddMicrosoftIdentityWebApiImplementation>b__2(JwtBearerOptions options, IServiceProvider serviceProvider, IMergedOptionsStore mergedOptionsMonitor, IOptionsMonitor`1 msIdOptionsMonitor)
   at Microsoft.Extensions.Options.ConfigureNamedOptions`4.Configure(String name, TOptions options)
   at Microsoft.Extensions.Options.OptionsFactory`1.Create(String name)
   at Microsoft.Extensions.Options.OptionsMonitor`1.<>c.<Get>b__10_0(String name, IOptionsFactory`1 factory)
   at Microsoft.Extensions.Options.OptionsCache`1.<>c__3`1.<GetOrAdd>b__3_0(String name, ValueTuple`2 arg)
   at System.Collections.Concurrent.ConcurrentDictionary`2.GetOrAdd[TArg](TKey key, Func`3 valueFactory, TArg factoryArgument)
   at Microsoft.Extensions.Options.OptionsCache`1.GetOrAdd[TArg](String name, Func`3 createOptions, TArg factoryArgument)
   at Microsoft.Extensions.Options.OptionsMonitor`1.Get(String name)
   at Microsoft.AspNetCore.Authentication.AuthenticationHandler`1.InitializeAsync(AuthenticationScheme scheme, HttpContext context)
   at Microsoft.AspNetCore.Authentication.AuthenticationHandlerProvider.GetHandlerAsync(HttpContext context, String authenticationScheme)
   at Microsoft.AspNetCore.Authentication.AuthenticationService.AuthenticateAsync(HttpContext context, String scheme)
   at Microsoft.AspNetCore.Authentication.AuthenticationMiddleware.Invoke(HttpContext context)
   at Microsoft.AspNetCore.Routing.EndpointRoutingMiddleware.<Invoke>g__AwaitMatcher|8_0(EndpointRoutingMiddleware middleware, HttpContext httpContext, Task`1 matcherTask)
   at Swashbuckle.AspNetCore.SwaggerUI.SwaggerUIMiddleware.Invoke(HttpContext httpContext)
   at Swashbuckle.AspNetCore.Swagger.SwaggerMiddleware.Invoke(HttpContext httpContext, ISwaggerProvider swaggerProvider)
   at Microsoft.AspNetCore.Server.IIS.Core.IISHttpContextOfT`1.ProcessRequestAsync()

Id Web logs

No response

Relevant code snippets

`services.AddMicrosoftIdentityWebApiAuthentication(builder.Configuration);`

Regression

No response

Expected behavior

The API should start and Microsoft.Identity.Web setup should be done without any Race Condition or Multithreading problems.

jmprieur commented 1 year ago

Thanks @AndreErb for taking the time to look at previous issues.

There are several issues.

  1. There was a multi-threading issue in .NET 7.0.0 runtime, which was fixed in .NET 7.0.1 runtime in February. This is https://github.com/AzureAD/microsoft-identity-web/issues/1995
  2. AddMicrosoftIdentityWebApp has several overloads. The one with a delegate had an issue that the delegate was called twice. I believe that we just fixed it: https://github.com/AzureAD/microsoft-identity-web/pull/2489. I would think that this will fix your BlazorServer app problem. If it does not we'd need to see the initialization code.
  3. You seem to also issues with web APIs under stress. We have been running stress tests on Microsoft.Identity.Web web APIs for a few hours but still don't see the issue. How do you call MicrosoftIdentityWebApi?
AndreErb commented 1 year ago

@jmprieur

On 2.)

  1. AddMicrosoftIdentityWebApp has several overloads. The one with a delegate had an issue that the delegate was called twice. I believe that we just fixed it: https://github.com/AzureAD/microsoft-identity-web/pull/2489. I would think that this will fix your BlazorServer app problem. If it does not we'd need to see the initialization code.

This hasn't been released, yet? Correct? Or which version of "Microsoft.Identity.Web" should we try, in order to test if it fixes the problem?

Just in case, the initialization code of our Blazor Apps helps you anyway. Here it is:

var initialScopes = builder.Configuration["MicrosoftGraph:Scopes"]?.Split(' ');
services.AddAuthentication(OpenIdConnectDefaults.AuthenticationScheme)
    .AddMicrosoftIdentityWebApp(builder.Configuration.GetSection("AzureAd"))
    .EnableTokenAcquisitionToCallDownstreamApi(initialScopes)
    .AddMicrosoftGraph(builder.Configuration.GetSection("MicrosoftGraph"))
    .AddDownstreamApi(KobaService.KobaApiService, builder.Configuration.GetSection("KobaApi"))
    .AddInMemoryTokenCaches();

On 3.)

  1. You seem to also issues with web APIs under stress. We have been running stress tests on Microsoft.Identity.Web web APIs for a few hours but still don't see the issue. How do you call MicrosoftIdentityWebApi?

No I don't have the feeling, that it happens under heavy load (under stress). I am pretty sure, it happens when the API is woken up, by several callers (2+), at the same time. Our APIs are hosted in IIS and therefore they are shut down on the "Idle Timeout" (default 20 Minutes) or after "Recycle".

Could it make sense to re-open #1957 ?

Here's a log, which shows a typical symptom. The API goes to sleep at 08:10 and is being called/woken up at 08:14. The InvalidOperationException shows up 2 times, milliseconds away from each other

2023-09-28T08:10:08.2103100+02:00 [Thread:] [INF] (Microsoft.Hosting.Lifetime) Application is shutting down...
2023-09-28T08:14:46.1206990+02:00 [Thread:] [WRN] (Microsoft.AspNetCore.DataProtection.Repositories.EphemeralXmlRepository) Using an in-memory repository. Keys will not be persisted to storage.
2023-09-28T08:14:46.1999458+02:00 [Thread:] [WRN] (Microsoft.AspNetCore.DataProtection.KeyManagement.XmlKeyManager) Neither user profile nor HKLM registry available. Using an ephemeral key repository. Protected data will be unavailable when application exits.
2023-09-28T08:14:46.2192273+02:00 [Thread:] [WRN] (Microsoft.AspNetCore.DataProtection.KeyManagement.XmlKeyManager) No XML encryptor configured. Key {9c83a77f-bcd0-4b75-98ca-b5fe550e228a} may be persisted to storage in unencrypted form.
2023-09-28T08:14:46.4853706+02:00 [Thread:] [INF] (Microsoft.Hosting.Lifetime) Application started. Press Ctrl+C to shut down.
2023-09-28T08:14:46.4857979+02:00 [Thread:] [INF] (Microsoft.Hosting.Lifetime) Hosting environment: "Production"
2023-09-28T08:14:46.4859037+02:00 [Thread:] [INF] (Microsoft.Hosting.Lifetime) Content root path: "C:\Apis\CustomerApi\"
2023-09-28T08:14:46.6515515+02:00 [Thread:] [ERR] (Microsoft.AspNetCore.Server.IIS.Core.IISHttpServer) Connection ID ""8502804919948935322"", Request ID ""8000009b-0806-7600-b63f-84710c7967bb"": An unhandled exception was thrown by the application.
System.InvalidOperationException: Operations that change non-concurrent collections must have exclusive access. A concurrent update was performed on this collection and corrupted its state. The collection's state is no longer correct.
   at System.Collections.Generic.HashSet`1.AddIfNotPresent(T value, Int32& location)
   at System.Collections.Generic.HashSet`1.System.Collections.Generic.ICollection<T>.Add(T item)
   at Microsoft.Identity.Web.MergedOptions.UpdateMergedOptionsFromMicrosoftIdentityOptions(MicrosoftIdentityOptions microsoftIdentityOptions, MergedOptions mergedOptions)
   at Microsoft.Identity.Web.MicrosoftIdentityOptionsMerger.PostConfigure(String name, MicrosoftIdentityOptions options)
   at Microsoft.Extensions.Options.OptionsFactory`1.Create(String name)
   at Microsoft.Extensions.Options.OptionsMonitor`1.<>c.<Get>b__10_0(String name, IOptionsFactory`1 factory)
   at Microsoft.Extensions.Options.OptionsCache`1.<>c__3`1.<GetOrAdd>b__3_0(String name, ValueTuple`2 arg)
   at System.Collections.Concurrent.ConcurrentDictionary`2.GetOrAdd[TArg](TKey key, Func`3 valueFactory, TArg factoryArgument)
   at Microsoft.Extensions.Options.OptionsCache`1.GetOrAdd[TArg](String name, Func`3 createOptions, TArg factoryArgument)
   at Microsoft.Extensions.Options.OptionsMonitor`1.Get(String name)
   at Microsoft.Identity.Web.MicrosoftIdentityWebApiAuthenticationBuilderExtensions.<>c__DisplayClass3_0.<AddMicrosoftIdentityWebApiImplementation>b__2(JwtBearerOptions options, IServiceProvider serviceProvider, IMergedOptionsStore mergedOptionsMonitor, IOptionsMonitor`1 msIdOptionsMonitor)
   at Microsoft.Extensions.Options.ConfigureNamedOptions`4.Configure(String name, TOptions options)
   at Microsoft.Extensions.Options.OptionsFactory`1.Create(String name)
   at Microsoft.Extensions.Options.OptionsMonitor`1.<>c.<Get>b__10_0(String name, IOptionsFactory`1 factory)
   at Microsoft.Extensions.Options.OptionsCache`1.<>c__3`1.<GetOrAdd>b__3_0(String name, ValueTuple`2 arg)
   at System.Collections.Concurrent.ConcurrentDictionary`2.GetOrAdd[TArg](TKey key, Func`3 valueFactory, TArg factoryArgument)
   at Microsoft.Extensions.Options.OptionsCache`1.GetOrAdd[TArg](String name, Func`3 createOptions, TArg factoryArgument)
   at Microsoft.Extensions.Options.OptionsMonitor`1.Get(String name)
   at Microsoft.AspNetCore.Authentication.AuthenticationHandler`1.InitializeAsync(AuthenticationScheme scheme, HttpContext context)
   at Microsoft.AspNetCore.Authentication.AuthenticationHandlerProvider.GetHandlerAsync(HttpContext context, String authenticationScheme)
   at Microsoft.AspNetCore.Authentication.AuthenticationService.AuthenticateAsync(HttpContext context, String scheme)
   at Microsoft.AspNetCore.Authentication.AuthenticationMiddleware.Invoke(HttpContext context)
   at Swashbuckle.AspNetCore.SwaggerUI.SwaggerUIMiddleware.Invoke(HttpContext httpContext)
   at Swashbuckle.AspNetCore.Swagger.SwaggerMiddleware.Invoke(HttpContext httpContext, ISwaggerProvider swaggerProvider)
   at Microsoft.AspNetCore.Server.IIS.Core.IISHttpContextOfT`1.ProcessRequestAsync()
2023-09-28T08:14:46.6515479+02:00 [Thread:] [ERR] (Microsoft.AspNetCore.Server.IIS.Core.IISHttpServer) Connection ID ""9223380860328214851"", Request ID ""80000144-0806-8000-b63f-84710c7967bb"": An unhandled exception was thrown by the application.
System.InvalidOperationException: Operations that change non-concurrent collections must have exclusive access. A concurrent update was performed on this collection and corrupted its state. The collection's state is no longer correct.
   at System.Collections.Generic.HashSet`1.AddIfNotPresent(T value, Int32& location)
   at System.Collections.Generic.HashSet`1.System.Collections.Generic.ICollection<T>.Add(T item)
   at Microsoft.Identity.Web.MergedOptions.UpdateMergedOptionsFromMicrosoftIdentityOptions(MicrosoftIdentityOptions microsoftIdentityOptions, MergedOptions mergedOptions)
   at Microsoft.Identity.Web.MicrosoftIdentityOptionsMerger.PostConfigure(String name, MicrosoftIdentityOptions options)
   at Microsoft.Extensions.Options.OptionsFactory`1.Create(String name)
   at Microsoft.Extensions.Options.OptionsMonitor`1.<>c.<Get>b__10_0(String name, IOptionsFactory`1 factory)
   at Microsoft.Extensions.Options.OptionsCache`1.<>c__3`1.<GetOrAdd>b__3_0(String name, ValueTuple`2 arg)
   at System.Collections.Concurrent.ConcurrentDictionary`2.GetOrAdd[TArg](TKey key, Func`3 valueFactory, TArg factoryArgument)
   at Microsoft.Extensions.Options.OptionsCache`1.GetOrAdd[TArg](String name, Func`3 createOptions, TArg factoryArgument)
   at Microsoft.Extensions.Options.OptionsMonitor`1.Get(String name)
   at Microsoft.Identity.Web.MicrosoftIdentityWebApiAuthenticationBuilderExtensions.<>c__DisplayClass3_0.<AddMicrosoftIdentityWebApiImplementation>b__2(JwtBearerOptions options, IServiceProvider serviceProvider, IMergedOptionsStore mergedOptionsMonitor, IOptionsMonitor`1 msIdOptionsMonitor)
   at Microsoft.Extensions.Options.ConfigureNamedOptions`4.Configure(String name, TOptions options)
   at Microsoft.Extensions.Options.OptionsFactory`1.Create(String name)
   at Microsoft.Extensions.Options.OptionsMonitor`1.<>c.<Get>b__10_0(String name, IOptionsFactory`1 factory)
   at Microsoft.Extensions.Options.OptionsCache`1.<>c__3`1.<GetOrAdd>b__3_0(String name, ValueTuple`2 arg)
   at System.Collections.Concurrent.ConcurrentDictionary`2.GetOrAdd[TArg](TKey key, Func`3 valueFactory, TArg factoryArgument)
   at Microsoft.Extensions.Options.OptionsCache`1.GetOrAdd[TArg](String name, Func`3 createOptions, TArg factoryArgument)
   at Microsoft.Extensions.Options.OptionsMonitor`1.Get(String name)
   at Microsoft.AspNetCore.Authentication.AuthenticationHandler`1.InitializeAsync(AuthenticationScheme scheme, HttpContext context)
   at Microsoft.AspNetCore.Authentication.AuthenticationHandlerProvider.GetHandlerAsync(HttpContext context, String authenticationScheme)
   at Microsoft.AspNetCore.Authentication.AuthenticationService.AuthenticateAsync(HttpContext context, String scheme)
   at Microsoft.AspNetCore.Authentication.AuthenticationMiddleware.Invoke(HttpContext context)
   at Microsoft.AspNetCore.Routing.EndpointRoutingMiddleware.<Invoke>g__AwaitMatcher|8_0(EndpointRoutingMiddleware middleware, HttpContext httpContext, Task`1 matcherTask)
   at Swashbuckle.AspNetCore.SwaggerUI.SwaggerUIMiddleware.Invoke(HttpContext httpContext)
   at Swashbuckle.AspNetCore.Swagger.SwaggerMiddleware.Invoke(HttpContext httpContext, ISwaggerProvider swaggerProvider)
   at Microsoft.AspNetCore.Server.IIS.Core.IISHttpContextOfT`1.ProcessRequestAsync()
jmprieur commented 1 year ago

@AndreErb do you have a minimal repro project you could share with us?

AndreErb commented 1 year ago

@AndreErb do you have a minimal repro project you could share with us?

@jmprieur I don't really have a minimal one. I'd have to remove all of the logic code and I am afraid we'd end up in something, mainly containing the "Identity.Web" initialization code, as posted above.

I could create one. It could make sense for the API as the log above shows, that somehow the Swagger/Swashbuckle stuff (somehow) seems to be involved into the Exception, too. However, I would not be able to tell, if the minimal project would be able to reproduce the problem, as it occurs only in production, when hosted in IIS and being used by multiple users.

Additionally. Did you see my question about the the App-related fix you mentioned on (2.): This hasn't been released, yet? Correct? Or which version of "Microsoft.Identity.Web" should we try, in order to test if it fixes the problem?