dotnet / aspnetcore

ASP.NET Core is a cross-platform .NET framework for building modern cloud-based web applications on Windows, Mac, or Linux.
https://asp.net
MIT License
35.37k stars 9.99k forks source link

Concurrency Exception While Rendering Razor Page #50284

Open Mike-E-angelo opened 1 year ago

Mike-E-angelo commented 1 year ago

Is there an existing issue for this?

Describe the bug

I am seeing an intermittent exception being thrown in production. It does not appear to involve my code, but occurs when rendering my _Host.cshtml. I have not seen this until recently and may be due to a NuGet update to 7.0.10.

Expected Behavior

No response

Steps To Reproduce

This is an intermittent exception that has occurred twice now. Unfortunately I am not sure what causes it. As the call stack does not contain any of my custom code I am reporting it here for next steps.

Note that this occurs while rendering a component tag:

<component type="typeof(Starbeam.Integration.Crisp.Presentation.IdentifierContainer)" render-mode="ServerPrerendered" />

Exceptions (if any)

System.InvalidOperationException: Operations that change non-concurrent collections must have exclusive access. A concurrent update was performed on this collection and corrupted its state. The collection's state is no longer correct.
   at ref TValue System.Collections.Generic.Dictionary<TKey, TValue>.FindValue(TKey key)
   at bool System.Collections.Generic.Dictionary<TKey, TValue>.TryGetValue(TKey key, out TValue value)
   at IUrlHelper Microsoft.AspNetCore.Mvc.Routing.UrlHelperFactory.GetUrlHelper(ActionContext context)
   at bool Microsoft.AspNetCore.Mvc.Razor.TagHelpers.UrlResolutionTagHelper.TryResolveUrl(string url, out IHtmlContent resolvedUrl)
   at void Microsoft.AspNetCore.Mvc.Razor.TagHelpers.UrlResolutionTagHelper.ProcessUrlAttribute(string attributeName, TagHelperOutput output)
   at void Microsoft.AspNetCore.Mvc.Razor.TagHelpers.UrlResolutionTagHelper.Process(TagHelperContext context, TagHelperOutput output)
   at Task Microsoft.AspNetCore.Razor.TagHelpers.TagHelper.ProcessAsync(TagHelperContext context, TagHelperOutput output)
   at Task Microsoft.AspNetCore.Razor.Runtime.TagHelpers.TagHelperRunner.RunAsync(TagHelperExecutionContext executionContext)
   at async Task Starbeam.Applications.Gateway.Pages.Pages__Host.ExecuteAsync()+(?) => { } in C:/agent/default/_work/1/s/starbeam.one/Pages/_Host.cshtml:line 84
   at async Task Microsoft.AspNetCore.Razor.Runtime.TagHelpers.TagHelperExecutionContext.SetOutputContentAsync()
   at async Task Starbeam.Applications.Gateway.Pages.Pages__Host.ExecuteAsync()
   at async Task Microsoft.AspNetCore.Mvc.Razor.RazorView.RenderPageCoreAsync(IRazorPage page, ViewContext context)
   at async Task<ViewBufferTextWriter> Microsoft.AspNetCore.Mvc.Razor.RazorView.RenderPageAsync(IRazorPage page, ViewContext context, bool invokeViewStarts)
   at async Task Microsoft.AspNetCore.Mvc.Razor.RazorView.RenderAsync(ViewContext context)
   at async Task Microsoft.AspNetCore.Mvc.ViewFeatures.ViewExecutor.ExecuteAsync(ViewContext viewContext, string contentType, int? statusCode) x 2
   at async Task Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.InvokeResultAsync(IActionResult result)+Logged(?)
   at async Task Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.InvokeNextResultFilterAsync<TFilter, TFilterAsync>()+Awaited(?)
   at void Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.Rethrow(ResultExecutedContextSealed context)
   at Task Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.ResultNext<TFilter, TFilterAsync>(ref State next, ref Scope scope, ref object state, ref bool isCompleted)
   at async Task Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.InvokeResultFilters()+Awaited(?)
   at async Task Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.InvokeNextResourceFilter()+Awaited(?)
   at void Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.Rethrow(ResourceExecutedContextSealed context)
   at Task Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.Next(ref State next, ref Scope scope, ref object state, ref bool isCompleted)
   at async Task Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.InvokeFilterPipelineAsync()+Awaited(?)
   at async Task Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.InvokeAsync()+Logged(?) x 2
   at async Task Microsoft.AspNetCore.Routing.EndpointMiddleware.Invoke(HttpContext httpContext)+AwaitRequestTask(?)
   at async Task Microsoft.AspNetCore.Authorization.AuthorizationMiddleware.Invoke(HttpContext context)
   at async Task Microsoft.AspNetCore.Authentication.AuthenticationMiddleware.Invoke(HttpContext context)
   at async Task<bool> Microsoft.AspNetCore.Server.IIS.Core.IISHttpContextOfT<TContext>.ProcessRequestAsync()

.NET Version

8.0.100-preview.7.23376.3 locally but 7.0.10 on the projects

Anything else?

Thank you for all the amazing work out there. :)

SteveSandersonMS commented 1 year ago

Since this issue comes from MVC's UrlHelperFactory and doesn't involve any Blazor code (Blazor doesn't start until after the <component> tag helper runs), I'll move this back to area-mvc.

@Mike-E-angelo The error occurs when UrlHelperFactory.GetUrlHelper calls httpContext.Items.TryGetValue. This may be caused by your own application code, since you might have some unrelated other code, running on another thread, that mutates HttpContext.Items at the same time (well, microscopically earlier) and hence leads to the concurrent update. That might be the most likely explanation since I can't find any evidence that anyone else has ever had an exception from this location in the framework.

Not sure if this is something we can investigate without a repro but will leave this open here for a team triage.

Mike-E-angelo commented 1 year ago

Thank you @SteveSandersonMS for the suggestion it is a solid one. I did take a look at my solution and -- surprisingly -- I am not making any use of HttpContext.Items. The only use I can see is from an external library for markdown files:

image

I understand the need for a reproduction but unfortunately, I do not think I will be able to provide one. I would, however, like to work with the team to see if there is any way that I can track this down with logging (if possible).

Mike-E-angelo commented 1 year ago

FWIW two other stack traces occurring here that may offer clues:

System.InvalidOperationException: Operations that change non-concurrent collections must have exclusive access. A concurrent update was performed on this collection and corrupted its state. The collection's state is no longer correct.
   at ref TValue System.Collections.Generic.Dictionary<TKey, TValue>.FindValue(TKey key)
   at bool System.Collections.Generic.Dictionary<TKey, TValue>.TryGetValue(TKey key, out TValue value)
   at SaveTempDataContext Microsoft.AspNetCore.Mvc.ViewFeatures.Filters.SaveTempDataFilter.GetTempDataContext(HttpContext httpContext)
   at Task Microsoft.AspNetCore.Mvc.ViewFeatures.Filters.SaveTempDataFilter.OnStarting(HttpContext httpContext)
   at async Task Microsoft.AspNetCore.Server.IIS.Core.IISHttpContext.FireOnStarting()
System.InvalidOperationException: Operations that change non-concurrent collections must have exclusive access. A concurrent update was performed on this collection and corrupted its state. The collection's state is no longer correct.
   at ref TValue System.Collections.Generic.Dictionary<TKey, TValue>.FindValue(TKey key)
   at bool System.Collections.Generic.Dictionary<TKey, TValue>.TryGetValue(TKey key, out TValue value)
   at SaveTempDataContext Microsoft.AspNetCore.Mvc.ViewFeatures.Filters.SaveTempDataFilter.GetTempDataContext(HttpContext httpContext)
   at void Microsoft.AspNetCore.Mvc.ViewFeatures.Filters.SaveTempDataFilter.OnResourceExecuted(ResourceExecutedContext context)
   at Task Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.Next(ref State next, ref Scope scope, ref object state, ref bool isCompleted)
   at async Task Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.InvokeFilterPipelineAsync()+Awaited(?)
   at async Task Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.InvokeAsync()+Logged(?) x 2
   at async Task Microsoft.AspNetCore.Routing.EndpointMiddleware.Invoke(HttpContext httpContext)+AwaitRequestTask(?)
   at async Task Microsoft.AspNetCore.Authorization.AuthorizationMiddleware.Invoke(HttpContext context)
   at async Task Microsoft.AspNetCore.Authentication.AuthenticationMiddleware.Invoke(HttpContext context)
   at async Task<bool> Microsoft.AspNetCore.Server.IIS.Core.IISHttpContextOfT<TContext>.ProcessRequestAsync()
Mike-E-angelo commented 1 year ago

Has no one else reported this? 😭 This appears to be occurring in the MVC infrastructure in the SaveTempDataFilter as seen in the above stacktraces. I am not sure how my code would be responsible for this and I am open to suggestions on what I can do to prevent this. Thank you for any assistance you can provide.

Mike-E-angelo commented 1 year ago

Hi @martincostello can you verify this issue is being tracked appropriately? I noticed that all areas have been removed from this issue and it seems it's been orphaned. Thank you for any assistance.

I wanted to follow up on this issue. I've noticed that this only occurs with bots loading pages, which explains why I have not been able to reproduce it easily on my end.

There seems to be something the bots are doing when loading my pages that leads to this condition.

The stack traces above seem to show the code being run as part of MVC infrastructure with none of my code involved. I would greatly appreciate any guidance on how I could perhaps use logging to further track down what is occurring here as I cannot reproduce this on my own machine or in any of my testing.

Note that I am running this in Azure, so if there's any configuration to be had to track this down there I can do that as well.

Thank you for any assistance you can provide.

Mike-E-angelo commented 1 year ago

Weird enough I now see area-mvc attached to this issue this morning. 🤦‍♂️

Thank you for any assistance you can provide.

Mike-E-angelo commented 1 year ago

Thank you for any assistance you can provide. 🙏

Mike-E-angelo commented 1 year ago

Thank you for any assistance you can provide. 🙏

Mike-E-angelo commented 1 year ago

Thank you for any assistance you can provide. 🙏

Mike-E-angelo commented 1 year ago

Thank you for any assistance/guidance you can provide on this issue. 🙏

Mike-E-angelo commented 12 months ago

FWIW I have opened a question on StackOverflow with this issue, with no luck so far. I have put up a bounty with roughly ~10% of my reputation to help solve this rather pesky problem in my world:

https://stackoverflow.com/questions/77345069/what-would-cause-this-error-for-asp-net-core-mvc-bot-based-requests-operations

Thank you for any assistance you can provide.

marcmarsinach commented 10 months ago

Same problem here after upgrade to .NET7 from .NET6

image

Mike-E-angelo commented 10 months ago

WOW! Someone else gets this too. 😌 I have been spending a good deal of my time trying to track this down as I still do not know where or how this occurs, just that it does. So glad to see this confirmed by someone else and it's not something specific to my environment/configuration.

Hopefully we can figure out what this is. 🤞

marcmarsinach commented 10 months ago

We have solved the problem! It is not related to NET7. We have a Serilog custom enricher to manage our correlation ids. We made a change to store the correlation id in the HttpContext.Items collection due to performance reasons, but seems not to be a good idea. Is it possible to make this collection compatible with multi-threading scenarios?

image

Mike-E-angelo commented 10 months ago

Glad you were able to figure it out, @marcmarsinach. Unfortunately, I do not have anything obvious like this in my solution and I do not make use of the HttpContext.Items anywhere in my custom code.

So in my case, this appears to be a Very Bad™ environmental/configuration-based issue that I continue to track down.

Mike-E-angelo commented 10 months ago

Funny enough after my last post I got the exception occurring with my new logging and have more information here.

This appears to be occurring during a UrlResolutionTagHelper.Process call:

System.InvalidOperationException: Operations that change non-concurrent collections must have exclusive access. A concurrent update was performed on this collection and corrupted its state. The collection's state is no longer correct.
   at ref TValue System.Collections.Generic.Dictionary<TKey, TValue>.FindValue(TKey key)
   at bool System.Collections.Generic.Dictionary<TKey, TValue>.TryGetValue(TKey key, out TValue value)
   at IUrlHelper Microsoft.AspNetCore.Mvc.Routing.UrlHelperFactory.GetUrlHelper(ActionContext context)
   at bool Microsoft.AspNetCore.Mvc.Razor.TagHelpers.UrlResolutionTagHelper.TryResolveUrl(string url, out IHtmlContent resolvedUrl)
   at void Microsoft.AspNetCore.Mvc.Razor.TagHelpers.UrlResolutionTagHelper.ProcessUrlAttribute(string attributeName, TagHelperOutput output)
   at void Microsoft.AspNetCore.Mvc.Razor.TagHelpers.UrlResolutionTagHelper.Process(TagHelperContext context, TagHelperOutput output)
   at Task Microsoft.AspNetCore.Razor.TagHelpers.TagHelper.ProcessAsync(TagHelperContext context, TagHelperOutput output)
   at Task Microsoft.AspNetCore.Razor.Runtime.TagHelpers.TagHelperRunner.RunAsync(TagHelperExecutionContext executionContext)
   at async Task Starbeam.Applications.Gateway.Pages.Pages__Host.ExecuteAsync()+(?) => { } in C:/agent/default/_work/1/s/starbeam.one/Pages/_Host.cshtml:line 84
   at async Task Microsoft.AspNetCore.Razor.Runtime.TagHelpers.TagHelperExecutionContext.SetOutputContentAsync()
   at async Task Starbeam.Applications.Gateway.Pages.Pages__Host.ExecuteAsync()

Additionally, there appears to be access to this collection in AuthorizationMiddleware during this time:

System.InvalidOperationException: Operations that change non-concurrent collections must have exclusive access. A concurrent update was performed on this collection and corrupted its state. The collection's state is no longer correct.
   at bool System.Collections.Generic.Dictionary<TKey, TValue>.TryInsert(TKey key, TValue value, InsertionBehavior behavior)
   at void System.Collections.Generic.Dictionary<TKey, TValue>.set_Item(TKey key, TValue value)
   at async Task Microsoft.AspNetCore.Authorization.AuthorizationMiddleware.Invoke(HttpContext context)
   at async Task Microsoft.AspNetCore.Authentication.AuthenticationMiddleware.Invoke(HttpContext context)
   at async Task Starbeam.Applications.Gateway.ConcurrentExceptionAwareMiddleware.InvokeAsync(HttpContext context) in C:/agent/default/_work/1/s/starbeam.one/ConcurrentExceptionAwareMiddleware.cs:line 29
   at async Task Microsoft.AspNetCore.Diagnostics.ExceptionHandlerMiddlewareImpl.HandleException(HttpContext context, ExceptionDispatchInfo edi)
Mike-E-angelo commented 10 months ago

FWIW I am adding a middleware to help diagnose this issue:

sealed class ConcurrentExceptionAwareMiddleware
{
    readonly RequestDelegate _next;
    readonly Warning         _warning;

    public ConcurrentExceptionAwareMiddleware(RequestDelegate next, Warning warning)
    {
        _next    = next;
        _warning = warning;
    }

    public async Task InvokeAsync(HttpContext context)
    {
        try
        {
            context.Items = new ConcurrencyStateAwareDictionary(context);
            await _next(context).ConfigureAwait(false);
        }
        catch (InvalidOperationException exception) when
            (exception.Message.StartsWith("Operations that change non-concurrent collections"))
        {
            _warning.Execute(exception);
            throw;
        }
    }

    public sealed class Warning : LogWarningException
    {
        public Warning(ILogger<Warning> logger) : base(logger, "An unexpected concurrency exception has occurred") {}
    }
}

I am thinking for you @marcmarsinach you could do something like the following:

sealed class ConcurrentItemsMiddleware
{
    readonly RequestDelegate _next;

    public ConcurrentItemsMiddleware(RequestDelegate next) => _next = next;

    public Task InvokeAsync(HttpContext context)
    {
        context.Items = new ConcurrentDictionary<object, object?>(context.Items);
        return _next(context);
    }
}

... and it would solve your issue? 🤔 I am curious if I am overlooking something obvious here and would appreciate validation of my understanding.

Additionally, I have been logging the usage of this dictionary and I am getting the following keys added per request:

Assign: "__AuthorizationMiddlewareWithEndpointInvoked"
Add: {object} (from SaveTempDataFilter)
Add: {Name = "ITempDataDictionary" FullName = "Microsoft.AspNetCore.Mvc.ViewFeatures.ITempDataDictionary"}
Assign: {Name = "IUrlHelper" FullName = "Microsoft.AspNetCore.Mvc.IUrlHelper"}
Add: "Serilog_ClientIp"
Add: "Serilog_CorrelationId"
Add: "Serilog_User-Agent"

My hunch is that the Serilog is causing this, but it is unclear why, when, and how. I continue to add further diagnostics to determine this.

ConcurrencyStateAwareDictionary is the following, if interested (feedback welcome):

public sealed class ConcurrencyStateAwareDictionary : ConcurrencyAwareDictionary<object, object?>
{
    public ConcurrencyStateAwareDictionary(HttpContext context)
        : this(context.Items, context.RequestServices.GetRequiredService<ILogger<ConcurrencyStateAwareDictionary>>()) {}

    public ConcurrencyStateAwareDictionary(IDictionary<object, object?> previous, ILogger logger)
        : base(previous, logger) {}
}

public class ConcurrencyAwareDictionary<TKey, TValue> : IDictionary<TKey, TValue> where TKey : notnull
{
    bool                               _error;
    readonly IDictionary<TKey, TValue> _previous;
    readonly ILogger                   _logger;

    public ConcurrencyAwareDictionary(IDictionary<TKey, TValue> previous, ILogger logger)
    {
        _previous = previous;
        _logger   = logger;
    }

    public IEnumerator<KeyValuePair<TKey, TValue>> GetEnumerator() => _previous.GetEnumerator();

    IEnumerator IEnumerable.GetEnumerator() => ((IEnumerable)this).GetEnumerator();

    public void Add(KeyValuePair<TKey, TValue> item)
    {
        _previous.Add(item);
    }

    public void Clear()
    {
        _previous.Clear();
    }

    public bool Contains(KeyValuePair<TKey, TValue> item) => _previous.Contains(item);

    public void CopyTo(KeyValuePair<TKey, TValue>[] array, int arrayIndex)
    {
        _previous.CopyTo(array, arrayIndex);
    }

    public bool Remove(KeyValuePair<TKey, TValue> item) => _previous.Remove(item);

    public int Count => _previous.Count;

    public bool IsReadOnly => _previous.IsReadOnly;

    public void Add(TKey key, TValue value)
    {
        var error = _error;
        _previous.Add(key, value);
        if (!error && _error)
        {
            var trace = new System.Diagnostics.StackTrace(true);
            _logger.LogError("Error state occurred during add {Trace}", trace.ToString());
        }
    }

    public bool ContainsKey(TKey key) => _previous.ContainsKey(key);

    public bool Remove(TKey key) => _previous.Remove(key);

    public bool TryGetValue(TKey key, out TValue value)
    {
        try
        {
            return _previous.TryGetValue(key, out value!);
        }
        catch (InvalidOperationException e) when
            (e.Message.StartsWith("Operations that change non-concurrent collections"))
        {
            _error = true;
            throw;
        }
    }

    public TValue this[TKey key]
    {
        get => _previous[key];
        set
        {
            var error = _error;
            _previous[key] = value;
            if (!error && _error)
            {
                var trace = new System.Diagnostics.StackTrace(true);
                _logger.LogError("Error state occurred during assignment {Trace}", trace.ToString());
            }
        }
    }

    public ICollection<TKey> Keys => _previous.Keys;

    public ICollection<TValue> Values => _previous.Values;
}
Mike-E-angelo commented 8 months ago

Nice to see someone actually acknowledge this issue 😏

ashleyjlive commented 6 months ago

I have also seen this bug. It is fairly sparse and is very hard to reproduce locally. I have seen it locally only on startup, and subsequent requests succeed. We're also using Serilog.