OrchardCMS / OrchardCore

Orchard Core is an open-source modular and multi-tenant application framework built with ASP.NET Core, and a content management system (CMS) built on top of that framework.
https://orchardcore.net
BSD 3-Clause "New" or "Revised" License
7.37k stars 2.38k forks source link

Filtering Content Items causes error 500 page intermittently - "pending requests working on this transaction" #8466

Closed mariojsnunes closed 3 years ago

mariojsnunes commented 3 years ago

Hello,

When my app is deployed on an Azure Linux App Service, with Azure Sql, I've been getting this error intermittently when filtering Content Items on the backoffice. I've see it happen when I filter by Content Type or Publish Status, or Sort. Since it's not happening in localhost, my theory atm is that it's because the Azure machine is slower.

Microsoft.Data.SqlClient.SqlException (0x80131904): The transaction operation cannot be performed because there are pending requests working on this transaction. at Microsoft.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection, Action1 wrapCloseInAction) at Microsoft.Data.SqlClient.SqlInternalConnection.OnError(SqlException exception, Boolean breakConnection, Action1 wrapCloseInAction) at Microsoft.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose) at Microsoft.Data.SqlClient.TdsParser.TryRun(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj, Boolean& dataReady) at Microsoft.Data.SqlClient.TdsParser.Run(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj) at Microsoft.Data.SqlClient.TdsParser.TdsExecuteTransactionManagerRequest(Byte[] buffer, TransactionManagerRequestType request, String transactionName, TransactionManagerIsolationLevel isoLevel, Int32 timeout, SqlInternalTransaction transaction, TdsParserStateObject stateObj, Boolean isDelegateControlRequest) at Microsoft.Data.SqlClient.SqlInternalConnectionTds.ExecuteTransactionYukon(TransactionRequest transactionRequest, String transactionName, IsolationLevel iso, SqlInternalTransaction internalTransaction, Boolean isDelegateControlRequest) at Microsoft.Data.SqlClient.SqlInternalTransaction.Commit() at Microsoft.Data.SqlClient.SqlTransaction.Commit() at System.Data.Common.DbTransaction.CommitAsync(CancellationToken cancellationToken) --- End of stack trace from previous location where exception was thrown --- at YesSql.Session.CommitTransactionAsync() at YesSql.Session.CommitTransactionAsync() at YesSql.Session.CommitAsync() at OrchardCore.Data.Documents.DocumentStore.CommitAsync() at OrchardCore.Environment.Shell.Scope.ShellScope.BeforeDisposeAsync() at OrchardCore.Environment.Shell.Scope.ShellScope.UsingAsync(Func2 execute, Boolean activateShell) at OrchardCore.Modules.ModularTenantContainerMiddleware.Invoke(HttpContext httpContext) at Microsoft.AspNetCore.Diagnostics.ExceptionHandlerMiddleware.<Invoke>g__Awaited|6_0(ExceptionHandlerMiddleware middleware, HttpContext context, Task task) at Microsoft.AspNetCore.Diagnostics.ExceptionHandlerMiddleware.HandleException(HttpContext context, ExceptionDispatchInfo edi) at Microsoft.AspNetCore.Diagnostics.ExceptionHandlerMiddleware.<Invoke>g__Awaited|6_0(ExceptionHandlerMiddleware middleware, HttpContext context, Task task) at Microsoft.AspNetCore.Server.Kestrel.Core.Internal.Http.HttpProtocol.ProcessRequests[TContext](IHttpApplication1 application)

While looking through the source code, I found these transactions which are missing transaction.Commit().

https://github.com/OrchardCMS/OrchardCore/blob/4c71d093a728d726bea24e0024ad9b58a9749aa8/src/OrchardCore.Modules/OrchardCore.Queries/Sql/SqlQuerySource.cs#L72-L75

https://github.com/OrchardCMS/OrchardCore/blob/4c71d093a728d726bea24e0024ad9b58a9749aa8/src/OrchardCore.Modules/OrchardCore.Queries/Sql/SqlQuerySource.cs#L89-L92

These queries might not even need a transaction at all, since they are only reading data.

Could this be the cause? Is this happened to anyone else? I'm using the latest bits. Not sure if it's a new issue or not, but I think I remember already having this issue a few months back.

jtkech commented 3 years ago

I think you are using the last dev packages as i can see CommitTransactionAsync() in your trace, which is a new async method of the yesSql session. One thing i noticed is that CommitTransactionAsync() is called twice.

These queries might not even need a transaction at all, since they are only reading data.

We added it through #4698 to fix a db locking issue #4669

But i think that we don't use these queries when filtering content items through the admin

And not sure they need a transaction commit as they are only used to read, but maybe better to do ;)

Do you have any custom admin filters in your code? That would contains a non awaited task

Need more thinkings

deanmarcussen commented 3 years ago

at YesSql.Session.CommitTransactionAsync() at YesSql.Session.CommitTransactionAsync()

@jtkech so the IAsyncDisposable was implemented, but I wonder if it is getting hit twice, as we are still using the deferred task to CommitAsync().

Should be that it isn't called, i.e. noops when already commited.

Needs more thinking as you say, but it also maybe YesSql, so more thought indeed ;)

jtkech commented 3 years ago

@deanmarcussen

The session is committed just before the shell scope is disposed, through a BeforeDispose delegate registered when the ISession is resolved, not through a midldeware or a deferred task, see at the end of my comment here https://github.com/OrchardCMS/OrchardCore/pull/8424#issuecomment-770330766

Then the session is commited again when disposed, but normally here, the flush and the 2nd transaction commit are noops (already committed). Hmm, but how a 2nd CommitTransactionAsync() can be called, we don't use an await using on the session, and when disposing the shell scope we do _serviceScope.Dispose(), we don't use yet DisposeAsync() on the underlying engine that now is an IAsyncDisposable (we could do it, idem with the SP of the parent shell).

So the session is disposed through Dispose() that calls CommitTransaction(), not CommitTransactionAsync().

So still need more thinkings ;)


@mariojsnunes

I can't repro with a local sql server with thousands of items, so maybe due to a slow machine and / or on linux.

Can you check if you have the same issue with an older version of OrchardCore, e.g. one month ago. Hmm, maybe hard to see if it is intermittent.

mariojsnunes commented 3 years ago

Okay so a quick update... and sorry if this is confusing, I'm having a hard time correlating the error messages.

  1. I rolled back to 1.0.0-rc2-15627, I can still see the error in the log (non-async version), even with a more powerful machine (P1V2) Linux Azure App Service.

The exception:

Microsoft.Data.SqlClient.SqlException (0x80131904): The transaction operation cannot be performed because there are pending requests working on this transaction. at Microsoft.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection, Action1 wrapCloseInAction) at Microsoft.Data.SqlClient.SqlInternalConnection.OnError(SqlException exception, Boolean breakConnection, Action1 wrapCloseInAction) at Microsoft.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose) at Microsoft.Data.SqlClient.TdsParser.TryRun(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj, Boolean& dataReady) at Microsoft.Data.SqlClient.TdsParser.Run(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj) at Microsoft.Data.SqlClient.TdsParser.TdsExecuteTransactionManagerRequest(Byte[] buffer, TransactionManagerRequestType request, String transactionName, TransactionManagerIsolationLevel isoLevel, Int32 timeout, SqlInternalTransaction transaction, TdsParserStateObject stateObj, Boolean isDelegateControlRequest) at Microsoft.Data.SqlClient.SqlInternalConnectionTds.ExecuteTransactionYukon(TransactionRequest transactionRequest, String transactionName, IsolationLevel iso, SqlInternalTransaction internalTransaction, Boolean isDelegateControlRequest) at Microsoft.Data.SqlClient.SqlInternalConnectionTds.ExecuteTransaction(TransactionRequest transactionRequest, String name, IsolationLevel iso, SqlInternalTransaction internalTransaction, Boolean isDelegateControlRequest) at Microsoft.Data.SqlClient.SqlInternalTransaction.Commit() at Microsoft.Data.SqlClient.SqlTransaction.Commit() at YesSql.Session.CommitTransaction() at YesSql.Session.CommitAsync() at OrchardCore.Data.Documents.DocumentStore.CommitAsync() at OrchardCore.Environment.Shell.Scope.ShellScope.BeforeDisposeAsync() at OrchardCore.Environment.Shell.Scope.ShellScope.UsingAsync(Func2 execute, Boolean activateShell) at OrchardCore.Modules.ModularTenantContainerMiddleware.Invoke(HttpContext httpContext) at Microsoft.AspNetCore.Diagnostics.ExceptionHandlerMiddleware.<Invoke>g__Awaited|6_0(ExceptionHandlerMiddleware middleware, HttpContext context, Task task) at Microsoft.AspNetCore.Diagnostics.ExceptionHandlerMiddleware.HandleException(HttpContext context, ExceptionDispatchInfo edi) at Microsoft.AspNetCore.Diagnostics.ExceptionHandlerMiddleware.<Invoke>g__Awaited|6_0(ExceptionHandlerMiddleware middleware, HttpContext context, Task task) at Microsoft.AspNetCore.Server.Kestrel.Core.Internal.Http.HttpProtocol.ProcessRequests[TContext](IHttpApplication1 application) ClientConnectionId:24938a40-685e-4013-9494-343f4f7f678d Error Number:3981,State:1,Class:16 ClientConnectionId before routing:bff7f7b5-c658-4457-ad0e-a82248d359f3

  1. The 500 page appears to be related to my SPA setup, as it also throws the error "could not find '/index'", I'll try the https://github.com/OrchardCMS/OrchardCore/issues/7045#issuecomment-703293308 solution and see if it helps (will take some work). This error message is immediately preceded and followed by the "transaction" exception (in the same 10th of a second):

2021-02-01 12:43:26.9783||0HM66LRGDDTDV:00000001|Microsoft.AspNetCore.Diagnostics.ExceptionHandlerMiddleware|ERR|An unhandled exception has occurred while executing the request. > Microsoft.Data.SqlClient.SqlException (0x80131904): The transaction operation cannot be performed because there are pending requests working on this transaction. at Microsoft.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection, Action1 wrapCloseInAction) at Microsoft.Data.SqlClient.SqlInternalConnection.OnError(SqlException exception, Boolean breakConnection, Action1 wrapCloseInAction) at Microsoft.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose) at Microsoft.Data.SqlClient.TdsParser.TryRun(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj, Boolean& dataReady) at Microsoft.Data.SqlClient.TdsParser.Run(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj) at Microsoft.Data.SqlClient.TdsParser.TdsExecuteTransactionManagerRequest(Byte[] buffer, TransactionManagerRequestType request, String transactionName, TransactionManagerIsolationLevel isoLevel, Int32 timeout, SqlInternalTransaction transaction, TdsParserStateObject stateObj, Boolean isDelegateControlRequest) at Microsoft.Data.SqlClient.SqlInternalConnectionTds.ExecuteTransactionYukon(TransactionRequest transactionRequest, String transactionName, IsolationLevel iso, SqlInternalTransaction internalTransaction, Boolean isDelegateControlRequest) at Microsoft.Data.SqlClient.SqlInternalConnectionTds.ExecuteTransaction(TransactionRequest transactionRequest, String name, IsolationLevel iso, SqlInternalTransaction internalTransaction, Boolean isDelegateControlRequest) at Microsoft.Data.SqlClient.SqlInternalTransaction.Commit() at Microsoft.Data.SqlClient.SqlTransaction.Commit() at YesSql.Session.CommitTransaction() at YesSql.Session.CommitAsync() at OrchardCore.Data.Documents.DocumentStore.CommitAsync() at OrchardCore.Environment.Shell.Scope.ShellScope.BeforeDisposeAsync() at OrchardCore.Environment.Shell.Scope.ShellScope.UsingAsync(Func`2 execute, Boolean activateShell) at OrchardCore.Modules.ModularTenantContainerMiddleware.Invoke(HttpContext httpContext) at Microsoft.AspNetCore.Diagnostics.ExceptionHandlerMiddleware.g__Awaited|6_0(ExceptionHandlerMiddleware middleware, HttpContext context, Task task) ClientConnectionId:e7146158-9b3f-43fb-ae2e-1d8a6a71da2c Error Number:3981,State:1,Class:16 ClientConnectionId before routing:4a58298c-951e-4abf-83cb-f909ebfe28d6 2021-02-01 12:43:27.0957||0HM66LRGDDTDV:00000001|Microsoft.AspNetCore.Diagnostics.ExceptionHandlerMiddleware|ERR|An exception was thrown attempting to execute the error handler. > System.InvalidOperationException: The SPA default page middleware could not return the default page '/index.html' because it was not found, and no other middleware handled the request. Your application is running in Production mode, so make sure it has been published, or that you have built your SPA manually. Alternatively you may wish to switch to the Development environment.

at Microsoft.AspNetCore.SpaServices.SpaDefaultPageMiddleware.<>cDisplayClass0_0.b1(HttpContext context, Func`1 next) at Microsoft.AspNetCore.Builder.UseExtensions.<>cDisplayClass0_1.b1(HttpContext context) at Microsoft.AspNetCore.StaticFiles.StaticFileMiddleware.Invoke(HttpContext context) at Microsoft.AspNetCore.Builder.UseExtensions.<>cDisplayClass0_2.b2() at Microsoft.AspNetCore.SpaServices.SpaDefaultPageMiddleware.<>cDisplayClass0_0.b0(HttpContext context, Func1 next) at Microsoft.AspNetCore.Builder.UseExtensions.<>c__DisplayClass0_1.<Use>b__1(HttpContext context) at Microsoft.AspNetCore.Builder.UseWhenExtensions.<>c__DisplayClass0_1.<UseWhen>b__1(HttpContext context) at Microsoft.AspNetCore.StaticFiles.StaticFileMiddleware.Invoke(HttpContext context) at Microsoft.AspNetCore.StaticFiles.StaticFileMiddleware.Invoke(HttpContext context) at Microsoft.AspNetCore.Diagnostics.ExceptionHandlerMiddleware.HandleException(HttpContext context, ExceptionDispatchInfo edi) 2021-02-01 12:43:27.1068||0HM66LRGDDTDV:00000001|Microsoft.AspNetCore.Server.Kestrel|ERR|Connection id "0HM66LRGDDTDV", Request id "0HM66LRGDDTDV:00000001": An unhandled exception was thrown by the application. **> Microsoft.Data.SqlClient.SqlException (0x80131904): The transaction operation cannot be performed because there are pending requests working on this transaction.** at Microsoft.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection, Action1 wrapCloseInAction) at Microsoft.Data.SqlClient.SqlInternalConnection.OnError(SqlException exception, Boolean breakConnection, Action1 wrapCloseInAction) at Microsoft.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose) at Microsoft.Data.SqlClient.TdsParser.TryRun(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj, Boolean& dataReady) at Microsoft.Data.SqlClient.TdsParser.Run(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj) at Microsoft.Data.SqlClient.TdsParser.TdsExecuteTransactionManagerRequest(Byte[] buffer, TransactionManagerRequestType request, String transactionName, TransactionManagerIsolationLevel isoLevel, Int32 timeout, SqlInternalTransaction transaction, TdsParserStateObject stateObj, Boolean isDelegateControlRequest) at Microsoft.Data.SqlClient.SqlInternalConnectionTds.ExecuteTransactionYukon(TransactionRequest transactionRequest, String transactionName, IsolationLevel iso, SqlInternalTransaction internalTransaction, Boolean isDelegateControlRequest) at Microsoft.Data.SqlClient.SqlInternalConnectionTds.ExecuteTransaction(TransactionRequest transactionRequest, String name, IsolationLevel iso, SqlInternalTransaction internalTransaction, Boolean isDelegateControlRequest) at Microsoft.Data.SqlClient.SqlInternalTransaction.Commit() at Microsoft.Data.SqlClient.SqlTransaction.Commit() at YesSql.Session.CommitTransaction() at YesSql.Session.CommitAsync() at OrchardCore.Data.Documents.DocumentStore.CommitAsync() at OrchardCore.Environment.Shell.Scope.ShellScope.BeforeDisposeAsync() at OrchardCore.Environment.Shell.Scope.ShellScope.UsingAsync(Func2 execute, Boolean activateShell) at OrchardCore.Modules.ModularTenantContainerMiddleware.Invoke(HttpContext httpContext) at Microsoft.AspNetCore.Diagnostics.ExceptionHandlerMiddleware.g__Awaited|6_0(ExceptionHandlerMiddleware middleware, HttpContext context, Task task) at Microsoft.AspNetCore.Diagnostics.ExceptionHandlerMiddleware.HandleException(HttpContext context, ExceptionDispatchInfo edi) at Microsoft.AspNetCore.Diagnostics.ExceptionHandlerMiddleware.g__Awaited|6_0(ExceptionHandlerMiddleware middleware, HttpContext context, Task task) at Microsoft.AspNetCore.Server.Kestrel.Core.Internal.Http.HttpProtocol.ProcessRequests[TContext](IHttpApplication`1 application) ClientConnectionId:e7146158-9b3f-43fb-ae2e-1d8a6a71da2c Error Number:3981,State:1,Class:16 ClientConnectionId before routing:4a58298c-951e-4abf-83cb-f909ebfe28d6

  1. EDIT: Okay the UseSpa isn't the cause, I've just removed that code and I still get the Orchard error page and the same "transaction" exception, this time it only logs the exception once per error page. image

Next I'll try to reproduce on a Windows App Service and then on a clean OC installation.

mariojsnunes commented 3 years ago

Okay so, the issue also happens on a Windows App Service.

I was also able to reproduce locally while connecting to an Azure SQL Database. Not sure if related... happens with MultipleActiveResultSets=true on the ConnectionString, otherwise it throws a the "The connection does not support MultipleActiveResultSets." error.

deanmarcussen commented 3 years ago

ok, so we've seen errors like this before.

Generally it occurs when an async database query is not awaited correctly.

Have you got any custom code anywhere where you do any session work, and use any .Result or .GetAwaiter().GetResult() ? where you could be using await query...

mariojsnunes commented 3 years ago

Okay so I had a database access using .GetAwaiter().GetResult() on a feature's Startup.cs, so I could add custom Authentication Providers dynamically. I couldn't repro the issue after commenting out that code. Yeay! Even looks like the MARS issue is gone!

Please don't close this yet, I want to do more extensive testing tomorrow as the issue was intermittent.

Thanks a lot!!

jtkech commented 3 years ago

Yes, most of the time the MARS issues we have are because of a non awaited task, but normally it's more like when you miss an await than using GetAwaiter().

@deanmarcussen we are using it do get the ContentDefinitionRecord, at some point i made it async but Seb didn't want for perf because it is cached, so we can keep it non async to have less allocated Task. Hmm, but interesting to know if a synchronous query may cause this kind of issue.

private ContentDefinitionRecord LoadContentDefinitionRecord() =>
    _contentDefinitionStore.LoadContentDefinitionAsync().GetAwaiter().GetResult();

@mariojsnunes

I had a database access using .GetAwaiter().GetResult() on a feature's Startup.cs

How do you do this? in a module startup ConfigureServices() you would have to build a service provider, and a service scope (in fact a shell scope) to resolve the ISession scoped service, which is not good here as we are building the tenant / shell container. In a module startup Configure(), yes a shell scope service provider is available

deanmarcussen commented 3 years ago

Yes, I think it would be helpful to see the code that was in use (if indeed it did fix the issue)

@jtkech yes, and / or when an async void is used, as we saw with the content definition builders (? from memory). @mariojsnunes see here : https://github.com/OrchardCMS/OrchardCore/issues/6919 and the fix.

Personally I don't think there should be any use of MARS in a connection string , as we use none of the MARS features of Dapper, .i.e. multiple queries during one query. I stand to be proved wrong there, but so far every time someone has needed to turn on MARS, it's because of some other underlying usage of async/await that we (or the user) has not quite got right.

In terms of using the session inside a Startup I would tend to try and avoid it, and move what it is doing to an IConfigureOptions - still not async, but at least defers to execution to when the options class is resolved - when the service requesting it is required. A small optimization, and maybe not necesary, but what I tend to prefer.

Would be great to see the code you are using here @mariojsnunes just to understand if there is an underlying .GetAwaiter().GetResult() issue we can understand more.

mariojsnunes commented 3 years ago

Okay, just a quick update, the issue is not fixed yet...

@jtkech

How do you do this? in a module startup ConfigureServices() you would have to build a service provider, and a service scope (in fact a shell scope) to resolve the ISession scoped service, which is not good here as we are building the tenant / shell container. In a module startup Configure(), yes a shell scope service provider is available

Yeah that was it, was the bad "solution" found at the time. If I recall correctly, we run into a limitation and couldn't add Auth providers on the Configure() method - will check it again, but anyway that code is commented out, so the issue has to be elsewhere.

So we turned the remaining GetAwaiter() usage into await. And also double checked for async void and non awaitedTask and found nothing (We previously had an issue with this too)

Still the same issue occurs... the "transaction" exception with MARS on the connectionString, otherwise just the MARS exception.

I'm still looking into a few things... one suspect is that the Features' ConfigureServices() methods are being triggered 3 times, doesn't seem right to me, need to read on it.

Will update when I have more info, thanks!

mariojsnunes commented 3 years ago

Okay, found the issue!

It was actually a missing await as suspected... Was hard to find because there was no warning. What helped was Searching the whole solution for "private Task", "public Task" etc...

authBuilder.AddCookie(CookieAuthenticationDefaults.AuthenticationScheme, options =>
{
    options.LoginPath = "/login";
    options.Events = new CookieAuthenticationEvents
    {
        OnValidatePrincipal = OnValidatePrincipalAsync
    };
    options.ExpireTimeSpan = TimeSpan.FromDays(15);
});

private async Task OnValidatePrincipalAsync(CookieValidatePrincipalContext context) { await SecurityStampValidator.ValidatePrincipalAsync(context); return Task.FromResult(0); }

Thanks so much for the help!

jtkech commented 3 years ago

@mariojsnunes Cool, thanks, good to know that it was a non awaited task