Open halter73 opened 1 year ago
This test hasn't failed in the last 30 days.
@BrennanConroy would this have been fixed by https://github.com/dotnet/aspnetcore/pull/55738?
Unlikely, like Stephen said:
> I'm convinced this is a test-only issue that's only possible because we use `AllowSynchronousContinuations = true` on the Http2FrameWriter channel in Kestrel's in-memory tests
Failing Test(s)
Microsoft.AspNetCore.Server.Kestrel.Core.Tests.Http2ConnectionTests.OutputFlowControl_ConnectionAndRequestAborted_NoException
Error Message
Helix just reports "The Helix Work Item failed. Often this is due to a test crash. Please see the 'Artifacts' tab above for additional logs." for the InMemory.FunctionalTests--net7.0 Work Item.
This is a deadlock with two relevant stack traces collected from the hang dump using `clrstack -all`.
After a quick look at the stack traces, I'm convinced this is a test-only issue that's only possible because we use `AllowSynchronousContinuations = true` on the Http2FrameWriter channel in Kestrel's in-memory tests. I tried setting `AllowSynchronousContinuations = false` to see how many tests failed, and I saw these 11 test failures immediately. There could be more than that which would become flaky with this change.
Test failures without AllowSynchronousContinuations
```
Test | Result | Class | Duration | Error Message
Microsoft.AspNetCore.Server.Kestrel.Core.Tests.Http2ConnectionTests.StreamPool_MultipleStreamsInSequence_PooledStreamReused | Failed | Http2ConnectionTests | 4 ms | Assert.Equal() Failure Expected: 1 Actual: 0
Microsoft.AspNetCore.Server.Kestrel.Core.Tests.Http2ConnectionTests.StreamPool_SingleStream_ReturnedToPool | Failed | Http2ConnectionTests | 4 ms | Assert.Equal() Failure Expected: 1 Actual: 0
Microsoft.AspNetCore.Server.Kestrel.Core.Tests.Http2StreamTests.AbortAfterCompleteAsync_GETWithResponseBodyAndTrailers_ResetsAfterResponse | Failed | Http2StreamTests | 4 ms | Assert.True() Failure Expected: True Actual: False
Microsoft.AspNetCore.Server.Kestrel.Core.Tests.Http2StreamTests.AbortAfterCompleteAsync_POSTWithResponseBodyAndTrailers_RequestBodyThrows | Failed | Http2StreamTests | 9 ms | Assert.Equal() Failure Expected: INTERNAL_ERROR Actual: NO_ERROR
Microsoft.AspNetCore.Server.Kestrel.Core.Tests.Http2StreamTests.BodyWriterWriteAsync_OnCanceledPendingFlush_ReturnsResultWithIsCanceled | Failed | Http2StreamTests | 4 ms | Assert.Equal() Failure Expected: 6 Actual: 12
Microsoft.AspNetCore.Server.Kestrel.Core.Tests.Http2StreamTests.CompleteAsync_AfterBodyStarted_WithTrailers_SendsBodyAndTrailersWithEndStream | Failed | Http2StreamTests | 3 ms | Assert.True() Failure Expected: True Actual: False
Microsoft.AspNetCore.Server.Kestrel.Core.Tests.Http2StreamTests.CompleteAsync_AfterPipeWrite_WithTrailers_SendsBodyAndTrailersWithEndStream | Failed | Http2StreamTests | 4 ms | Assert.True() Failure Expected: True Actual: False
Microsoft.AspNetCore.Server.Kestrel.Core.Tests.Http2StreamTests.CompleteAsync_BeforeBodyStarted_WithTrailers_SendsHeadersAndTrailersWithEndStream | Failed | Http2StreamTests | 3 ms | Assert.True() Failure Expected: True Actual: False
Microsoft.AspNetCore.Server.Kestrel.Core.Tests.Http2StreamTests.ResetAfterCompleteAsync_GETWithResponseBodyAndTrailers_ResetsAfterResponse | Failed | Http2StreamTests | 5 sec | System.TimeoutException : The operation has timed out.
Microsoft.AspNetCore.Server.Kestrel.Core.Tests.Http2StreamTests.ResetAfterCompleteAsync_POSTWithResponseBodyAndTrailers_RequestBodyThrows | Failed | Http2StreamTests | 8 ms | Assert.Contains() Failure Not found: (filter expression) In value: ConcurrentQueue
```
I'm not sure, but I think this could be the root cause of other Http2Connection test timeouts like the one described in #41172. It seems possible that deadlocks happen only on background threads in the cases where we see Task timeouts instead of the entire Helix work item hanging.
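To make the role of `AllowSynchronousContinuations` concrete, here is a minimal sketch, not Kestrel code. It assumes an unbounded `System.Threading.Channels` channel and no ambient `SynchronizationContext`. The names `SyncContinuationDemo`, `ContinuationRunsUnderWriterLock`, and `gate` are hypothetical and exist only for illustration. The sketch shows that with synchronous continuations enabled, a parked reader's continuation can run inline on the writer's thread while the writer still holds its lock. That is the kind of reentrancy that could allow two locks to be taken in opposite orders.

```csharp
using System;
using System.Threading;
using System.Threading.Channels;
using System.Threading.Tasks;

public static class SyncContinuationDemo
{
    // Returns true if the reader's continuation ran on a thread that was
    // holding `gate` at that moment, i.e. inline inside the writer's lock.
    public static bool ContinuationRunsUnderWriterLock(bool allowSynchronousContinuations)
    {
        var channel = Channel.CreateUnbounded<int>(new UnboundedChannelOptions
        {
            AllowSynchronousContinuations = allowSynchronousContinuations,
        });

        var gate = new object();   // hypothetical stand-in for a writer-side lock
        var ranUnderLock = false;

        async Task ReadOneAsync()
        {
            // The channel is empty, so this await parks the reader.
            await channel.Reader.ReadAsync().ConfigureAwait(false);

            // Monitor.IsEntered is true only if the *current thread* holds the lock.
            ranUnderLock = Monitor.IsEntered(gate);
        }

        // Runs synchronously up to the await, so the reader is registered
        // with the channel before the write below happens.
        var readTask = ReadOneAsync();

        lock (gate)
        {
            // With synchronous continuations allowed, completing the parked
            // read may invoke the reader's continuation right here.
            channel.Writer.TryWrite(42);
        }

        readTask.GetAwaiter().GetResult();
        return ranUnderLock;
    }
}
```

Under those assumptions I would expect `ContinuationRunsUnderWriterLock(true)` to report that the continuation ran inside the lock, and `ContinuationRunsUnderWriterLock(false)` to report that it did not, because the continuation is queued to the thread pool instead. If the inline continuation then tried to take a second lock held by a thread waiting on `gate`, the two threads would block each other.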
Stacktrace
Stack Trace 1
```text
OS Thread Id: 0x7337
        Child SP               IP Call Site
00007EF54291E490 00007f369e13d376 [HelperMethodFrame_1OBJ: 00007ef54291e490] System.Threading.Monitor.ReliableEnter(System.Object, Boolean ByRef)
00007EF54291E5E0 00007F3626277DDB Microsoft.AspNetCore.Server.Kestrel.Core.Internal.Http2.Http2OutputProducer.Stop()
00007EF54291E610 00007F3626060F40 Microsoft.AspNetCore.Server.Kestrel.Core.Internal.Http2.Http2FrameWriter.AbortConnectionFlowControl()
00007EF54291E680 00007F3626060CD2 Microsoft.AspNetCore.Server.Kestrel.Core.Internal.Http2.Http2FrameWriter.Complete()
00007EF54291E6B0 00007F362627758D Microsoft.AspNetCore.Server.Kestrel.Core.Internal.Http2.Http2FrameWriter.Abort(Microsoft.AspNetCore.Connections.ConnectionAbortedException)
00007EF54291E6F0 00007F36262774D9 Microsoft.AspNetCore.Server.Kestrel.Core.Internal.Http2.Http2Connection.Abort(Microsoft.AspNetCore.Connections.ConnectionAbortedException)
00007EF54291E740 00007F362635F9E7 Microsoft.AspNetCore.Server.Kestrel.Core.Internal.Http2.Http2Connection.HandleReadDataRateTimeout()
00007EF54291E780 00007F3626D4075A Microsoft.AspNetCore.Server.Kestrel.Core.Tests.Http2ConnectionTests+
```
Stack Trace 2
```text
OS Thread Id: 0x559e
        Child SP               IP Call Site
00007EF53AFEF550 00007f369e13d376 [HelperMethodFrame_1OBJ: 00007ef53afef550] System.Threading.Monitor.ReliableEnter(System.Object, Boolean ByRef)
00007EF53AFEF6A0 00007F362607DCC5 Microsoft.AspNetCore.Server.Kestrel.Core.Internal.Http2.Http2FrameWriter.CheckConnectionWindow(Int64)
00007EF53AFEF6E0 00007F3625EADDDE Microsoft.AspNetCore.Server.Kestrel.Core.Internal.Http2.Http2FrameWriter+
```
Logs
Build
https://dev.azure.com/dnceng-public/public/_build/results?buildId=43548&view=results
https://dev.azure.com/dnceng-public/public/_build/results?buildId=43548&view=ms.vss-test-web.build-test-results-tab&runId=885050&resultId=122079&paneView=dotnet-dnceng.dnceng-build-release-tasks.helix-test-information-tab