dotnet / SqlClient

Microsoft.Data.SqlClient provides database connectivity to SQL Server for .NET applications.
MIT License

Manual Test CancelAsyncConnections intermittently fails in worrying ways. #1255

Open Wraith2 opened 3 years ago

Wraith2 commented 3 years ago

Describe the bug

Running the manual tests locally, I've been seeing intermittent failures on an unusual test. I first saw it on an experimental branch but have now reproduced it on a clean main branch.

I've seen this test fail, run incredibly slowly (30+ minutes), or crash with an access violation from the native SNI. This is the most likely failure mode.

To run tests I do a full artifacts clean, build and then run tests.

[xUnit.net 00:08:43.51]          ---> System.Component
  Failed Microsoft.Data.SqlClient.ManualTesting.Tests.AsyncCancelledConnectionsTest.CancelAsyncConnections [38 s]
  Error Message:
   Assert.Empty() Failure
Collection: ["System.InvalidOperationException: Internal connect"..., "Microsoft.Data.SqlClient.SqlException (0x80131904)"..., "Microsoft.Data.SqlClient.SqlException (0x80131904)"..., "Microsoft.Data.SqlClient.SqlException (0x80131904)"..., "Microsoft.Data.SqlClient.SqlException (0x80131904)"...]
  Stack Trace:
     at Microsoft.Data.SqlClient.ManualTesting.Tests.AsyncCancelledConnectionsTest.RunCancelAsyncConnections(SqlConnectionStringBuilder connectionStringBuilder) in E:\Programming\csharp7\SqlClient\src\Microsoft.Data.SqlClient\tests\ManualTests\SQL\AsyncTest\AsyncCancelledConnectionsTest.cs:line 0
   at Microsoft.Data.SqlClient.ManualTesting.Tests.AsyncCancelledConnectionsTest.CancelAsyncConnections() in E:\Programming\csharp7\SqlClient\src\Microsoft.Data.SqlClient\tests\ManualTests\SQL\AsyncTest\AsyncCancelledConnectionsTest.cs:line 32
  Standard Output Messages:
 00:00:05.0020806 True Started:98 Done:85 InFlight:13 RowsRead:197598 ResultRead:1662 PoisonedEnded:40 nonPoisonedExceptions:469 PoisonedCleanupExceptions:0 Count:4 Found:0
 00:00:10.0005771 True Started:100 Done:98 InFlight:2 RowsRead:263405 ResultRead:2215 PoisonedEnded:53 nonPoisonedExceptions:469 PoisonedCleanupExceptions:0 Count:4 Found:0
 00:00:15.0012357 True Started:100 Done:98 InFlight:2 RowsRead:263405 ResultRead:2215 PoisonedEnded:53 nonPoisonedExceptions:469 PoisonedCleanupExceptions:0 Count:4 Found:0
 00:00:20.0051765 True Started:100 Done:98 InFlight:2 RowsRead:263405 ResultRead:2215 PoisonedEnded:53 nonPoisonedExceptions:469 PoisonedCleanupExceptions:0 Count:4 Found:0
 00:00:25.0089728 True Started:100 Done:98 InFlight:2 RowsRead:263405 ResultRead:2215 PoisonedEnded:53 nonPoisonedExceptions:469 PoisonedCleanupExceptions:0 Count:4 Found:0
 00:00:30.0035748 True Started:100 Done:98 InFlight:2 RowsRead:263405 ResultRead:2215 PoisonedEnded:53 nonPoisonedExceptions:469 PoisonedCleanupExceptions:0 Count:4 Found:0
 00:00:34.9992757 True Started:100 Done:98 InFlight:2 RowsRead:263405 ResultRead:2215 PoisonedEnded:53 nonPoisonedExceptions:469 PoisonedCleanupExceptions:0 Count:4 Found:0
 00:00:38.6946916 True Started:100 Done:100 InFlight:0 RowsRead:269593 ResultRead:2267 PoisonedEnded:53 nonPoisonedExceptions:471 PoisonedCleanupExceptions:0 Count:5 Found:0
 System.InvalidOperationException: Internal connection fatal error.
    at Microsoft.Data.SqlClient.TdsParser.TryRun(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySi
 Microsoft.Data.SqlClient.SqlException (0x80131904): A transport-level error has occurred when receiving results from the server. (provider: SSL Provider, error: 0 - The specified data could not be dec
 Microsoft.Data.SqlClient.SqlException (0x80131904): A connection was successfully established with the server, but then an error occurred during the login process. (provider: TCP Provider, error: 0 -
 Microsoft.Data.SqlClient.SqlException (0x80131904): A transport-level error has occurred when receiving results from the server. (provider: TCP Provider, error: 0 - The file name is too long.)
  --->
 Microsoft.Data.SqlClient.SqlException (0x80131904): Execution Timeout Expired.  The timeout period elapsed prior to completion of the operation or the server is not responding.
  ---> System.Component

To reproduce

Run the manual test suite, or possibly just Microsoft.Data.SqlClient.ManualTesting.Tests.AsyncCancelledConnectionsTest.RunCancelAsyncConnections, repeatedly and it will eventually fail.
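
Since the failure is intermittent, a small retry loop can save manual re-running. The following is only a sketch: `repeat_until_fail` is a hypothetical helper name (not part of the repository or the dotnet CLI), and the attempt count is arbitrary. It re-runs whatever command it is given until that command exits non-zero.

```shell
# Hypothetical helper: re-run a command until it fails or the attempt
# limit is reached. Returns the failing command's exit code, or 0 if
# every attempt succeeded.
repeat_until_fail() {
    local max_attempts
    local attempt
    local status
    max_attempts="$1"
    shift
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        status=0
        "$@" || status=$?
        if [ "$status" -ne 0 ]; then
            echo "failed on attempt ${attempt} (exit code ${status})"
            return "$status"
        fi
        attempt=$((attempt + 1))
    done
    echo "no failure in ${max_attempts} attempts"
    return 0
}
```

Assuming it is run from the manual tests project directory with a configured test server, one possible invocation is `repeat_until_fail 50 dotnet test --filter "FullyQualifiedName~AsyncCancelledConnectionsTest"`. The filter matches on the test class name, so it should select the CancelAsyncConnections test shown in the stack trace above.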

Expected behavior

The test should pass reliably. Intermittent failures, especially one as serious as an access violation, are worrying for reliability.

Further technical details

Microsoft.Data.SqlClient version: (4.0.0 main)
.NET target: netcore 3.1
SQL Server version: SQL Server 2017
Operating system: Windows 10, native SNI

JRahnama commented 3 years ago

@Wraith2, since this happens intermittently, I tried but was not able to reproduce the error. I'll try again tonight and get back to you.

MichelZ commented 1 week ago

Could this be resolved by #2714?

Wraith2 commented 1 week ago

Possibly. If I've been tricked into fixing a bad test I'm going to be irritated.