Open sksk571 opened 1 year ago
@sksk571 this seems to share a root cause with issue #422. I'd suggest using a Max Pool Size of 20 or limiting the available threads in the application.
Correction: Try increasing the minimum available threads.
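For reference, raising the minimum looks roughly like the sketch below. It only uses the standard ThreadPool.GetMinThreads and ThreadPool.SetMinThreads calls; the helper name RaiseMinWorkerThreads and whatever value you pass to it are illustrative assumptions, and nothing here establishes that it resolves this particular issue.

```csharp
using System;
using System.Threading;

public static class ThreadPoolTuning
{
    // Hypothetical helper: raises the worker-thread minimum to at least
    // `desiredWorkers` and leaves the I/O completion-port minimum unchanged.
    // Returns false if the runtime rejects the values (for example, when
    // they exceed the configured maximum).
    public static bool RaiseMinWorkerThreads(int desiredWorkers)
    {
        ThreadPool.GetMinThreads(out int workers, out int completionPorts);
        if (desiredWorkers <= workers)
        {
            return true; // already at or above the requested minimum
        }
        return ThreadPool.SetMinThreads(desiredWorkers, completionPorts);
    }
}
```

It would be called once at application startup, e.g. ThreadPoolTuning.RaiseMinWorkerThreads(100), where 100 is an arbitrary placeholder to be replaced by a value measured under your own load.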
What I find strange in this scenario is why SNITCPHandle uses TryConnectParallel even when server hostname resolves into just one IP. Can synchronous Connect be used in this case?
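To make the question concrete, here is a minimal sketch of what a single-address fast path could look like. This is not SqlClient code: SingleAddressConnect and TryConnectIfSingleAddress are hypothetical names, and the sketch ignores connect timeouts, which a real driver would have to honor (Socket.Connect(IPAddress, int) takes no timeout argument).

```csharp
#nullable enable
using System;
using System.Net;
using System.Net.Sockets;

public static class SingleAddressConnect
{
    // Hypothetical helper, not the driver's implementation. Returns a
    // connected socket when `host` resolves to exactly one address, or null
    // when it resolves to several (or none) and the caller has to fall back
    // to some other strategy, such as a parallel connect.
    public static Socket? TryConnectIfSingleAddress(string host, int port)
    {
        IPAddress[] addresses = Dns.GetHostAddresses(host);
        if (addresses.Length != 1)
        {
            return null;
        }

        IPAddress address = addresses[0];
        var socket = new Socket(address.AddressFamily, SocketType.Stream, ProtocolType.Tcp);
        try
        {
            // Blocking connect on the calling thread: no Task and no
            // continuation, so nothing has to be scheduled on the thread pool.
            socket.Connect(address, port);
            return socket;
        }
        catch (SocketException)
        {
            socket.Dispose();
            throw;
        }
    }
}
```

With Server=tcp:127.0.0.1,1433 this path would be taken, since an IP literal resolves to a single address; a hostname that resolves to multiple IPs would return null and still need the parallel path.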
Your connection string is specifying MultiSubnetFailover=True yet also specifying a single IP address (Server=tcp:127.0.0.1,1433). Since there aren't multiple IPs to try, MultiSubnetFailover isn't needed. Turn it off so that the TryConnectParallel path isn't used.
This is just an example to reproduce the bug. In production we use multi subnet failover as a part of our DR strategy and the server hostname resolves into multiple IPs there.
@JRahnama increasing MinThreads didn't work in the linked issue, so why do you suggest trying it here? As I understand it, increasing MinThreads has a negative performance impact because threads are created more often.
I believe we're running into an issue similar to this and our usage of MultiSubnetFailover with a server hostname that resolves to multiple IPs matches what @sksk571 does. Seems to only happen on Linux as well.
Are there other options to resolve this issue?
Message: Unobserved task exception
Exception: System.AggregateException: A Task's exception(s) were not observed either by Waiting on the Task or accessing its Exception property. As a result, the unobserved exception was rethrown by the finalizer thread. (Connection timed out) ---> System.Net.Sockets.SocketException (110): Connection timed out
at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.ThrowException(SocketError error, CancellationToken cancellationToken)
at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.System.Threading.Tasks.Sources.IValueTaskSource.GetResult(Int16 token)
at System.Threading.Tasks.ValueTask.ValueTaskSourceAsTask.<>c.<.cctor>b__4_0(Object state)
--- End of stack trace from previous location ---
at System.Data.SqlClient.SNI.SNITCPHandle.ParallelConnectHelper(Socket socket, Task connectTask, TaskCompletionSource`1 tcs, StrongBox`1 pendingCompleteCount, StrongBox`1 lastError, List`1 sockets)
--- End of inner exception stack trace ---
Describe the bug
SqlConnection.Open gets stuck in SNITCPHandle.TryConnectParallel and times out when there are no available threads in the thread pool.
Exception message:
Stack trace:
We have an ASP.NET HTTP application running on Alpine Linux in K8S. The application makes synchronous SQL requests to the database. During peak load we sometimes experience a cascade of connection failures.
Dump files collected during the incident contain a number of threads waiting in SNITCPHandle.TryConnectParallel. This method uses sync-over-async to connect to multiple IP addresses in parallel. This, coupled with the thread pool exhaustion caused by a large number of incoming requests, may be the reason for the timeouts.
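The suspected mechanism can be illustrated without SqlClient at all. The sketch below is an assumption-laden simulation, not the driver's code: StarvationDemo, Run, and the 10 ms delay are made up for illustration. Each work item occupies a pool thread and blocks on a task whose continuation is itself scheduled on the thread pool, which is the sync-over-async shape described above.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class StarvationDemo
{
    // Queues `blockers` work items that each block a pool thread on a Task
    // (sync-over-async) and returns how many of those waits timed out.
    public static int Run(int blockers, TimeSpan timeout)
    {
        int timedOut = 0;
        using var done = new CountdownEvent(blockers);
        for (int i = 0; i < blockers; i++)
        {
            ThreadPool.QueueUserWorkItem(state =>
            {
                // Stand-in for an asynchronous connect. The continuation is
                // scheduled on the thread pool, so it cannot run until a pool
                // thread is free or the pool injects a new one.
                Task simulatedConnect = Task.Delay(10)
                    .ContinueWith(antecedent => { }, TaskScheduler.Default);

                // Blocking wait on the current pool thread.
                if (!simulatedConnect.Wait(timeout))
                {
                    Interlocked.Increment(ref timedOut);
                }
                done.Signal();
            });
        }
        done.Wait();
        return timedOut;
    }
}
```

Calling StarvationDemo.Run(200, TimeSpan.FromSeconds(1)) on a machine with few cores may report timeouts even though each simulated connect needs only about 10 ms of real work. The exact count depends on the core count, the configured minimum threads, and how quickly the runtime injects new threads, so no particular number should be expected.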
Stack trace from the dump file:
To reproduce
The following code reproduces the issue on Ubuntu 22.04 running in WSL. MultiSubnetFailover=True switches SNITCPHandle to use TryConnectParallel and triggers the bug.
Expected behavior
Synchronous SqlConnection.Open should be able to connect to SQL Server regardless of the current ThreadPool usage.
Further technical details
Microsoft.Data.SqlClient version: 5.1.1
.NET target: .NET6
SQL Server version: SQL Server 2022
Operating system: Alpine 3.18 in a Docker container
Additional context
SQL Server for the repro case was installed in Docker using the following command