Open alex-jitbit opened 4 months ago
@alex-jitbit is this happening on a regular connection, i.e. with no AAD authentication involved? Can you provide a sample repro, please?
Linux uses managed SNI, and I think the improvements were made mostly on the native side, which is Windows-only. Which change did you mean?
No, no AAD, my connection string uses explicit username/password combo
Data Source=172.0.0.123,1433;Initial Catalog=database;user id=user;pwd=PaSsWoRd;Max Pool Size=250;Encrypt=false
A simple repro would be:
using Microsoft.Data.SqlClient;
// connectionString holds the connection string shown above
using var cn = new SqlConnection(connectionString);
cn.Open();
Compile on .NET 8, run on Ubuntu 22.04 (AWS) connecting to external SQL Server (also Ubuntu 22 on AWS).
Reverting to 5.1.5 fixed the problem immediately.
P.S. I can't repro this on WSL Ubuntu connecting to a Windows-hosted MS SQL Server, so I assume the issue either is specific to connecting to a Linux-hosted SQL Server or happens only under heavy load.
@alex-jitbit I will test it today and will update you after.
I had the same problem. I downgraded to 5.1.5 and it worked again.
SQL 2019 - Windows Server - .NET 8
I was not able to repro the issue on Ubuntu 22.04 as a local server, but I will test it with a remote server. If there is any issue it should be related to https://github.com/dotnet/SqlClient/pull/1029
Update: I tested with an Azure SQL server in East US (adding more latency), but was not able to repro the issue.
Here is my test setup:
csproj:
<Project Sdk="Microsoft.NET.Sdk">
<PropertyGroup>
<OutputType>Exe</OutputType>
<TargetFramework>net8.0</TargetFramework>
<ImplicitUsings>enable</ImplicitUsings>
<Nullable>enable</Nullable>
</PropertyGroup>
<ItemGroup>
<PackageReference Include="Microsoft.Data.SqlClient" Version="5.2.0" />
</ItemGroup>
</Project>
Program.cs
using Microsoft.Data.SqlClient;
SqlConnectionStringBuilder builder = new(){
DataSource ="*******.database.windows.net",
UserID = "******",
Password = "*****",
InitialCatalog = "Northwind",
MaxPoolSize = 250
};
using SqlConnection conn = new(builder.ConnectionString);
conn.Open();
Console.WriteLine(conn.State);
I will test with a remote on premises server later today.
Also seeing the same. Downgrading to 5.2.0-preview5.24024.3 resolved the issue for me.
Additional debug info if it's helpful.
Framework: .NET 8.0.2
Runtime: linux-musl-x64
Image: Alpine Linux v3.19
Using: Microsoft.EntityFrameworkCore.SqlServer:8.0.2
Connected using an Azure SQL Failover group, via Entity Framework.
If you are on Linux/macOS and specify both port and instance name in the connection string (like server,12345\instance), that might be the source issue. There appears to have been a regression in 5.2.0 on non-Windows where it isn't ignoring the instance name when both it and the port are specified.
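To check whether this applies to you, one workaround to try is removing the instance name whenever a port is also given. The sketch below is a plain string helper for that; `DataSourceHelper` and `DropInstanceName` are hypothetical names, not part of Microsoft.Data.SqlClient, and it assumes a simple host name (no IPv6 literal).

```csharp
using System;

static class DataSourceHelper
{
    // Hypothetical helper, not part of Microsoft.Data.SqlClient.
    // Returns "host,port" when the data source contains both an instance
    // name and a port; otherwise returns the input unchanged.
    public static string DropInstanceName(string dataSource)
    {
        int comma = dataSource.IndexOf(',');
        int slash = dataSource.IndexOf('\\');
        if (comma < 0 || slash < 0)
            return dataSource; // only a port, only an instance, or neither

        string host, port;
        if (slash < comma)
        {
            // Form: host\instance,port
            host = dataSource.Substring(0, slash);
            port = dataSource.Substring(comma + 1);
        }
        else
        {
            // Form: host,port\instance
            host = dataSource.Substring(0, comma);
            port = dataSource.Substring(comma + 1, slash - comma - 1);
        }
        return host.Trim() + "," + port.Trim();
    }
}
```

If the error disappears once the Data Source is passed through such a helper, the instance-name regression is the likely cause in your setup.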
Some observations from me, hoping they help the investigation:
We started getting this problem on Alpine in a test that starts multiple threads connecting to the same database in parallel. Other tests work fine. Our connection strings do not specify an instance or a port. We tried adding some delays in the code inside the different tasks to affect timing, and then we got a different error instead of the one mentioned in this ticket:
System.InvalidOperationException: Timeout expired. The timeout period elapsed prior to obtaining a connection from the pool. This may have occurred because all pooled connections were in use and max pool size was reached.
at Microsoft.Data.ProviderBase.DbConnectionFactory.TryGetConnection(DbConnection owningConnection, TaskCompletionSource`1 retry, DbConnectionOptions userOptions, DbConnectionInternal oldConnection, DbConnectionInternal& connection)
Here's how threads are created in the test:
var tasks = schedulers.Select(s => new TaskFactory().StartNew(s.Start)).ToList();
foreach (var t in tasks) t.Wait();
Reverting back to v5.1.5 fixed this, so we are not updating to v5.2.0 until we know more.
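For anyone trying to reproduce this outside a full test suite, here is a minimal sketch of a parallel-open probe. It is not the test described above; `ParallelOpenProbe` and `CountFailuresAsync` are made-up names, and the delegate is left generic so that the harness itself needs no database.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

static class ParallelOpenProbe
{
    // Runs openOnce concurrently `workers` times and returns how many calls threw.
    public static async Task<int> CountFailuresAsync(Func<Task> openOnce, int workers)
    {
        int failures = 0;
        var tasks = Enumerable.Range(0, workers).Select(async _ =>
        {
            try
            {
                await openOnce();
            }
            catch (Exception)
            {
                Interlocked.Increment(ref failures);
            }
        }).ToArray(); // ToArray starts every attempt before any is awaited

        await Task.WhenAll(tasks);
        return failures;
    }
}

// Usage against a real server (requires the Microsoft.Data.SqlClient package
// and a connectionString of your own):
//
// int failed = await ParallelOpenProbe.CountFailuresAsync(async () =>
// {
//     await using var cn = new SqlConnection(connectionString);
//     await cn.OpenAsync();
// }, workers: 200);
// Console.WriteLine($"{failed} of 200 opens failed");
```

Running the same probe against 5.1.5 and 5.2.0 with an identical worker count should show whether the failure rate changes between versions.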
I can confirm that we too experience this error under a heavy load with multiple threads (not sure if this is the culprit)
Can you test with this package and see if the issue is resolved? Just change the extension to nupkg and it should be good for testing.
Microsoft.Data.SqlClient.6.0.0-pull.106802.zip
@JRahnama Any chance you could add it to nuget.org so our build/test system can find it?
@sturledahl this package is not officially signed and is not suitable for production use. I just wanted to confirm that the fix has resolved the issue for users before proceeding with a hotfix release.
any update on this?
Were you able to test with the sample package?
@JRahnama well no. We haven't updated to 5.2 yet because of this issue. We are currently using version 5.1.5 and running into https://github.com/dotnet/SqlClient/issues/449, but it is only happening in our production environment (with a lot of traffic) and even there only once every few weeks. We have no controlled environment to test this. Maybe @alex-jitbit has a way to reproduce it and see if the 6.0.0 version resolves it
Same issue on Windows!
Unfortunately, this bug is reproducible in production only (under high load), and frankly I'm too afraid to try beta fixes on my prod.
@alex-jitbit is it possible to test with the 5.2.0-preview2 and 5.2.0-preview5 versions to identify which change caused the issue?
We have the same issue as the people above after upgrading to 5.2.0. The issue is reproducible from both Alpine containers and a VM with Amazon Linux 2023. SQL Server is running on Windows Server, and the connection string contains a named instance and a port.
@JRahnama I've tested with different versions and here's the outcome:
Microsoft.Data.SqlClient.SqlException: A network-related or instance-specific error occurred while establishing a connection to SQL Server. The server was not found or was not accessible. Verify that the instance name is correct
and that SQL Server is configured to allow remote connections. (provider: TCP Provider, error: 26 - Error Locating Server/Instance Specified)
System.Net.Sockets.SocketException: Success
at int Microsoft.Data.SqlClient.SNI.SSRP.GetPortByInstanceName(string browserHostName, string instanceName, TimeoutTimer timeout, bool allIPsInParallel, SqlConnectionIPAddressPreference ipPreference)
at SNITCPHandle Microsoft.Data.SqlClient.SNI.SNIProxy.CreateTcpHandle(DataSource details, TimeoutTimer timeout, bool parallel, SqlConnectionIPAddressPreference ipPreference, string cachedFQDN, ref SQLDNSInfo pendingDNSInfo,
bool tlsFirst, string hostNameInCertificate, string serverCertificateFilename)
at void Microsoft.Data.SqlClient.SqlInternalConnection.OnError(SqlException exception, bool breakConnection, Action<Action> wrapCloseInAction)
at void Microsoft.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, SqlCommand command, bool callerHasConnectionLock, bool asyncClose)
at void Microsoft.Data.SqlClient.TdsParser.Connect(ServerInfo serverInfo, SqlInternalConnectionTds connHandler, TimeoutTimer timeout, SqlConnectionString connectionOptions, bool withFailover)
at void Microsoft.Data.SqlClient.SqlInternalConnectionTds.AttemptOneLogin(ServerInfo serverInfo, string newPassword, SecureString newSecurePassword, TimeoutTimer timeout, bool withFailover)
at void Microsoft.Data.SqlClient.SqlInternalConnectionTds.LoginNoFailover(ServerInfo serverInfo, string newPassword, SecureString newSecurePassword, bool redirectedUserInstance, SqlConnectionString connectionOptions,
SqlCredential credential, TimeoutTimer timeout)
at void Microsoft.Data.SqlClient.SqlInternalConnectionTds.OpenLoginEnlist(TimeoutTimer timeout, SqlConnectionString connectionOptions, SqlCredential credential, string newPassword, SecureString newSecurePassword, bool
redirectedUserInstance)
at Microsoft.Data.SqlClient.SqlInternalConnectionTds..ctor(DbConnectionPoolIdentity identity, SqlConnectionString connectionOptions, SqlCredential credential, object providerInfo, string newPassword, SecureString
newSecurePassword, bool redirectedUserInstance, SqlConnectionString userConnectionOptions, SessionData reconnectSessionData, bool applyTransientFaultHandling, string accessToken, DbConnectionPool pool,
Func<SqlAuthenticationParameters, CancellationToken, Task<SqlAuthenticationToken>> accessTokenCallback)
at DbConnectionInternal Microsoft.Data.SqlClient.SqlConnectionFactory.CreateConnection(DbConnectionOptions options, DbConnectionPoolKey poolKey, object poolGroupProviderInfo, DbConnectionPool pool, DbConnection owningConnection,
DbConnectionOptions userOptions)
at DbConnectionInternal Microsoft.Data.ProviderBase.DbConnectionFactory.CreatePooledConnection(DbConnectionPool pool, DbConnection owningObject, DbConnectionOptions options, DbConnectionPoolKey poolKey, DbConnectionOptions
userOptions)
at DbConnectionInternal Microsoft.Data.ProviderBase.DbConnectionPool.CreateObject(DbConnection owningObject, DbConnectionOptions userOptions, DbConnectionInternal oldConnection)
at DbConnectionInternal Microsoft.Data.ProviderBase.DbConnectionPool.UserCreateRequest(DbConnection owningObject, DbConnectionOptions userOptions, DbConnectionInternal oldConnection)
at bool Microsoft.Data.ProviderBase.DbConnectionPool.TryGetConnection(DbConnection owningObject, uint waitForMultipleObjectsTimeout, bool allowCreate, bool onlyOneCheckConnection, DbConnectionOptions userOptions, out
DbConnectionInternal connection)
at bool Microsoft.Data.ProviderBase.DbConnectionPool.TryGetConnection(DbConnection owningObject, TaskCompletionSource<DbConnectionInternal> retry, DbConnectionOptions userOptions, out DbConnectionInternal connection)
at bool Microsoft.Data.ProviderBase.DbConnectionFactory.TryGetConnection(DbConnection owningConnection, TaskCompletionSource<DbConnectionInternal> retry, DbConnectionOptions userOptions, DbConnectionInternal oldConnection, out
DbConnectionInternal connection)
at bool Microsoft.Data.ProviderBase.DbConnectionInternal.TryOpenConnectionInternal(DbConnection outerConnection, DbConnectionFactory connectionFactory, TaskCompletionSource<DbConnectionInternal> retry, DbConnectionOptions
userOptions)
at bool Microsoft.Data.ProviderBase.DbConnectionClosed.TryOpenConnection(DbConnection outerConnection, DbConnectionFactory connectionFactory, TaskCompletionSource<DbConnectionInternal> retry, DbConnectionOptions userOptions)
at bool Microsoft.Data.SqlClient.SqlConnection.TryOpen(TaskCompletionSource<DbConnectionInternal> retry, SqlConnectionOverrides overrides)
at void Microsoft.Data.SqlClient.SqlConnection.Open(SqlConnectionOverrides overrides)
at void Microsoft.Data.SqlClient.SqlConnection.Open()
at void Microsoft.EntityFrameworkCore.SqlServer.Storage.Internal.SqlServerConnection.OpenDbConnection(bool errorsExpected)
at void Microsoft.EntityFrameworkCore.Storage.RelationalConnection.OpenInternal(bool errorsExpected)
at bool Microsoft.EntityFrameworkCore.Storage.RelationalConnection.Open(bool errorsExpected)
at bool Microsoft.EntityFrameworkCore.RelationalDatabaseFacadeExtensions.<>c.<OpenConnection>b__22_0(DatabaseFacade database)
at TResult Microsoft.EntityFrameworkCore.ExecutionStrategyExtensions.<>c__DisplayClass12_0`2.<Execute>b__0(DbContext _, TState s)
at TResult Microsoft.EntityFrameworkCore.SqlServer.Storage.Internal.SqlServerExecutionStrategy.Execute<TState,TResult>(TState state, Func<DbContext, TState, TResult> operation, Func<DbContext, TState, ExecutionResult<TResult>>
verifySucceeded)
at TResult Microsoft.EntityFrameworkCore.ExecutionStrategyExtensions.Execute<TState,TResult>(IExecutionStrategy strategy, TState state, Func<TState, TResult> operation, Func<TState, ExecutionResult<TResult>> verifySucceeded)
at void Microsoft.EntityFrameworkCore.RelationalDatabaseFacadeExtensions.OpenConnection(DatabaseFacade databaseFacade)
at int TestConnectionCommand.Execute(CommandContext context)
@ABAG603 The fix is merged into the main branch by PR #2395. Hotfix release v5.2.1 is planned, but the date is still TBD.
Closing the issue, as the fix is merged and will be available in the next hotfix release.
Unfortunately, I couldn't wait for the release, but I found that another cause of this error can be the compatibility level of your SQL Server database. For example, a LINQ query with a collection filter like query.Where(p => new List<string> { "XX", "YY" }.Contains(p.MyCode))
resulted in SQL like:
SELECT [t].[Id]
WHERE [t].[MyCode] IN (
SELECT [c].[value]
FROM OPENJSON(@__codes_0) WITH ([value] nvarchar(50) '$') AS [c]
)
ORDER BY [t].[Id]
Then I had to raise my Database Compatibility level to "150" in order to run the "OPENJSON"-function. But I assume it should also work under level "130".
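If raising the database compatibility level is not an option, EF Core 8 can instead be told which level to target, so that it avoids OPENJSON when translating `Contains`. The sketch below assumes the EF Core 8 SQL Server provider and its `UseCompatibilityLevel` option; `AppDbContext` and the connection string are placeholders.

```csharp
using Microsoft.EntityFrameworkCore;

public class AppDbContext : DbContext
{
    protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
    {
        optionsBuilder.UseSqlServer(
            "<connection string>", // placeholder, substitute your own
            // Assumption: the database runs below compatibility level 130,
            // so EF Core should not emit OPENJSON for collection filters.
            sqlOptions => sqlOptions.UseCompatibilityLevel(120));
    }
}
```

With a level below 130 configured, the collection filter should be translated without OPENJSON, which is worth verifying in the generated SQL before relying on it.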
@JRahnama any update on a release date for 5.2.1 yet? It's been a couple of weeks now and still no hotfix
NOT FIXED attn @JRahnama
The issue is not fixed in 5.2.1
I'm getting the same error under high load:
Microsoft.Data.SqlClient.SqlException (0x80131904): A network-related or instance-specific error occurred while establishing a connection to SQL Server. The server was not found or was not accessible. Verify that the instance name is correct and that SQL Server is configured to allow remote connections. (provider: TCP Provider, error: 35 - An internal exception was caught)
---> System.TimeoutException: The socket couldn't connect during the expected 14999 remaining time.
at Microsoft.Data.SqlClient.SNI.SNITCPHandle.Connect(String serverName, Int32 port, TimeoutTimer timeout, SqlConnectionIPAddressPreference ipPreference, String cachedFQDN, SQLDNSInfo& pendingDNSInfo)
at Microsoft.Data.SqlClient.SNI.SNITCPHandle..ctor(String serverName, Int32 port, TimeoutTimer timeout, Boolean parallel, SqlConnectionIPAddressPreference ipPreference, String cachedFQDN, SQLDNSInfo& pendingDNSInfo, Boolean tlsFirst, String hostNameInCertificate, String serverCertificateFilename)
at Microsoft.Data.ProviderBase.DbConnectionPool.TryGetConnection(DbConnection owningObject, UInt32 waitForMultipleObjectsTimeout, Boolean allowCreate, Boolean onlyOneCheckConnection, DbConnectionOptions userOptions, DbConnectionInternal& connection)
at Microsoft.Data.ProviderBase.DbConnectionFactory.TryGetConnection(DbConnection owningConnection, TaskCompletionSource`1 retry, DbConnectionOptions userOptions, DbConnectionInternal oldConnection, DbConnectionInternal& connection)
at Microsoft.Data.ProviderBase.DbConnectionInternal.TryOpenConnectionInternal(DbConnection outerConnection, DbConnectionFactory connectionFactory, TaskCompletionSource`1 retry, DbConnectionOptions userOptions)
at Microsoft.Data.SqlClient.SqlConnection.TryOpen(TaskCompletionSource`1 retry, SqlConnectionOverrides overrides)
at Microsoft.Data.SqlClient.SqlConnection.Open(SqlConnectionOverrides overrides)
Reverting to 5.1.5 fixes the issue
Please reopen the issue
@alex-jitbit Can you confirm that the issue is not happening with 5.2.0-preview5.24024.3? I saw that some other users have confirmed it, and that was regarding a regression we addressed in 5.2.1. It seems your case is a bit different.
Also we are going to need a simple repro application for further investigation.
@alex-jitbit Does increasing the connect timeout in your connection string solve the issue? The reason I ask is that #2098 in 5.2-preview improved respecting of connection timeout during connect on the exact path your exception is occurring and the message indicates the default timeout of 15 seconds has elapsed when the exception is thrown. I'm wondering if 5.1 was simply taking longer than the connect timeout to connect under load but succeeding anyway.
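A minimal sketch of that experiment, assuming the Microsoft.Data.SqlClient package is referenced: `ConnectionStrings` and `WithConnectTimeout` are hypothetical names, and 60 seconds is an arbitrary value chosen only to be well above the 15-second default.

```csharp
using System;
using Microsoft.Data.SqlClient;

static class ConnectionStrings
{
    // Returns a copy of the connection string with Connect Timeout overridden.
    public static string WithConnectTimeout(string connectionString, int seconds)
    {
        var builder = new SqlConnectionStringBuilder(connectionString)
        {
            ConnectTimeout = seconds // in seconds; the default is 15
        };
        return builder.ConnectionString;
    }
}

// Usage (connectionString is your existing connection string):
//
// using var cn = new SqlConnection(
//     ConnectionStrings.WithConnectTimeout(connectionString, 60));
// cn.Open();
```

If the errors stop with the longer timeout, that would support the theory that 5.1 was exceeding the timeout under load but succeeding anyway.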
@JRahnama repro here: https://gist.github.com/alex-jitbit/1eca9a1f014e036691bdc35cd852c726. The bug is even reproducible when running on Windows under WSL2; however, the error is slightly different in that case (see the repro description).
I can confirm that this is indeed broken in 5.2.1 as well, and actually worse than in 5.2.0. This is trivial to reproduce: basically do exactly what @alex-jitbit did and open a bunch of connections to simulate what a real application experiencing traffic would do.
In 5.2.0, SqlClient crashes immediately with that same error. With 5.2.1, there seems to be some kind of timeout mechanism that makes the crashes take forever and effectively freezes up the whole application.
Stack trace seems to be the same:
Microsoft.Data.SqlClient.SqlException (0x80131904): A network-related or instance-specific error occurred while establishing a connection to SQL Server. The server was not found or was not accessible. Verify that the instance name is correct and that SQL Server is configured to allow remote connections. (provider: TCP Provider, error: 35 - An internal exception was caught)
---> System.TimeoutException: The socket couldn't connect during the expected 14990 remaining time.
at Microsoft.Data.SqlClient.SNI.SNITCPHandle.Connect(String serverName, Int32 port, TimeoutTimer timeout, SqlConnectionIPAddressPreference ipPreference, String cachedFQDN, SQLDNSInfo& pendingDNSInfo)
at Microsoft.Data.SqlClient.SNI.SNITCPHandle..ctor(String serverName, Int32 port, TimeoutTimer timeout, Boolean parallel, SqlConnectionIPAddressPreference ipPreference, String cachedFQDN, SQLDNSInfo& pendingDNSInfo, Boolean tlsFirst, String hostNameInCertificate, String serverCertificateFilename)
at Microsoft.Data.ProviderBase.DbConnectionPool.TryGetConnection(DbConnection owningObject, UInt32 waitForMultipleObjectsTimeout, Boolean allowCreate, Boolean onlyOneCheckConnection, DbConnectionOptions userOptions, DbConnectionInternal& connection)
at Microsoft.Data.ProviderBase.DbConnectionFactory.TryGetConnection(DbConnection owningConnection, TaskCompletionSource`1 retry, DbConnectionOptions userOptions, DbConnectionInternal oldConnection, DbConnectionInternal& connection)
at Microsoft.Data.ProviderBase.DbConnectionInternal.TryOpenConnectionInternal(DbConnection outerConnection, DbConnectionFactory connectionFactory, TaskCompletionSource`1 retry, DbConnectionOptions userOptions)
at Microsoft.Data.SqlClient.SqlConnection.TryOpen(TaskCompletionSource`1 retry, SqlConnectionOverrides overrides)
at Microsoft.Data.SqlClient.SqlConnection.InternalOpenAsync(CancellationToken cancellationToken)
After upgrading from 5.1.5 to 5.2.0 on Linux (Ubuntu) with .NET 8, I'm getting thousands of errors:
Some requests work fine, but about 50% throw this error. Reverting back to 5.1.5 solves the problem.
Further technical details
Microsoft.Data.SqlClient version: 5.2
.NET target: NET 8
SQL Server version: SQL 2017 on Linux
Operating system: Ubuntu 22