dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
15.12k stars 4.7k forks source link

[Regression][Perf][Linux] TransactionScope and Connection pool problems on Linux #25263

Closed Ghyath-Serhal closed 4 years ago

Ghyath-Serhal commented 6 years ago

I have a business application that uses TransactionScope. The application runs on Windows and we are migrating it to dotnet core in order to work Linux.

I have 2 problems on Linux environment: 1- TransactionScope is not supported 2- The timeout period elapsed prior to obtaining a connection from the pool. This may have occurred because all pooled connections were in use and max pool size was reached.

We tried to use dotnet core 2.0 and system.Data.SqlClient Version 4.5.0-preview1-26216-02. The Connection pool problem is solved but the TransactionScope problem is not solved. (MSDTC problems, this platform does not support Distributed Transaction)

We tried to use dotnet core 2.1 and System.Data.SqlClient Version 4.5.0-preview2-26228-08. The TransactionScope is solved but the connection pool problem is not solved.

Thank you for your support.

jimcarley commented 6 years ago

Ghyath-Serhal,

When you say "TransactionScope is not supported", you are referring to the fact that SqlClient does not utilize Transaction.Current, correct?

Also, I am not sure what you mean by the "connection pool problem". When you say that System.Data.SqlClient Version 4.5.0-preview1-26216-02 solved the connection pool problem, do you mean that you are no longer seeing timeouts, even when the all the pooled connections were in use and the max pool size was reached? Or are pooled connections being "released" when they weren't before?

Both of these issues are really with System.Data, not System.Transactions. I have added the tag for area-System.Data.SqlClient.

Jim

saurabh500 commented 6 years ago

What is the connection pool problem? Promotion to MSDTC is not supported by Sys.Transactions, so SqlClient cannot promote either.

Can you post a repro for the connection pool problem?

jimcarley commented 6 years ago

It is definitely true that transaction promotion to MSDTC is not supported in .NET core. However, it is my understanding that there was a time where SqlClient did not utilize Transaction.Current, leading to the "TransactionScope is not supported" issue. I made the assumption that this is what was being referenced because Ghyath-Serhal indicated that "dotnet core 2.0 and system.Data.SqlClient Version 4.5.0-preview1-26216-02" solved the TransactionScope problem.

If the "TransactionScope problem" really is that distributed/promoted transactions are not supported, then no version is yet available to solve that problem.

atiato commented 6 years ago

Hi, I will support as well in this as well: Transaction scope problem (As per my understanding it is not supported but the version -dotnet core 2.1 and System.Data.SqlClient Version 4.5.0-preview2-26228-08 don't show below exception).

Time: 10:17:38.6822967 ------- Message: MSDTC on server 'GHAYATHSERHAL' is unavailable. ------- Inner Exception: ---------- Stack Trace: at System.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection, Action1 wrapCloseInAction) at System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose) at System.Data.SqlClient.TdsParser.TryRun(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj, Boolean& dataReady) at System.Data.SqlClient.SqlDataReader.TryConsumeMetaData() at System.Data.SqlClient.SqlDataReader.get_MetaData() at System.Data.SqlClient.TdsParser.TdsExecuteTransactionManagerRequest(Byte[] buffer, TransactionManagerRequestType request, String transactionName, TransactionManagerIsolationLevel isoLevel, Int32 timeout, SqlInternalTransaction transaction, TdsParserStateObject stateObj, Boolean isDelegateControlRequest) at System.Data.SqlClient.TdsParser.GetDTCAddress(Int32 timeout, TdsParserStateObject stateObj) at System.Data.SqlClient.SqlInternalConnectionTds.GetDTCAddress() at System.Data.SqlClient.SqlInternalConnection.EnlistNonNull(Transaction tx) at System.Data.ProviderBase.DbConnectionPool.PrepareConnection(DbConnection owningObject, DbConnectionInternal obj, Transaction transaction) at System.Data.ProviderBase.DbConnectionPool.TryGetConnection(DbConnection owningObject, UInt32 waitForMultipleObjectsTimeout, Boolean allowCreate, Boolean onlyOneCheckConnection, DbConnectionOptions userOptions, DbConnectionInternal& connection) at System.Data.ProviderBase.DbConnectionPool.TryGetConnection(DbConnection owningObject, TaskCompletionSource1 retry, DbConnectionOptions userOptions, DbConnectionInternal& connection) at System.Data.ProviderBase.DbConnectionFactory.TryGetConnection(DbConnection owningConnection, TaskCompletionSource`1 retry, DbConnectionOptions userOptions, DbConnectionInternal oldConnection, DbConnectionInternal& connection)

for pooling we have WebAPI that open and close connection to MSSQL , then we execute performance testing on that WebAPI while monitoring the DB , it shows that it opens only one connection and raise the following error after twenty hit to WebAPI:

Message: Timeout expired. The timeout period elapsed prior to obtaining a connection from the pool. This may have occurred because all pooled connections we re in use and max pool size was reached. ------- Inner Exception: ---------- Stack Trace: at System.Data.ProviderBase.DbConnectionFactory.TryGetConnection(DbConnection owningConnection, TaskCompletionSource1 retry, DbConnectionOptions userOptions, DbConnectionInternal oldConnection, DbConnectionInternal& connection) at System.Data.ProviderBase.DbConnectionInternal.TryOpenConnectionInternal(DbConnection outerConnection, DbConnectionFactory connectionFactory, TaskCompletionSource1 retry, DbConnection Options userOptions) at System.Data.SqlClient.SqlConnection.TryOpen(TaskCompletionSource`1 retry)

Note that the correct behavior of the SQL Client is to open connection per instance of the Web API . the version of the SQL Client 4.5.0-preview1-26216-02 is working fine and we didn't see that pooling exception and it opens a connection per instance of the Application.

thanks,

saurabh500 commented 6 years ago

SqlClient support TransactionScope in the 4.5.0-preview* versions. It was not supported in 4.4.0 which came out the same time as .Net Core 2.0

@atiato

As per my understanding it is not supported but the version -dotnet core 2.1 and System.Data.SqlClient Version 4.5.0-preview2-26228-08 don't show below exception)

TransactionScope for SqlClient is supported, but it cannot transaction promotion to MSDTC. Are you opening multiple connections in the same transaction scope in the repro that you have?

From what I understand you are not facing any problems with connection pool and while using transaction scope you are getting an exception related to MSDTC promotion. Is this correct?

atiato commented 6 years ago

@saurabh500 When we are using dotnet core 2.1 + System.Data.SqlClient Version 4.5.0-preview2-26228-08 , the transaction scope is fine but we are facing the pooling issue after performing a performance test.

while dotnet core 2.0 + SQL Client 4.5.0-preview1-26216-02 raise the MSDTC issue while pooling is working fine we didn't see above pooling exception.

all using same code.

if we are using same connection string to same DB in many places in the code while using transaction scope , do the SQL Client and Dotnet Core escalate it to MSDTC ?

saurabh500 commented 6 years ago

@atiato Can you give me a repro for Connection Pool issue? I would like to understand this better.

if we are using same connection string to same DB in many places in the code while using transaction scope , do the SQL Client and Dotnet Core escalate it to MSDTC ?

If you have multiple connections being opened to the same database under the same transaction scope, that is considered a promotion and will need to us MSDTC and hence cause a failure in .Net Core. Refer: https://docs.microsoft.com/en-us/dotnet/framework/data/adonet/system-transactions-integration-with-sql-server

Snippet

The benefit of using promotable transactions is that when a connection is opened by using an active TransactionScope transaction, and no other connections are opened, the transaction commits as a lightweight transaction, instead of incurring the additional overhead of a full distributed transaction.
atiato commented 6 years ago

when using

image

I got

image

the following code is used

    string connectionString = "Password=sasql1$$;Persist Security Info=True;User ID=sa;Initial Catalog=mea;Data Source=tcp:192.168.0.6";

    public void GetData()
    {
        TransactionOptions transOption = new TransactionOptions();
        transOption.IsolationLevel = System.Transactions.IsolationLevel.ReadUncommitted;

        using (TransactionScope scope = new TransactionScope(TransactionScopeOption.Required, transOption))
        {
            try
            {
                ReadData();
                InsertData();
                ReadData();
            }
            catch (Exception ex)
            {
                FileWriter.LogException(ex);
                scope.Dispose();
            }
            scope.Complete();
        }
    }

but when using :

image

we got :

image

saurabh500 commented 6 years ago

@atiato I am more interested in knowing what is going on inside

                ReadData();
                InsertData();
                ReadData();
saurabh500 commented 6 years ago

Basically I want to see the code from System.Data.SqlClient that is being executed.

atiato commented 6 years ago
public void ReadData()
{
    try
    {
        List<EmployeeWorkHour> lst = new List<EmployeeWorkHour>();

        using (SqlConnection conn = new SqlConnection(connectionString))
        {
            using (SqlCommand cmd = new SqlCommand("SELECT TOP 50 * FROM dbo.EmployeeWorkHour WHERE PersonID = '81d8e09f-ce98-45fa-a311-ca7d7deccb7d'", conn))
            {
                conn.Open();
                using (SqlDataReader reader = cmd.ExecuteReader())
                {
                    while (reader.Read())
                    {
                        EmployeeWorkHour employeeWorkHour = new EmployeeWorkHour();
                        employeeWorkHour.EmployeeWorkHourID = reader["EmployeeWorkHourID"].ToString();
                        employeeWorkHour.PersonID = reader["PersonID"].ToString();
                        employeeWorkHour.AttendanceTypeID = reader["AttendanceTypeID"].ToString();

                        lst.Add(employeeWorkHour);
                    }

                    FileWriter.WriteToFile("Success Read...");
                }
            }
            conn.Close();
        }
    }
    catch (Exception ex)
    {
        FileWriter.LogException(ex);
    }
}

public void InsertData()
{
    try
    {
        using (SqlConnection conn = new SqlConnection(connectionString))
        {
            using (SqlCommand cmd = new SqlCommand("insert into AccessGroup values(NEWID(), 'TestAccessGroup')", conn))
            {
                conn.Open();
                cmd.ExecuteNonQuery();
                conn.Close();

                FileWriter.WriteToFile("Read success...");
            }
        }
    }
    catch (Exception ex)
    {
        FileWriter.LogException(ex);
    }
}
saurabh500 commented 6 years ago

Do you have any Connection Pool related connection string parameters? Like Min Pool Size, Max Pool Size ? Can you provide your connection string without the server name, uid and pwd?

atiato commented 6 years ago

string connectionString = "Password=XXXXXXX;Persist Security Info=True;User ID=XXXX;Initial Catalog=mea;Data Source=tcp:XXXXXXXXX";

atiato commented 6 years ago

after 30 hits it raises the pool issue while SQL Client 4.5.0-preview1-26216-02 has no issue in that.

saurabh500 commented 6 years ago

Thanks for all the information. I will investigate the issue and get back to you.

atiato commented 6 years ago

Hi saurabh500 , were you able to reproduce the case ?

keeratsingh commented 6 years ago

Hi @atiato, I have been trying to repro the issue and I am able to hit the MSDTC issue with 4.5.0-preview1-26216-02. As expected MSDTC issue is not seen with 4.5.0-preview2-26228-08 as mentioned above.

However, I am not able to recreate the Connection Pooling issue. Could you kindly provide a github repo with the steps to repro the issue ? Given below is the link to the adhoc application I am using to repro the issue. adhoc application

Regards, Keerat

atiato commented 6 years ago

Hi keeratsingh ,

use the following github to reproduce .

https://github.com/atiato/ola/tree/master/LinuxPerfomanceTest

make sure you use the correct SQL Client 4.5.0-preview2-26228-08 that causes the pooling issue :

image

Try to hit the Web API (LinuxPerformaceTest from github) with more than 40 hits asynch and pooling issue will appear .

it must create SQL connection for every instance of the application otherwise will fail with pooling issue .

thanks for your support, Omar Atia

keeratsingh commented 6 years ago

Thank you Omar for the repo, I will investigate further and get back to you.

atiato commented 6 years ago

Hi keeratsingh ! were you able to simulate pooling issue ?

keeratsingh commented 6 years ago

Hi Omar,

Thank you for providing the repo, given below are my observations and steps I took to try to simulate the issue.

To sum up I am unable to simulate the Connection Pooling issue after 20/30/40 calls during the first attempt after the LinuxPerformanceTest app is ran. Do you see the pooling issue on you first attempt or is it usually after a couple of runs ? If latter, then this might be related to dotnet/SqlClient#18 The pool reaches its max connection limit quickly because of async opens and parallel execution. However with Connection.Close() or Dispose() the connections are not making their way back into the pool quickly enough. As a result the connection pool hits a limit of max connections. Further investigation needed to understand what in Close() is preventing the connections to get back on the pool fast enough.

Since, I am not sure how you are calling the API in your environment, I would need more concrete repro steps to investigate and fix the issue.

Regards, Keerat

atiato commented 6 years ago

Hi keeratsingh,

on the following configuration 👍

dotnetcore 2.0 and

image image

it is working fine no pooling issue and transaction is ok.

👍

but if we use dotnetcore2.1 and image

the pooling issue appears again

looks like u used netcoreapp2.0 with sql Client Version="4.5.0-preview2-26228-08" that's why it is working with you ;)

Try to use it with netcoreapp2.1 with sqlClient Version="4.5.0-preview2-26228-08" and you will face pooling issue after 30 or 40 API hit.

Thanks, Omar Atia

keeratsingh commented 6 years ago

Hi Omar,

Thank you for the specifics, I was able to repro the issue consistently with netcoreapp2.1 on Linux, I will investigate further and get back to you with updates/questions.

Regards, Keerat

keeratsingh commented 6 years ago

Hi Omar,

While we are still investigating as to why this issue occurs in netcoreapp 2.1 and not in 2.0, could you kindly try setting these ENV variables and see if it resolves the issue for you. Kindly set the ENV variables from the terminal that the aspnetcoreapp(LinuxPerformanceTest) is run from.

export COMPlus_ThreadPool_ForceMinWorkerThreads=200; 
export COMPlus_ThreadPool_ForceMaxWorkerThreads=10000;

Regards, Keerat

keeratsingh commented 6 years ago

Hi @karelz ,

As already known the issue is seen on Linux with netcoreapp2.1 only and not with netcoreapp2.0 keeping the SqlClient same in both the cases. I added some traces in SqlClient to see the difference between the behavior for netcoreapp2.0 and netcoreapp2.1. As per my investigation this seems to be an sslstream regression, given below is the specific line of code, AuthenticateAsClient method call inside the EnableSsl method where the delay occurs in SqlClient. https://github.com/dotnet/corefx/blob/master/src/System.Data.SqlClient/src/System/Data/SqlClient/SNI/SNITcpHandle.cs#L359

I have attached the results for both netcoreapp2.0 and 2.1, as seen in the logs attached the EnableSsl method takes way longer on netcoreapp2.1. The time taken by EnableSsl on netcoreapp2.0 is between 40-80ms, however this time taken by EnableSsl rises 10x to values such as 4000-5000ms for netcoreapp2.1, this delay is causing the connections to timeout. Also seen in the logs attached the max pool size(100) is not being reached, hence it is not a case of connection pool being full.

Could you kindly ping the team responsible for the same.

Regards, Keerat

Profiling_netcoreapp21.log Profiling_netcoreapp20.log

keeratsingh commented 6 years ago

Hi @atiato Could you kindly try the following workaround and see if it resolves your issue ?

Regards, Keerat

karelz commented 6 years ago

@keeratsingh can you isolate the SslStream calls into standalone repro? That would simplify investigation greatly.

atiato commented 6 years ago

Hi @keeratsingh

Same issue .

Message: Timeout expired. The timeout period elapsed prior to obtaining a connection from the pool. This may have occurred because all pooled connections were in use and max pool size was reached. ------- Inner Exception: ---------- Stack Trace: at System.Data.ProviderBase.DbConnectionFactory.TryGetConnection(DbConnection owningConnection, TaskCompletionSource1 retry, DbConnectionOptions userOptions, DbConnectionInternal oldConnection, DbConnectionInternal& connection) at System.Data.ProviderBase.DbConnectionInternal.TryOpenConnectionInternal(DbConnection outerConnection, DbConnectionFactory connectionFactory, TaskCompletionSource1 retry, DbConnectionOptions userOptions)

keeratsingh commented 6 years ago

Hi @atiato ,

Which of the two suggestions did you try ? Did you try setting the thread pools in the aspnetcoreapp?

Regards, Keerat

atiato commented 6 years ago

it doesn't work when I did below image

but when added

export COMPlus_ThreadPool_ForceMinWorkerThreads=200; export COMPlus_ThreadPool_ForceMaxWorkerThreads=10000;

it worked.

keeratsingh commented 6 years ago

Moving issue out of 2.1.0, explanation below:

Tomas(SslStream) was fully booked by the time we asked for his help, so he didn’t have time to look at it until Friday. It is now his top priority.

karelz commented 6 years ago

Given the impact, I don't think we can punt it post 2.1 just yet. My comment about @wfurt meant that the ask for help came too late for it to be schedule before ZBB. We should IMO keep it in 2.1 and make an honest attempt to address it. It is a bad regression, affecting functionality at all.

wfurt commented 6 years ago

yes, I'm still on it.

stephentoub commented 6 years ago

It looks like there are two issues here:

  1. https://github.com/dotnet/corefx/issues/29029
  2. SqlClient is for some reason issuing synchronous calls on a Socket on which asynchronous calls were previously issued. That combined with (1) is very bad, but even without (1), that still makes the synchronous operations much more expensive. It should either use all sync operations or all async operations; even sync and then async is ok from this perspective, but async and then sync is slower.
saurabh500 commented 6 years ago

Re

SqlClient is for some reason issuing synchronous calls on a Socket on which asynchronous calls were previously issued. That combined with (1) is very bad, but even without (1), that still makes the synchronous operations much more expensive. It should either use all sync operations or all async operations; even sync and then async is ok from this perspective, but async and then sync is slower.

@stephentoub do you have the call stack showing this behavior? I can expect this to happen in a few cases because of lack of better APIs.

a. If a connection was opened and async operations are done on the connection, then the connection was Closed or Disposed. In Close() or Dispose() socket operations are sync where the client tries to see if there is any data on the network and tries to read that data. This should have very low impact because if the data has been read off the wire, then there are hardly any cases where Close()/Dispose() will cause a network read.

b. In case a connection was opened and used asynchronously, and it goes back to the connection pool after close and then reused to do sync operations.

If you have any call stacks that you saved during this investigation where you saw a switch from Async to Sync, that would greatly help. Also what repro have you been using? Is it the one on sqlubuntu1604 VM? We did a sweep of switch from async to sync in SqlClient and did make some improvements in those areas in 2.1. If there is something that slipped then it would be great to know about them and fix them.

stephentoub commented 6 years ago

@wfurt can likely provide more detail, but here's a stack trace he shared with me:

System.Private.CoreLib.dll!System.Threading.ManualResetEventSlim.Wait(int millisecondsTimeout, System.Threading.CancellationToken cancellationToken)               C#
System.Private.CoreLib.dll!System.Threading.ManualResetEventSlim.Wait(int millisecondsTimeout)         C#
System.Net.Sockets.dll!System.Net.Sockets.SocketAsyncContext.PerformSyncOperation<System.Net.Sockets.SocketAsyncContext.ReadOperation>(ref System.Net.Sockets.SocketAsyncContext.OperationQueue<System.Net.Sockets.SocketAsyncContext.ReadOperation> queue, System.Net.Sockets.SocketAsyncContext.ReadOperation operation, int timeout, int observedSequenceNumber) Line 1125    C#
System.Net.Sockets.dll!System.Net.Sockets.SocketAsyncContext.ReceiveFrom(System.Memory<byte> buffer, ref System.Net.Sockets.SocketFlags flags, byte[] socketAddress, ref int socketAddressLen, int timeout, out int bytesReceived) Line 1309              C#
System.Net.Sockets.dll!System.Net.Sockets.SocketAsyncContext.Receive(System.Memory<byte> buffer, ref System.Net.Sockets.SocketFlags flags, int timeout, out int bytesReceived) Line 1272          C#
System.Net.Sockets.dll!System.Net.Sockets.SocketPal.Receive(System.Net.Sockets.SafeCloseSocket handle, byte[] buffer, int offset, int count, System.Net.Sockets.SocketFlags socketFlags, out int bytesTransferred) Line 980                C#
System.Net.Sockets.dll!System.Net.Sockets.Socket.Receive(byte[] buffer, int offset, int size, System.Net.Sockets.SocketFlags socketFlags, out System.Net.Sockets.SocketError errorCode) Line 1433   C#
System.Net.Sockets.dll!System.Net.Sockets.Socket.Receive(byte[] buffer, int offset, int size, System.Net.Sockets.SocketFlags socketFlags) Line 1399   C#
System.Net.Sockets.dll!System.Net.Sockets.NetworkStream.Read(byte[] buffer, int offset, int size) Line 295                C#
System.Data.SqlClient.dll!System.Data.SqlClient.SNI.SNIPacket.ReadFromStream(System.IO.Stream stream) Line 298                C#
System.Data.SqlClient.dll!System.Data.SqlClient.SNI.SNITCPHandle.Receive(out System.Data.SqlClient.SNI.SNIPacket packet, int timeoutInMilliseconds) Line 490 C#
System.Data.SqlClient.dll!System.Data.SqlClient.SNI.SNIProxy.ReadSyncOverAsync(System.Data.SqlClient.SNI.SNIHandle handle, out System.Data.SqlClient.SNI.SNIPacket packet, int timeout) Line 216     C#
System.Data.SqlClient.dll!System.Data.SqlClient.SNI.TdsParserStateObjectManaged.ReadSyncOverAsync(int timeoutRemaining, out uint error) Line 136           C#
System.Data.SqlClient.dll!System.Data.SqlClient.TdsParserStateObject.ReadSniSyncOverAsync() Line 2082                C#
System.Data.SqlClient.dll!System.Data.SqlClient.TdsParserStateObject.TryReadNetworkPacket() Line 2038                C#
System.Data.SqlClient.dll!System.Data.SqlClient.TdsParser.ConsumePreLoginHandshake(bool encrypt, bool trustServerCert, bool integratedSecurity, out bool marsCapable) Line 660               C#
System.Data.SqlClient.dll!System.Data.SqlClient.TdsParser.Connect(System.Data.SqlClient.ServerInfo serverInfo, System.Data.SqlClient.SqlInternalConnectionTds connHandler, bool ignoreSniOpenTimeout, long timerExpire, bool encrypt, bool trustServerCert, bool integratedSecurity, bool withFailover) Line 369         C#
System.Data.SqlClient.dll!System.Data.SqlClient.SqlInternalConnectionTds.AttemptOneLogin(System.Data.SqlClient.ServerInfo serverInfo, string newPassword, System.Security.SecureString newSecurePassword, bool ignoreSniOpenTimeout, System.Data.ProviderBase.TimeoutTimer timeout, bool withFailover) Line 1615   C#
System.Data.SqlClient.dll!System.Data.SqlClient.SqlInternalConnectionTds.LoginNoFailover(System.Data.SqlClient.ServerInfo serverInfo, string newPassword, System.Security.SecureString newSecurePassword, bool redirectedUserInstance, System.Data.SqlClient.SqlConnectionString connectionOptions, System.Data.SqlClient.SqlCredential credential, System.Data.ProviderBase.TimeoutTimer timeout) Line 1304       C#
System.Data.SqlClient.dll!System.Data.SqlClient.SqlInternalConnectionTds.OpenLoginEnlist(System.Data.ProviderBase.TimeoutTimer timeout, System.Data.SqlClient.SqlConnectionString connectionOptions, System.Data.SqlClient.SqlCredential credential, string newPassword, System.Security.SecureString newSecurePassword, bool redirectedUserInstance) Line 1193       C#
System.Data.SqlClient.dll!System.Data.SqlClient.SqlInternalConnectionTds.SqlInternalConnectionTds(System.Data.ProviderBase.DbConnectionPoolIdentity identity, System.Data.SqlClient.SqlConnectionString connectionOptions, System.Data.SqlClient.SqlCredential credential, object providerInfo, string newPassword, System.Security.SecureString newSecurePassword, bool redirectedUserInstance, System.Data.SqlClient.SqlConnectionString userConnectionOptions, System.Data.SqlClient.SessionData reconnectSessionData, bool applyTransientFaultHandling) Line 369     C#
System.Data.SqlClient.dll!System.Data.SqlClient.SqlConnectionFactory.CreateConnection(System.Data.Common.DbConnectionOptions options, System.Data.Common.DbConnectionPoolKey poolKey, object poolGroupProviderInfo, System.Data.ProviderBase.DbConnectionPool pool, System.Data.Common.DbConnection owningConnection, System.Data.Common.DbConnectionOptions userOptions) Line 135          C#
System.Data.SqlClient.dll!System.Data.ProviderBase.DbConnectionFactory.CreateNonPooledConnection(System.Data.Common.DbConnection owningConnection, System.Data.ProviderBase.DbConnectionPoolGroup poolGroup, System.Data.Common.DbConnectionOptions userOptions) Line 96            C#
System.Data.SqlClient.dll!System.Data.ProviderBase.DbConnectionFactory.TryGetConnection(System.Data.Common.DbConnection owningConnection, System.Threading.Tasks.TaskCompletionSource<System.Data.ProviderBase.DbConnectionInternal> retry, System.Data.Common.DbConnectionOptions userOptions, System.Data.ProviderBase.DbConnectionInternal oldConnection, out System.Data.ProviderBase.DbConnectionInternal connection) Line 140            C#
System.Data.SqlClient.dll!System.Data.ProviderBase.DbConnectionInternal.TryOpenConnectionInternal(System.Data.Common.DbConnection outerConnection, System.Data.ProviderBase.DbConnectionFactory connectionFactory, System.Threading.Tasks.TaskCompletionSource<System.Data.ProviderBase.DbConnectionInternal> retry, System.Data.Common.DbConnectionOptions userOptions) Line 325          C#
System.Data.SqlClient.dll!System.Data.ProviderBase.DbConnectionClosed.TryOpenConnection(System.Data.Common.DbConnection outerConnection, System.Data.ProviderBase.DbConnectionFactory connectionFactory, System.Threading.Tasks.TaskCompletionSource<System.Data.ProviderBase.DbConnectionInternal> retry, System.Data.Common.DbConnectionOptions userOptions) Line 37            C#
System.Data.SqlClient.dll!System.Data.SqlClient.SqlConnection.TryOpen(System.Threading.Tasks.TaskCompletionSource<System.Data.ProviderBase.DbConnectionInternal> retry) Line 1122 C#
System.Data.SqlClient.dll!System.Data.SqlClient.SqlConnection.Open() Line 689  C#

Here it's making synchronous calls on the Socket, but for it to end up in that PerformSyncOperation method, either a) there must have been an asynchronous method previously called on the socket, or b) Socket.Blocking must have been set to false, and for it to be going through NetworkStream, it almost certainly isn't (b), as NetworkStream's ctor throws if a Socket with Blocking==false is passed in (so it would have had to been set to false after the NetworkStream was constructed).

wfurt commented 6 years ago

I plan to track it @saurabh500 . It is possible that the happens somewhere during SSL handshake. I still use VM you shared with me.

saurabh500 commented 6 years ago

If its the method name ReadSyncOverAsync then I can assure you that it always calls the sync read APIs of the NetworkStream. I think I should simply change this method name. It's caused much confusion.

@wfurt, please let us know if you saw any switches from async to sync usage of Socket in SqlClient. We would like to tackle them for 2.1

saurabh500 commented 6 years ago

Thanks @wfurt

stephentoub commented 6 years ago

If its the method name ReadSyncOverAsync

No, although that's misleading and it'd be nice to change it :)

It's the fact that PerformSyncOperation is being called; that only happens if the socket is in non-blocking mode, which happens when an asynchronous operation is made on the socket. It's called because we need to then simulate the sync behavior on top of async.

https://github.com/dotnet/corefx/blob/acb44e79ef7dc81817da9cf5f19c9819d0f491d1/src/System.Net.Sockets/src/System/Net/Sockets/SocketAsyncContext.Unix.cs#L1112-L1137

saurabh500 commented 6 years ago

Hmm Interesting. Thanks for pointing this out. Chatting with @wfurt to figure this out.

stephentoub commented 6 years ago

@geoffkizer just reminded me that in 2.1 we also transition into non-blocking mode when a sync operation with a timeout is issued: https://github.com/dotnet/corefx/pull/22588. Does SqlClient do that?

saurabh500 commented 6 years ago

SqlClient does reads synchronously with a timeout set. https://github.com/dotnet/corefx/blob/master/src/System.Data.SqlClient/src/System/Data/SqlClient/SNI/SNITcpHandle.cs#L458

@stephentoub what I understand is that if the timeout is set, the socket goes into non-blocking mode, does that mean that all socket operations will be sync over async if a receive timeout is set on socket, while performing sync operations?

stephentoub commented 6 years ago

does that mean that all socket operations will be sync over async if a receive timeout is set on socket, while performing sync operations?

As of https://github.com/dotnet/corefx/pull/22588, yes, for better or worse. Setting a receive or send timeout on a socket now puts that socket into non-blocking mode, which means all subsequent synchronous send/receive operations actually end up being sync-over-async. I've opened https://github.com/dotnet/corefx/issues/29040 related to this.

David-Engel commented 6 years ago

This regression feels like a blocker for 2.1 RC1. Hopefully we can confirm today if PR dotnet/corefx#29097 fixes the issue.

keeratsingh commented 6 years ago

Resolved by PR dotnet/corefx#29097 (which was integrated into release/2.1 branch), closing this issue, I was able to successfully verify that this perf issue has been resolved with dotnet-sdk version 2.1.300-preview3-008646 version.

keerats@keerats:~/dotnetcore/adhoc$ dotnet --info
.NET Core SDK (reflecting any global.json):
Version:   2.1.300-preview3-008646
Commit:    8e01912b36

Runtime Environment:
OS Name:     ubuntu
OS Version:  16.04
OS Platform: Linux
RID:         ubuntu.16.04-x64
Base Path:   /usr/share/dotnet/sdk/2.1.300-preview3-008646/

Host (useful for support):
  Version: 2.1.0-rc1-26419-02
  Commit:  515f580aa0

.NET Core SDKs installed:
  2.1.300-preview3-008646 [/usr/share/dotnet/sdk]

.NET Core runtimes installed:
  Microsoft.AspNetCore.All 2.1.0-rc1-30613 [/usr/share/dotnet/shared/Microsoft.AspNetCore.All]
  Microsoft.AspNetCore.App 2.1.0-rc1-30613 [/usr/share/dotnet/shared/Microsoft.AspNetCore.App]
  Microsoft.NETCore.App 2.1.0-rc1-26419-02 [/usr/share/dotnet/shared/Microsoft.NETCore.App]

To install additional .NET Core runtimes or SDKs:
  https://aka.ms/dotnet-download