mysql-net / MySqlConnector

MySQL Connector for .NET
https://mysqlconnector.net
MIT License

ServerSession.VerifyConnected() takes a lot of CPU TIME #1153

Open razzmatazz opened 2 years ago

razzmatazz commented 2 years ago

Software versions
MySqlConnector version: 2.1.0
Server type (MySQL, MariaDB, Aurora, etc.) and version: GCP / MySQL 8
.NET version: 6.0
ORM NuGet packages and versions: NHibernate 5.3

Describe the bug
When profiling a high-CPU background-processing server, I am seeing a lot of threads waiting in VerifyConnected(). I am not sure why this is happening: from reading the source code, VerifyConnected() should not block often, but it does for me.


Example trace from one of the threads, via dotnet stack report -p1:

Thread (0xC66):
  CPU_TIME
  ManagedModule!MySqlConnector.Core.ServerSession.VerifyConnected()
  ManagedModule!MySqlConnector.Core.ServerSession.ReceiveReplyAsync(value class MySqlConnector.Protocol.Serialization.IOBehavior,value class System.Threading.CancellationToken)
  ManagedModule!MySqlConnector.Core.ResultSet+<ReadResultSetHeaderAsync>d__2.MoveNext()
  ManagedModule!System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start(!!0&)
  ManagedModule!MySqlConnector.Core.ResultSet.ReadResultSetHeaderAsync(value class MySqlConnector.Protocol.Serialization.IOBehavior)
  ManagedModule!MySqlConnector.MySqlDataReader+<CreateAsync>d__110.MoveNext()
  ManagedModule!System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start(!!0&)
  ManagedModule!MySqlConnector.MySqlDataReader.CreateAsync(value class MySqlConnector.Core.CommandListPosition,class MySqlConnector.Core.ICommandPayloadCreator,class System.Collections.Generic.IDictionary`2<class System.String,class MySqlConnector.Core.CachedProcedure>,class MySqlConnector.Core.IMySqlCommand,value class System.Data.CommandBehavior,class System.Diagnostics.Activity,value class MySqlConnector.Protocol.Serialization.IOBehavior,value class System.Threading.CancellationToken)
  ManagedModule!MySqlConnector.Core.CommandExecutor+<ExecuteReaderAsync>d__0.MoveNext()
  ManagedModule!System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start(!!0&)
  ManagedModule!MySqlConnector.MySqlCommand.ExecuteReaderNoResetTimeoutAsync(value class System.Data.CommandBehavior,value class MySqlConnector.Protocol.Serialization.IOBehavior,value class System.Threading.CancellationToken)
  ManagedModule!MySqlConnector.MySqlCommand+<ExecuteReaderAsync>d__84.MoveNext()
  ManagedModule!System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start(!!0&)
  ManagedModule!MySqlConnector.MySqlCommand.ExecuteReaderAsync(value class System.Data.CommandBehavior,value class MySqlConnector.Protocol.Serialization.IOBehavior,value class System.Threading.CancellationToken)
  ManagedModule!MySqlConnector.MySqlCommand+<ExecuteDbDataReaderAsync>d__83.MoveNext()
  ManagedModule!System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start(!!0&)
  ManagedModule!MySqlConnector.MySqlCommand.ExecuteDbDataReaderAsync(value class System.Data.CommandBehavior,value class System.Threading.CancellationToken)
  ManagedModule!NHibernate.AdoNet.AbstractBatcher+<DoExecuteReaderAsync>d__76.MoveNext()

Expected behavior
It should not spend a lot of time in VerifyConnected().

Additional context
I am not sure what is at fault here; maybe the profiler is showing incorrect data, or NHibernate is doing something funny with ADO.NET connections (i.e. why is there contention in VerifyConnected() at all?!)

I need pointers on how to debug this further.

BTW, thanks for the software!

bgrainger commented 2 years ago

It's intended that lock (m_lock) (in VerifyConnected) should be very quick; however, I suppose if there were contention and the lightweight .NET CAS locking mechanism got promoted to a full kernel object, this could end up taking much more time than anticipated?
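
For context, the check in question boils down to this pattern (a simplified sketch with illustrative names, not the actual library code):

using System;

// Simplified sketch: every call funnels through the same lock just to read
// the session state. Under contention the Monitor can fall back from its
// lightweight fast path to a kernel wait, which is where the time would go.
sealed class SessionSketch
{
    private readonly object m_lock = new object();
    private int m_state = 1; // e.g. 1 = connected, 0 = closed (illustrative)

    public void VerifyConnected()
    {
        lock (m_lock)
        {
            if (m_state == 0)
                throw new InvalidOperationException("The session is not connected.");
        }
    }
}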

I'm not even sure how likely that would be, since ServerSession is generally intended to be used by only a single thread, so there shouldn't be contention. Might have to profile and see if I can reproduce.

Are you running on Linux or Windows?
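
In the meantime, one quick way to confirm whether Monitor contention is actually involved (this uses a general .NET API, nothing specific to MySqlConnector) would be to sample Monitor.LockContentionCount around the hot path, along these lines:

using System;
using System.Threading;
using System.Threading.Tasks;

// Compare the process-wide Monitor contention counter before and after the
// workload that shows the VerifyConnected hotspot; a large delta means locks
// really are being contended.
long before = Monitor.LockContentionCount;
await RunWorkloadAsync();
long after = Monitor.LockContentionCount;
Console.WriteLine($"Monitor contentions during workload: {after - before}");

// Placeholder: replace with the code that drives the hot path being profiled.
static Task RunWorkloadAsync() => Task.CompletedTask;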

razzmatazz commented 2 years ago

This is on Linux.

Yes... it could be that the slim lock gets promoted to a kernel lock under high CPU usage, but I am no kernel engineer :)

Do you need a profile in some form?

razzmatazz commented 2 years ago

This appears to be lock contention and promotion to a kernel lock, like you said, and only under high CPU load. No worries, and thanks for this project!

16pierre commented 2 days ago

We're seeing similar problems where ServerSession.VerifyConnected() shows up as a massive hotspot (Linux containers as well). Happy to send more details.

I'm skeptical that contention is the problem here: not only is the lock local to each session (and in our case we use a single thread per session), but this issue also happens in single-threaded benchmarks where we open a single session.

After forking the client and removing the VerifyConnected sanity check, we observed a more than 5x latency/throughput improvement on the client side in our benchmarks.

I'm not exactly sure how to fix this cleanly, though. Making m_state volatile and not taking the lock when reading it for the sanity checks also fixed the performance degradation, although it's debatable whether that is correct. I'm tempted to think that locking while performing the sanity checks doesn't bring much value: it doesn't prevent race conditions where the session state changes just before or after the check anyway.
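
To illustrate (simplified, not the actual connector code), the lock-free variant we benchmarked is essentially this:

using System;

// Sketch of the lock-free variant: the state is a volatile field read without
// taking the Monitor. The state can still change right after the check, but
// that is equally true of the locked version, so the check stays best-effort.
sealed class LockFreeStateCheck
{
    private volatile int m_state = 1; // e.g. 1 = connected, 0 = closed (illustrative)

    public void VerifyConnected()
    {
        if (m_state == 0)
            throw new InvalidOperationException("The session is not connected.");
    }
}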

bgrainger commented 2 days ago

this issue also happens on single-threaded benchmarks where we open a single session

Are you able to share this benchmark code?

16pierre commented 2 days ago

Are you able to share this benchmark code?

I can't share the exact benchmark code we use, as it contains business logic and a whole parametrisation/multi-threading framework, but essentially our benchmark script is a simple scan over the data using connection.ExecuteReaderAsync; here's a rough example:

// Assumes an already-open DbConnection (e.g. a MySqlConnection); ExecuteAsync
// and ExecuteReaderAsync here are extension methods (e.g. Dapper's).
DbConnection connection;

var createTableQuery = "CREATE TABLE benchmarkTable(col1 BINARY(16) NOT NULL, col2 BINARY(16) NOT NULL, col3 BINARY(16) NOT NULL);";
await connection.ExecuteAsync(createTableQuery);

// Populate the benchmarkTable, 1 - 10 millions rows for example
// We skip this step here, in our case we use optimized CSV imports

// Perform a simple scan over the benchmarkTable
// Note also that we use infiniteCommandTimeout to bypass `TimerQueue#Add` 's lock,
// which creates high contention on multi-threaded reads, but that's a separate problem

var selectQuery = "SELECT col1, col2, col3 FROM benchmarkTable;";
await using var dbDataReader = await connection.ExecuteReaderAsync(selectQuery, commandTimeout: 0);

while (await dbDataReader.ReadAsync())
{
    // Read the fields to imitate the production workload.
    // This is optional, though, because the data stream already seems to be
    // processed during ReadAsync.
    byte[] buffer = new byte[16];
    for (int i = 0; i < 3; i++)
        dbDataReader.GetBytes(i, 0, buffer, 0, 16);
}

16pierre commented 2 days ago

More information about our benchmarking setup:

16pierre commented 2 days ago

Still investigating. At this point I'm not 100% sure that the lock is the only problem; we suspect an impact of .NET versions on the original SingleStore connector release we were using. Rebuilding the original code locally (without the lock-free fix) yields the same improved performance as the lock-free implementation, so we suspect .NET differences between our local build and their published builds.

Details to follow; this issue may be specific to the SingleStore fork's builds, in which case I apologize for the noise.