mysql-net / MySqlConnector

MySQL Connector for .NET
https://mysqlconnector.net
MIT License
1.38k stars 330 forks source link

Help identify root cause of "Packet received out-of-order" errors #1343

Open Feandil opened 1 year ago

Feandil commented 1 year ago

Is your feature request related to a problem? Please describe.

We are infrequently getting "Packet received out-of-order" errors, all of which seem to be coming from the same code path, but we haven't been able to identify what is causing it. From https://github.com/mysql-net/MySqlConnector/issues/496#issuecomment-391583852, I understand that it could be due to multiple in-flight async requests, but we can't identify were the error is coming from.

Describe the solution you'd like

As suggested in the comment mentioned above, a meaningful error in case of multiple in-flight async requests, or at least detailed information about the two requests that ended up being simultaneous when the error is triggered.

Describe alternatives you've considered

Additional context

We are using:

Here is a trace of the error we are seeing:

MySqlConnector.MySqlProtocolException: Packet received out-of-order. Expected 37; got 49.
   at MySqlConnector.Protocol.Serialization.ProtocolUtility.<DoReadPayloadAsync>g__AddContinuation|5_0(ValueTask`1 readPacketTask, BufferedByteReader bufferedByteReader, IByteHandler byteHandler, Func`1 getNextSequenceNumber, ArraySegmentHolder`1 previousPayloads, ProtocolErrorBehavior protocolErrorBehavior, IOBehavior ioBehavior) in /_/src/MySqlConnector/Protocol/Serialization/ProtocolUtility.cs:line 494
   at MySqlConnector.Core.ServerSession.ReceiveReplyAsyncAwaited(ValueTask`1 task) in /_/src/MySqlConnector/Core/ServerSession.cs:line 960
   at MySqlConnector.Core.ResultSet.<ScanRowAsync>g__ScanRowAsyncAwaited|9_0(ResultSet resultSet, Task`1 payloadTask, Row row, CancellationToken token) in /_/src/MySqlConnector/Core/ResultSet.cs:line 244
   at MySqlConnector.Core.ResultSet.ReadAsync(IOBehavior ioBehavior, CancellationToken cancellationToken) in /_/src/MySqlConnector/Core/ResultSet.cs:line 215
   at MySqlConnector.Core.ResultSet.ReadEntireAsync(IOBehavior ioBehavior, CancellationToken cancellationToken) in /_/src/MySqlConnector/Core/ResultSet.cs:line 186
   at MySqlConnector.MySqlDataReader.NextResultAsync(IOBehavior ioBehavior, CancellationToken cancellationToken) in /_/src/MySqlConnector/MySqlDataReader.cs:line 61
   at MySqlConnector.MySqlDataReader.DisposeAsync(IOBehavior ioBehavior, CancellationToken cancellationToken) in /_/src/MySqlConnector/MySqlDataReader.cs:line 613
   at MySqlConnector.MySqlDataReader.Dispose(Boolean disposing) in /_/src/MySqlConnector/MySqlDataReader.cs:line 447
   at Microsoft.EntityFrameworkCore.Query.Internal.BufferedDataReader.Initialize(IReadOnlyList`1 columns)
   at Microsoft.EntityFrameworkCore.Storage.RelationalCommand.ExecuteReader(RelationalCommandParameterObject parameterObject)
   at Microsoft.EntityFrameworkCore.Query.Internal.SingleQueryingEnumerable`1.Enumerator.InitializeReader(Enumerator enumerator)
   at Microsoft.EntityFrameworkCore.Storage.ExecutionStrategy.<>c__DisplayClass31_0`2.<Execute>b__0(DbContext context, TState state)
   at Microsoft.EntityFrameworkCore.Storage.ExecutionStrategy.ExecuteImplementation[TState,TResult](Func`3 operation, Func`3 verifySucceeded, TState state)
   at Microsoft.EntityFrameworkCore.Storage.ExecutionStrategy.Execute[TState,TResult](TState state, Func`3 operation, Func`3 verifySucceeded)
   at Microsoft.EntityFrameworkCore.Query.Internal.SingleQueryingEnumerable`1.Enumerator.MoveNext()

The errors are then followed by "Can't replace active reader." errors, as I understand the exception is leaving the connection in a bad state.

bgrainger commented 1 year ago

Server:

  • ProxySQL 2.3.2

Would it be possible to run your application connecting directly to MySQL Server, bypassing ProxySQL, for a sufficiently long period of time that you would normally expect to see this error?

There have been ProxySQL bugs reported before (e.g., https://github.com/mysql-net/MySqlConnector/issues/842#issuecomment-652234390) so it could be useful to rule that out as a variable. If the problem happens when connecting MySqlConnector directly to MySQL Server, then it definitely increases the likelihood that the error could be in the client code.

Feandil commented 11 months ago

Thanks for the recommendation and sorry for the long delay in answering... It took us time to deploy Sentry to monitor the situation better and then try without ProxySQL. Disabling ProxySQL for almost a week now seems to indicate the we have no error, which would indicate that it is indeed the root cause of our issues...

Do you think we could extend MySqlConnector to collect more information about the out-of-order errors? Is there any chance getting the exact last command that was run before the crash in that connection? Or getting any detail from the packets received out of order?

Has anyone ever integrated MySqlConnector with Sentry? e.g. adding breadcrumbs to better understand what had happened before a crash?

bgrainger commented 11 months ago

Enabling MySqlConnector logging (at Debug or Trace) level might produce some helpful information about what is happening on each physical connection before the failure.

If Sentry can consume .NET Activity objects from an ActivitySource, then you should be able to get good tracing from the functionality built into MySqlConnector (https://github.com/mysql-net/MySqlConnector/issues/1036).

There are also packages out there that can convert ActivitySource to OpenTelemetry (or other export formats) if Sentry needs that, but I don't have a link to documentation at hand.

Feandil commented 9 months ago

Sorry again for the very long delay in answering. I've logged into these options:

Thanks again for your help.