edgedb / edgedb-net

The official .NET client library for EdgeDB
https://edgedb.com
Apache License 2.0
82 stars 9 forks source link

Fix: Reconnection logic for parse #80

Closed quinchs closed 6 months ago

quinchs commented 6 months ago

Abstract

When parsing, clients can receive different indications that tell the binding to retry parsing ex: state mismatch, etc; There's also the case when a connection is 'stale' (the socket is open but EdgeDB sent an ErrorResponse with the code IdleSessionTimeoutError), when the binding attempts to parse in this state, it will read the session timeout error and reconnect. The problem though is the ParseAsync method didn't really have the correct state for handling reconnection, the underlying protocol providers' phase was not correctly reset causing the ReconnectAsync function to end early on the servers Authentication message. Connection flow should look like the following:

PHASE: Command
C -> S: Parse
C -> S: Sync
S -> C: ErrorResponse (IdleSessionTimeoutError)
~ Connection Terminated~
PHASE: Connection
C -> S: ClientHandshake
S -> C: Authentication
~ Client server auth flow ~
S -> C: StateDataDescription
S -> C: N ParameterStatus
S -> C: ReadyForCommand
PHASE: Command
C -> S: Parse
C -> S: Sync
...

In this example, Parse should be sent after the binding receives ReadyForCommand which indicated that the Connection phase is complete, but in actuality the binding thought it was already in the Command phase, so this is what the flow actually looked like:

PHASE: Command
C -> S: Parse
C -> S: Sync
S -> C: ErrorResponse (IdleSessionTimeoutError)
~ Connection Terminated~
C -> S: ClientHandshake
S -> C: Authentication
C -> S: Parse
C -> S: Sync

After sending the Parse and Sync, it would wait for IProtocolProvider.Phase to be Command, indicating that command actions can be sent, but since this is never reset, it completes on the first message.

Ultimately, this edge case caused the following exception in apps:

EdgeDB.EdgeDBException: Failed to execute query after retrying once
 ---> EdgeDB.MissingCodecException: Couldn't find a valid output codec
   at EdgeDB.Binary.Protocol.V1._0.V1ProtocolProvider.ParseQueryAsync(QueryParameters queryParameters, CancellationToken token)
   at EdgeDB.EdgeDBBinaryClient.ExecuteInternalAsync(String query, IDictionary`2 args, Nullable`1 cardinality, Nullable`1 capabilities, IOFormat format, Boolean isRetry, Boolean implicitTypeName, Func`2 preheat, CancellationToken token)
   --- End of inner exception stack trace ---

which is generally misleading to the actual cause.

Summary

This PR fixes this issue by resetting the protocol providers' state whenever trying to run ConnectInternalAsync, and properly waits for the phase to switch to Command before returning out of ConnectAsync, causing any command phase actions to wait for this to be preformed.

quinchs commented 6 months ago

Other bindings might also have this edgecase, Java most likely since the protocol provider code is very similar