yarpc / yarpc-go

A message passing platform for Go
MIT License

Upgrade grpc-go to v1.44.0; fixed grpc connection status mapper #2278

Closed (biosvs closed 1 month ago)

biosvs commented 1 month ago

Since grpc-go 1.41.0, failed connections move to the IDLE state and are not retried automatically. A reconnection attempt happens only on a direct call to the Connect method or when a request is made over that connection.

yarpc-go is built on the expectation that grpc reconnects automatically, and it uses only grpc-go connections in the READY state for outbound requests. The grpc READY state maps to the yarpc Available status, Connecting maps to Connecting, and all other states (IDLE, TransientFailure, Shutdown) map to Unavailable.
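The mapping described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `ConnState`, `PeerStatus`, and `mapState` are hypothetical local stand-ins for the grpc-go connectivity states and the yarpc peer statuses, used so the sketch compiles without either library.

```go
package main

// ConnState is a hypothetical stand-in for the grpc-go connectivity states
// named in the description; it is not the library's own type.
type ConnState int

const (
	Idle ConnState = iota
	Connecting
	Ready
	TransientFailure
	Shutdown
)

// PeerStatus is a hypothetical stand-in for the yarpc connection statuses.
type PeerStatus int

const (
	Unavailable PeerStatus = iota
	PeerConnecting
	Available
)

// mapState mirrors the mapping from the description: READY is Available,
// Connecting stays Connecting, and every other state is Unavailable.
func mapState(s ConnState) PeerStatus {
	switch s {
	case Ready:
		return Available
	case Connecting:
		return PeerConnecting
	default:
		// Idle, TransientFailure, and Shutdown all land here.
		return Unavailable
	}
}
```

Note that under this mapping IDLE and TransientFailure are indistinguishable to yarpc, which is why a connection parked in IDLE looks permanently unavailable.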

Without the fix in this PR, all grpc-go connections may eventually move to the IDLE state, which maps to Unavailable from yarpc's perspective. This creates a deadlock: yarpc waits for a connection to move to READY, while grpc keeps it in IDLE and waits for an RPC call or an explicit Connect call.

This PR introduces an explicit Connect call on connections that move to the IDLE state. The call forces grpc to attempt reconnection, which moves the connection into the Connecting state (for both grpc and yarpc). If the connection succeeds, the state becomes READY (yarpc: Available). If it fails, the state becomes TransientFailure (yarpc: Unavailable) and then IDLE (set by grpc-go; yarpc: Unavailable), which triggers another reconnection attempt.
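A rough sketch of the reaction to state changes described above, under stated assumptions: `State`, `connector`, `handleStateChange`, and `fakeConn` are hypothetical names made up for illustration, and the real change operates on grpc-go connections rather than on this interface. The only behaviour taken from the description is "call Connect when the connection becomes IDLE".

```go
package main

// State is a hypothetical stand-in for the grpc-go connectivity states.
type State int

const (
	Idle State = iota
	Connecting
	Ready
	TransientFailure
	Shutdown
)

// connector is the (assumed) minimal behaviour the sketch needs from a
// connection: a way to request a reconnection attempt.
type connector interface {
	Connect()
}

// handleStateChange is called with each newly observed state. On Idle it
// asks the connection to reconnect, so the connection is never left waiting
// for an RPC that yarpc will not send. It reports whether the connection is
// usable for outbound requests.
func handleStateChange(c connector, s State) bool {
	if s == Idle {
		c.Connect()
	}
	return s == Ready
}

// fakeConn is a test double that only counts Connect calls.
type fakeConn struct{ connects int }

func (f *fakeConn) Connect() { f.connects++ }
```

Feeding the sequence Connecting, TransientFailure, Idle, Connecting, Ready through `handleStateChange` results in exactly one `Connect` call, issued at the Idle step; only the final Ready state is reported as usable.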

We don't have to implement reconnection backoff manually, because it is already part of the reconnection logic in the grpc library.

This PR replaces #2247

codecov[bot] commented 1 month ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 85.21%. Comparing base (0b3b4d6) to head (e39f610).

Additional details and impacted files

```diff
@@           Coverage Diff           @@
##              dev    #2278   +/-   ##
=======================================
  Coverage   85.21%   85.21%
=======================================
  Files         270      270
  Lines       15555    15557    +2
=======================================
+ Hits        13255    13257    +2
+ Misses       1877     1876    -1
- Partials      423      424    +1
```
