There are occasions when a connection handling a TxPipeline gets closed unexpectedly (maybe because the remote server shut down). Despite the pool's attempts to discard bad idle connections, this validation is not 100% reliable (since the connection can be closed after it's returned from the pool and while it's being used). In my experience this results in an EOF error not while writing the commands on the connection but only once attempting to read the result.
Expected Behavior
When using a ClusterClient.TxPipeline, pipelines should be retried when an EOF (or another retryable error) is returned while reading the pipeline responses.
Update txPipelineReadQueued and the error handling for it in processTxPipelineNodeConn to account for retryable errors and update the failedCmds parameter appropriately.
Steps to Reproduce
Let me know if there's appetite for addressing this and I can work on a repro.
There are occasions when a connection handling a TxPipeline gets closed unexpectedly (maybe because the remote server shut down). Despite the pool's attempts to discard bad idle connections, this validation is not 100% reliable (since the connection can be closed after it's returned from the pool and while it's being used). In my experience this results in an EOF error not while writing the commands on the connection but only once attempting to read the result.
Expected Behavior
When using a
ClusterClient.TxPipeline
, pipelines should be retried when an EOF (or another retryable error) is returned while reading the pipeline responses.Current Behavior
ClusterClient.TxPipeline
will retry if an EOF is observed while writing the commands to the connection: https://github.com/redis/go-redis/blob/f3fe61148b2b8fe0a669dc23620690407f5f92af/osscluster.go#L1497-L1505However, when reading the response from the connection in https://github.com/redis/go-redis/blob/f3fe61148b2b8fe0a669dc23620690407f5f92af/osscluster.go#L1508-L1525 and https://github.com/redis/go-redis/blob/f3fe61148b2b8fe0a669dc23620690407f5f92af/osscluster.go#L1537-L1547.
In particular, there may be an EOF in these circumstances:
MULTI
: https://github.com/redis/go-redis/blob/f3fe61148b2b8fe0a669dc23620690407f5f92af/osscluster.go#L1537EXEC
: https://github.com/redis/go-redis/blob/f3fe61148b2b8fe0a669dc23620690407f5f92af/osscluster.go#L1550However, only the last case handles retryable errors and updates
failedCmds
to trigger the retry machinery: https://github.com/redis/go-redis/blob/f3fe61148b2b8fe0a669dc23620690407f5f92af/osscluster.go#L1354-L1360Possible Solution
Update
txPipelineReadQueued
and the error handling for it inprocessTxPipelineNodeConn
to account for retryable errors and update thefailedCmds
parameter appropriately.Steps to Reproduce
Let me know if there's appetite for addressing this and I can work on a repro.
1. 2. 3. 4.
Context (Environment)
Redis Server: 7.1 go-redis client: 9.5.1
Detailed Description
Possible Implementation