There is a data race issue in the `newConn` function; this function is not thread-safe: https://github.com/redis/go-redis/blob/2512123b7686dca1f7f0208eaada6cdf3ef0189c/internal/pool/pool.go#L166-L194

There's a shared variable `p.poolSize`.

There is a read access to `p.poolSize` on line 171: https://github.com/redis/go-redis/blob/2512123b7686dca1f7f0208eaada6cdf3ef0189c/internal/pool/pool.go#L171

There is a write access to `p.poolSize` on line 189: https://github.com/redis/go-redis/blob/2512123b7686dca1f7f0208eaada6cdf3ef0189c/internal/pool/pool.go#L189

The mutex taken on line 180 only makes the writes mutually exclusive; the read on line 171 happens outside of it: https://github.com/redis/go-redis/blob/2512123b7686dca1f7f0208eaada6cdf3ef0189c/internal/pool/pool.go#L180-L181
In such a situation, there's a data race:
- goroutine #1 is reading `p.poolSize` (on line 171, it does not need to acquire the mutex lock)
- goroutine #2 is writing to `p.poolSize` (on line 189, it needs to acquire the mutex lock in order to write)

This shared variable `p.poolSize` is concurrently accessed by two goroutines, one reading and one writing, resulting in a data race.

Expected Behavior
There should not be a data race issue when the `newConn` function is called.

Current Behavior
There is a data race issue
Possible Solution
There is a data race because the mutex does not protect the full critical section, only part of it. We should call `p.connsMu.Lock()` and `defer p.connsMu.Unlock()` earlier than they currently are, or use a `sync.RWMutex`: `RLock` to protect reads of the shared variable `p.poolSize` and `Lock` to protect writes.

Steps to Reproduce

It's a data race and concurrency issue, so it is a little hard to reproduce deterministically, but I can provide some output from the CI pipeline:
Context (Environment)

We see a data race warning when running the tests with the `-race` option.

Detailed Description

Please check the Possible Solution section.

Possible Implementation

Please check the Possible Solution section.