redis / go-redis

Redis Go client
https://redis.uptrace.dev
BSD 2-Clause "Simplified" License
19.65k stars 2.32k forks source link

[BUGS] data race in connection pool management #2876

Closed Aden-Q closed 5 months ago

Aden-Q commented 5 months ago

There is a data race issue in this newConn function. This function is not thread-safe. https://github.com/redis/go-redis/blob/2512123b7686dca1f7f0208eaada6cdf3ef0189c/internal/pool/pool.go#L166-L194

There's a shared variable p.poolSize.

There is a read access to p.poolSize on line 171: https://github.com/redis/go-redis/blob/2512123b7686dca1f7f0208eaada6cdf3ef0189c/internal/pool/pool.go#L171

There is a write access to p.poolSize on line 189: https://github.com/redis/go-redis/blob/2512123b7686dca1f7f0208eaada6cdf3ef0189c/internal/pool/pool.go#L189

While the mutex enforced on line 180 can only make writes mutually exclusive: https://github.com/redis/go-redis/blob/2512123b7686dca1f7f0208eaada6cdf3ef0189c/internal/pool/pool.go#L180-L181

In such a situation, there's a data race:

goroutine #1 is reading p.poolSize (on line 171, it does not need to acquire the mutex lock)

goroutine #2 is writing to p.poolSize (on line 189, it needs to acquire the mutex lock in order to write)

This shared variable p.poolSize is concurrently accessed by two goroutines, one read and one write, resulting in a data race issue

Expected Behavior

There should not be a data race issue when the newConn function is called

Current Behavior

There is a data race issue

Possible Solution

There is a data race because the mutex does not protect the full scope of the critical section, but only part of it. We should run p.connsMu.Lock() and defer p.connsMu.Unlock() earlier than it is, or use a RLock to protect read to the shared variable p.poolSize while RWLock to protect write

Steps to Reproduce

It's a data race and concurrency issue and a little hard to reproduce. While I can provide some output from the CI pipeline:

WARNING: DATA RACE
Write at 0x00c0001143c0 by goroutine 110:
  github.com/redis/go-redis/v9/internal/pool.(*ConnPool).newConn()
      /home/circleci/go/pkg/mod/github.com/redis/go-redis/v9@v9.3.0/internal/pool/pool.go:189 +0x3b6
  github.com/redis/go-redis/v9/internal/pool.(*ConnPool).Get()
      /home/circleci/go/pkg/mod/github.com/redis/go-redis/v9@v9.3.0/internal/pool/pool.go:285 +0x1bb
  github.com/redis/go-redis/v9/internal/pool.(*StickyConnPool).Get()
      /home/circleci/go/pkg/mod/github.com/redis/go-redis/v9@v9.3.0/internal/pool/pool_sticky.go:73 +0x129
  github.com/redis/go-redis/v9.(*baseClient)._getConn()
      /home/circleci/go/pkg/mod/github.com/redis/go-redis/v9@v9.3.0/redis.go:247 +0x74
  github.com/redis/go-redis/v9.(*baseClient).getConn()
      /home/circleci/go/pkg/mod/github.com/redis/go-redis/v9@v9.3.0/redis.go:235 +0xc4
  github.com/redis/go-redis/v9.(*baseClient).withConn()
      /home/circleci/go/pkg/mod/github.com/redis/go-redis/v9@v9.3.0/redis.go:358 +0x6b
  github.com/redis/go-redis/v9.(*baseClient)._process()
      /home/circleci/go/pkg/mod/github.com/redis/go-redis/v9@v9.3.0/redis.go:400 +0x1dd
  github.com/redis/go-redis/v9.(*baseClient).process()
      /home/circleci/go/pkg/mod/github.com/redis/go-redis/v9@v9.3.0/redis.go:382 +0xbb
  github.com/redis/go-redis/v9.(*baseClient).process-fm()
      <autogenerated>:1 +0x64
  github.com/redis/go-redis/v9.(*hooksMixin).processHook()
      /home/circleci/go/pkg/mod/github.com/redis/go-redis/v9@v9.3.0/redis.go:172 +0x71
  github.com/redis/go-redis/v9.(*Conn).Process()
      /home/circleci/go/pkg/mod/github.com/redis/go-redis/v9@v9.3.0/redis.go:803 +0x3a
  github.com/redis/go-redis/v9.(*Conn).Process-fm()
      <autogenerated>:1 +0x64
  github.com/redis/go-redis/v9.statefulCmdable.Select()
      /home/circleci/go/pkg/mod/github.com/redis/go-redis/v9@v9.3.0/commands.go:287 +0x1d7

Previous read at 0x00c0001143c0 by goroutine 102:
  github.com/redis/go-redis/v9/internal/pool.(*ConnPool).newConn()
      /home/circleci/go/pkg/mod/github.com/redis/go-redis/v9@v9.3.0/internal/pool/pool.go:171 +0xce
  github.com/redis/go-redis/v9/internal/pool.(*ConnPool).Get()
      /home/circleci/go/pkg/mod/github.com/redis/go-redis/v9@v9.3.0/internal/pool/pool.go:285 +0x1bb
  github.com/redis/go-redis/v9/internal/pool.(*StickyConnPool).Get()
      /home/circleci/go/pkg/mod/github.com/redis/go-redis/v9@v9.3.0/internal/pool/pool_sticky.go:73 +0x129
  github.com/redis/go-redis/v9.(*baseClient)._getConn()
      /home/circleci/go/pkg/mod/github.com/redis/go-redis/v9@v9.3.0/redis.go:247 +0x74
  github.com/redis/go-redis/v9.(*baseClient).getConn()
      /home/circleci/go/pkg/mod/github.com/redis/go-redis/v9@v9.3.0/redis.go:235 +0xc4
  github.com/redis/go-redis/v9.(*baseClient).withConn()
      /home/circleci/go/pkg/mod/github.com/redis/go-redis/v9@v9.3.0/redis.go:358 +0x6b
  github.com/redis/go-redis/v9.(*baseClient)._process()
      /home/circleci/go/pkg/mod/github.com/redis/go-redis/v9@v9.3.0/redis.go:400 +0x1dd
  github.com/redis/go-redis/v9.(*baseClient).process()
      /home/circleci/go/pkg/mod/github.com/redis/go-redis/v9@v9.3.0/redis.go:382 +0xbb
  github.com/redis/go-redis/v9.(*baseClient).process-fm()
      <autogenerated>:1 +0x64
  github.com/redis/go-redis/v9.(*hooksMixin).processHook()
      /home/circleci/go/pkg/mod/github.com/redis/go-redis/v9@v9.3.0/redis.go:172 +0x71
  github.com/redis/go-redis/v9.(*Conn).Process()
      /home/circleci/go/pkg/mod/github.com/redis/go-redis/v9@v9.3.0/redis.go:803 +0x3a
  github.com/redis/go-redis/v9.(*Conn).Process-fm()
      <autogenerated>:1 +0x64
  github.com/redis/go-redis/v9.statefulCmdable.Select()
      /home/circleci/go/pkg/mod/github.com/redis/go-redis/v9@v9.3.0/commands.go:287 +0x1d7

Context (Environment)

We see a data race warning when we are running test with the -race option

Detailed Description

Please check the Possible Solution section

Possible Implementation

Please check the Possible Solution section