jellydator / ttlcache

An in-memory cache with item expiration and generics
MIT License
881 stars 115 forks source link

`Items()` is not thread safe #139

Open pdddddddddddd opened 1 month ago

pdddddddddddd commented 1 month ago

Hi all, As per the readme, this package is thread safe, but I've noticed a data race in Items().

A simple example:

func TestItems(t *testing.T) {
    cache := ttlcache.New(
        ttlcache.WithTTL[int, int](2*time.Second),
        ttlcache.WithDisableTouchOnHit[int, int](),
    )
    cache.Set(1, 1, ttlcache.DefaultTTL)
    cache.Set(2, 2, ttlcache.DefaultTTL)

    var wg sync.WaitGroup
    go func() {
        defer wg.Done()
        _ = cache.Items()
    }()
    go func() {
        defer wg.Done()
        _ = cache.Items()
    }()

    wg.Wait()
}

Output:

go test -race -v -failfast -run TestItems -count=1
=== RUN   TestItems
--- PASS: TestItems (0.00s)
==================
WARNING: DATA RACE
Read at 0x00c000100480 by goroutine 7:
  container/list.(*List).MoveToFront()
      /opt/homebrew/Cellar/go/1.22.3/libexec/src/container/list/list.go:181 +0x50
  github.com/jellydator/ttlcache/v3.(*Cache[go.shape.int,go.shape.int]).get()
      /Users/pdddddddddddd/go/pkg/mod/github.com/jellydator/ttlcache/v3@v3.2.0/cache.go:190 +0x104
  github.com/jellydator/ttlcache/v3.(*Cache[go.shape.int,go.shape.int]).Items()
      /Users/pdddddddddddd/go/pkg/mod/github.com/jellydator/ttlcache/v3@v3.2.0/cache.go:472 +0x1b4
  ttlrace.TestItems.func1()
      /Users/pdddddddddddd/ttlrace/race_test.go:22 +0x7c

Previous write at 0x00c000100480 by goroutine 8:
  container/list.(*List).move()
      /opt/homebrew/Cellar/go/1.22.3/libexec/src/container/list/list.go:127 +0x1ec
  container/list.(*List).MoveToFront()
      /opt/homebrew/Cellar/go/1.22.3/libexec/src/container/list/list.go:185 +0x78
  github.com/jellydator/ttlcache/v3.(*Cache[go.shape.int,go.shape.int]).get()  github.com/jellydator/ttlcache/v3.(*Cache[go.shape.int,go.shape.int]).Items()
      /Users/pdddddddddddd/go/pkg/mod/github.com/jellydator/ttlcache/v3@v3.2.0/cache.go:472 +0x1b4
  ttlrace.TestItems.func2()
      /Users/pdddddddddddd/ttlrace/race_test.go:26 +0x7c
panic: 
Goroutine 7 (running) created at:
  ttlrace.TestItems()
      /Users/pdddddddddddd/ttlrace/race_test.go:20 +0x26c
sync: negative WaitGroup counter
  testing.tRunner()
      /opt/homebrew/Cellar/go/1.22.3/libexec/src/testing/testing.go:1689 +0x180

  testing.(*T).Run.gowrap1()
      /opt/homebrew/Cellar/go/1.22.3/libexec/src/testing/testing.go:1742 +0x40
goroutine 
5Goroutine 8 (running) created at:
 [running  ttlrace.TestItems()
      /Users/pdddddddddddd/ttlrace/race_test.go:24 +0x30c
]:
  testing.tRunner()
      /opt/homebrew/Cellar/go/1.22.3/libexec/src/testing/testing.go:1689 +0x180
  testing.(*T).Run.gowrap1()
      /opt/homebrew/Cellar/go/1.22.3/libexec/src/testing/testing.go:1742 +0x40
==================
sync.(*WaitGroup).Add(0xc00000e320, 0xffffffffffffffff)
        /opt/homebrew/Cellar/go/1.22.3/libexec/src/sync/waitgroup.go:62 +0x1a4
sync.(*WaitGroup).Done(0xc00000e320)
        /opt/homebrew/Cellar/go/1.22.3/libexec/src/sync/waitgroup.go:87 +0x30
ttlrace.TestItems.func2()
        /Users/pdddddddddddd/ttlrace/race_test.go:27 +0x84
created by ttlrace.TestItems in goroutine 3
        /Users/pdddddddddddd/ttlrace/race_test.go:24 +0x310
exit status 2
FAIL    ttlrace 0.324s

From a quick skim, this appears to be due to the fact that .Items() holds an RLock() on the mutex, but when .Items() calls get, it does c.items.lru.MoveToFront(elem) which mutates the underlying list, as hinted at by this comment on get:

// Not safe for concurrent use by multiple goroutines without additional
// locking.

I suppose a simple fix would be to change .Items() to this instead?

func (c *Cache[K, V]) Items() map[K]*Item[K, V] {
    c.items.mu.RLock()
    defer c.items.mu.RUnlock()
...
swithek commented 1 month ago

Thank you for reporting this @pdddddddddddd. I think this method should not perform any modifications at all. So we should probably change Items()'s logic to something like what Keys() is using (i.e., we should not use the get() method internally).