slatedb / slatedb-go

A cloud native embedded storage engine built on object storage.
https://slatedb.io
Apache License 2.0
51 stars 8 forks source link

Data Race when accessing memtable.lastWalID #36

Open jayaprabhakar opened 1 day ago

jayaprabhakar commented 1 day ago

Go provides a basic race detector https://go.dev/blog/race-detector. When running the fizzbee's concurrency tests with race detector enabled, I get this error

==================
WARNING: DATA RACE
Write at 0x00c00000d160 by goroutine 81:
  github.com/slatedb/slatedb-go/slatedb.(*DB).flushImmWALToMemtable()
      /Users/jp/src/slatedb-go/slatedb/flush.go:92 +0x3e4
  github.com/slatedb/slatedb-go/slatedb.(*DB).flushImmWALs()
      /Users/jp/src/slatedb-go/slatedb/flush.go:66 +0x70
  github.com/slatedb/slatedb-go/slatedb.(*DB).FlushWAL()
      /Users/jp/src/slatedb-go/slatedb/flush.go:38 +0x40
  github.com/slatedb/slatedb-go/fizztest.(*SlateDbRoleAdapter).FlushWal()
      /Users/jp/src/slatedb-go/fizztest/fizztest_adapter.go:124 +0x38
  github.com/slatedb/slatedb-go/fizztest_test.performOperations()
      /Users/jp/src/slatedb-go/fizztest/mbt_test.go:642 +0xc20
  github.com/slatedb/slatedb-go/fizztest_test.TestLinearizability.gowrap2()
      /Users/jp/src/slatedb-go/fizztest/mbt_test.go:455 +0x70

Previous read at 0x00c00000d160 by goroutine 80:
  github.com/slatedb/slatedb-go/slatedb.(*DB).FlushMemtableToL0()
      /Users/jp/src/slatedb-go/slatedb/db.go:323 +0x60
  github.com/slatedb/slatedb-go/fizztest.(*SlateDbRoleAdapter).FlushMemtable()
      /Users/jp/src/slatedb-go/fizztest/fizztest_adapter.go:128 +0x38
  github.com/slatedb/slatedb-go/fizztest_test.performOperations()
      /Users/jp/src/slatedb-go/fizztest/mbt_test.go:647 +0x9e8
  github.com/slatedb/slatedb-go/fizztest_test.TestLinearizability.gowrap2()
      /Users/jp/src/slatedb-go/fizztest/mbt_test.go:455 +0x70

Goroutine 81 (running) created at:
  github.com/slatedb/slatedb-go/fizztest_test.TestLinearizability()
      /Users/jp/src/slatedb-go/fizztest/mbt_test.go:455 +0xc6c
  testing.tRunner()
      /Users/jp/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.1.darwin-arm64/src/testing/testing.go:1690 +0x184
  testing.(*T).Run.gowrap1()
      /Users/jp/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.1.darwin-arm64/src/testing/testing.go:1743 +0x40

Goroutine 80 (finished) created at:
  github.com/slatedb/slatedb-go/fizztest_test.TestLinearizability()
      /Users/jp/src/slatedb-go/fizztest/mbt_test.go:455 +0xc6c
  testing.tRunner()
      /Users/jp/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.1.darwin-arm64/src/testing/testing.go:1690 +0x184
  testing.(*T).Run.gowrap1()
      /Users/jp/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.1.darwin-arm64/src/testing/testing.go:1743 +0x40
==================

The problematic code is here,

func (db *DB) FlushMemtableToL0() error {
    lastWalID := db.state.memtable.lastWalID     // <-- access lastWalID without a lock

        // ... 

and

func (db *DB) flushImmWALToMemtable(immWal ImmutableWAL, memtable *Memtable) {
    // ...

    memtable.lastWalID = mo.Some(immWal.id)  //    <-- sets lastWalID field without acquiring a lock
}
naveen246 commented 17 hours ago

I think this issue will be resolved once this PR https://github.com/slatedb/slatedb-go/pull/34 is merged.

In the updated code, we now acquire lock before accessing lastWalID

func (m *Memtable) LastWalID() mo.Option[uint64] {
    m.RLock()
    defer m.RUnlock()
    return m.lastWalID
}

func (m *Memtable) SetLastWalID(lastWalID uint64) {
    m.Lock()
    defer m.Unlock()
    m.lastWalID = mo.Some(lastWalID)
}
naveen246 commented 10 hours ago

@jayaprabhakar https://github.com/slatedb/slatedb-go/pull/34 is merged and this should be resolved. Can you please test once with the latest code