blugelabs / bluge

indexing library for Go
Apache License 2.0
1.88k stars 122 forks source link

Made in memory directory abstraction thread-safe, fixes #41 #42

Closed voldyman closed 3 years ago

voldyman commented 3 years ago

Added a RW lock to avoid concurrent writes to segment map.

I couldn't create a reliable test for this, i was encountering this race condition every 3rd time and haven't encountered it since the fix.

(I am already in the AUTHORS file)

voldyman commented 3 years ago

umm, lint complains about the return values of Lock & Unlock but these methods don't return anything.

https://godoc.org/sync#RWMutex.Lock

voldyman commented 3 years ago

Created a test to reproduce the error, stacktrace:

╰─➤  go test -race -run TestInMemoryWriterDataRace
==================
WARNING: DATA RACE
Write at 0x00c00037cf00 by goroutine 14:
  runtime.mapassign_fast64()
      /usr/local/Cellar/go/1.15.6/libexec/src/runtime/map_fast64.go:92 +0x0
  github.com/blugelabs/bluge/index.(*InMemoryDirectory).Persist()
      /Users/voldyman/dev/bluge/index/directory_mem.go:79 +0x13a
  github.com/blugelabs/bluge/index.(*Writer).merge()
      /Users/voldyman/dev/bluge/index/merge.go:368 +0x224
  github.com/blugelabs/bluge/index.(*Writer).executeMergeTask()
      /Users/voldyman/dev/bluge/index/merge.go:144 +0x86a
  github.com/blugelabs/bluge/index.(*Writer).planMergeAtSnapshot()
      /Users/voldyman/dev/bluge/index/merge.go:118 +0x474
  github.com/blugelabs/bluge/index.(*Writer).mergerLoop()
      /Users/voldyman/dev/bluge/index/merge.go:56 +0x4b1

Previous write at 0x00c00037cf00 by goroutine 13:
  runtime.mapassign_fast64()
      /usr/local/Cellar/go/1.15.6/libexec/src/runtime/map_fast64.go:92 +0x0
  github.com/blugelabs/bluge/index.(*InMemoryDirectory).Persist()
      /Users/voldyman/dev/bluge/index/directory_mem.go:79 +0x13a
  github.com/blugelabs/bluge/index.(*Writer).persistSnapshotDirect()
      /Users/voldyman/dev/bluge/index/persister.go:316 +0x2d9
  github.com/blugelabs/bluge/index.(*Writer).persistSnapshot()
      /Users/voldyman/dev/bluge/index/persister.go:233 +0x93
  github.com/blugelabs/bluge/index.(*Writer).persisterLoop()
      /Users/voldyman/dev/bluge/index/persister.go:81 +0x70b

Goroutine 14 (running) created at:
  github.com/blugelabs/bluge/index.OpenWriter()
      /Users/voldyman/dev/bluge/index/writer.go:131 +0xf6d
  github.com/blugelabs/bluge.OpenWriter()
      /Users/voldyman/dev/bluge/writer.go:36 +0x137
  github.com/blugelabs/bluge.TestInMemoryWriterDataRace()
      /Users/voldyman/dev/bluge/mem_directory_datarace_test.go:13 +0xb7
  testing.tRunner()
      /usr/local/Cellar/go/1.15.6/libexec/src/testing/testing.go:1123 +0x202

Goroutine 13 (running) created at:
  github.com/blugelabs/bluge/index.OpenWriter()
      /Users/voldyman/dev/bluge/index/writer.go:129 +0xf14
  github.com/blugelabs/bluge.OpenWriter()
      /Users/voldyman/dev/bluge/writer.go:36 +0x137
  github.com/blugelabs/bluge.TestInMemoryWriterDataRace()
      /Users/voldyman/dev/bluge/mem_directory_datarace_test.go:13 +0xb7
  testing.tRunner()
      /usr/local/Cellar/go/1.15.6/libexec/src/testing/testing.go:1123 +0x202
==================
--- FAIL: TestInMemoryWriterDataRace (0.03s)
    testing.go:1038: race detected during execution of test
FAIL
exit status 1
FAIL    github.com/blugelabs/bluge  0.305s

My original commit still had the data race because it accidentally called Lock/Unlock from current struct and not the mutex.

mschoch commented 3 years ago

Thanks I plan to take a closer look tomorrow.

mschoch commented 3 years ago

I pushed another commit moving the test into the existing index_test.go. I find that its helpful to not have unnecessary top-level go files in the project.

mschoch commented 3 years ago

Regarding the underlying issue, I think I was hoping to be able to do this without a lock. But obviously even the simple case of the merger and the persister writing to the map is a problem. So, until I have something better to propose, we'll merge this so that the in-memory writer can be used safely.