blugelabs / bluge

indexing library for Go
Apache License 2.0

ice v2 data race #121

Open mschoch opened 2 years ago

mschoch commented 2 years ago

Started seeing a bunch of races detected now as well:

WARNING: DATA RACE
Read at 0x00c0000be298 by goroutine 14:
  github.com/blugelabs/ice/v2.(*Segment).getDocStoredOffsets()
      /home/runner/go/pkg/mod/github.com/blugelabs/ice/v2@v2.0.1/read.go:46 +0x208
  github.com/blugelabs/ice/v2.(*Segment).getDocStoredMetaAndUnCompressed()
      /home/runner/go/pkg/mod/github.com/blugelabs/ice/v2@v2.0.1/read.go:22 +0x64
  github.com/blugelabs/ice/v2.(*Segment).visitDocument()
      /home/runner/go/pkg/mod/github.com/blugelabs/ice/v2@v2.0.1/segment.go:197 +0xca
  github.com/blugelabs/ice/v2.mergeStoredAndRemapSegment()
      /home/runner/go/pkg/mod/github.com/blugelabs/ice/v2@v2.0.1/merge.go:724 +0x36b
  github.com/blugelabs/ice/v2.mergeStoredAndRemap()
      /home/runner/go/pkg/mod/github.com/blugelabs/ice/v2@v2.0.1/merge.go:674 +0xa07
  github.com/blugelabs/ice/v2.mergeToWriter()
      /home/runner/go/pkg/mod/github.com/blugelabs/ice/v2@v2.0.1/merge.go:130 +0x204
  github.com/blugelabs/ice/v2.mergeSegmentBasesWriter()
      /home/runner/go/pkg/mod/github.com/blugelabs/ice/v2@v2.0.1/merge.go:96 +0x157
  github.com/blugelabs/ice/v2.merge()
      /home/runner/go/pkg/mod/github.com/blugelabs/ice/v2@v2.0.1/merge.go:85 +0x1d1
  github.com/blugelabs/ice/v2.(*Merger).WriteTo()
      /home/runner/go/pkg/mod/github.com/blugelabs/ice/v2@v2.0.1/merge.go:48 +0x190
  github.com/blugelabs/bluge/index.(*FileSystemDirectory).Persist()
      /home/runner/work/bluge/bluge/index/directory_fs.go:125 +0x2d4
  github.com/blugelabs/bluge/index.(*Writer).merge()
      /home/runner/work/bluge/bluge/index/merge.go:368 +0x224
  github.com/blugelabs/bluge/index.(*Writer).executeMergeTask()
      /home/runner/work/bluge/bluge/index/merge.go:144 +0x86a
  github.com/blugelabs/bluge/index.(*Writer).planMergeAtSnapshot()
      /home/runner/work/bluge/bluge/index/merge.go:118 +0x474
  github.com/blugelabs/bluge/index.(*Writer).mergerLoop()
      /home/runner/work/bluge/bluge/index/merge.go:56 +0x4c4

Previous write at 0x00c0000be298 by goroutine 26:
  github.com/blugelabs/ice/v2.(*Segment).getDocStoredOffsets()
      /home/runner/go/pkg/mod/github.com/blugelabs/ice/v2@v2.0.1/read.go:47 +0x30e
  github.com/blugelabs/ice/v2.(*Segment).getDocStoredMetaAndUnCompressed()
      /home/runner/go/pkg/mod/github.com/blugelabs/ice/v2@v2.0.1/read.go:22 +0x64
  github.com/blugelabs/ice/v2.(*Segment).visitDocument()
      /home/runner/go/pkg/mod/github.com/blugelabs/ice/v2@v2.0.1/segment.go:197 +0xca
  github.com/blugelabs/ice/v2.(*Segment).VisitStoredFields()
      /home/runner/go/pkg/mod/github.com/blugelabs/ice/v2@v2.0.1/segment.go:190 +0x104
  github.com/blugelabs/bluge/index.(*segmentSnapshot).VisitDocument()
      /home/runner/work/bluge/bluge/index/segment.go:61 +0x3d4
  github.com/blugelabs/bluge/index.(*Snapshot).VisitStoredFields()
      /home/runner/work/bluge/bluge/index/snapshot.go:307 +0x2a7
  github.com/blugelabs/bluge/search.(*DocumentMatch).VisitStoredFields()
      /home/runner/work/bluge/bluge/search/search.go:134 +0x38e
  github.com/blugelabs/bluge/test.collectHits()
      /home/runner/work/bluge/bluge/test/integration_test.go:46 +0x2ae
  github.com/blugelabs/bluge/test.TestIntegration.func1()
      /home/runner/work/bluge/bluge/test/integration_test.go:150 +0x364
  testing.tRunner()
      /opt/hostedtoolcache/go/1.15.15/x64/src/testing/testing.go:1123 +0x202

Goroutine 14 (running) created at:
  github.com/blugelabs/bluge/index.OpenWriter()
      /home/runner/work/bluge/bluge/index/writer.go:131 +0xf8d
  github.com/blugelabs/bluge.OpenWriter()
      /home/runner/work/bluge/bluge/writer.go:36 +0x137
  github.com/blugelabs/bluge/test.TestIntegration()
      /home/runner/work/bluge/bluge/test/integration_test.go:127 +0x784
  testing.tRunner()
      /opt/hostedtoolcache/go/1.15.15/x64/src/testing/testing.go:1123 +0x202

Goroutine 26 (finished) created at:
  testing.(*T).Run()
      /opt/hostedtoolcache/go/1.15.15/x64/src/testing/testing.go:1168 +0x5bb
  github.com/blugelabs/bluge/test.TestIntegration()
      /home/runner/work/bluge/bluge/test/integration_test.go:141 +0x207
  testing.tRunner()
      /opt/hostedtoolcache/go/1.15.15/x64/src/testing/testing.go:1123 +0x202
==================
==================
WARNING: DATA RACE
Write at 0x00c0007b8000 by goroutine 14:
  runtime.slicecopy()
      /opt/hostedtoolcache/go/1.15.15/x64/src/runtime/slice.go:246 +0x0
  github.com/klauspost/compress/zstd.(*blockDec).decodeCompressed()
      /home/runner/go/pkg/mod/github.com/klauspost/compress@v1.15.2/zstd/blockdec.go:494 +0x207
  github.com/klauspost/compress/zstd.(*blockDec).decodeBuf()
      /home/runner/go/pkg/mod/github.com/klauspost/compress@v1.15.2/zstd/blockdec.go:262 +0x98e
  github.com/klauspost/compress/zstd.(*frameDec).runDecoder()
      /home/runner/go/pkg/mod/github.com/klauspost/compress@v1.15.2/zstd/framedec.go:351 +0x22a
  github.com/klauspost/compress/zstd.(*Decoder).DecodeAll()
      /home/runner/go/pkg/mod/github.com/klauspost/compress@v1.15.2/zstd/decoder.go:378 +0x368
  github.com/blugelabs/ice/v2.ZSTDDecompress()
      /home/runner/go/pkg/mod/github.com/blugelabs/ice/v2@v2.0.1/zstd.go:44 +0xce
  github.com/blugelabs/ice/v2.(*Segment).getDocStoredOffsets()
      /home/runner/go/pkg/mod/github.com/blugelabs/ice/v2@v2.0.1/read.go:47 +0x2bc
  github.com/blugelabs/ice/v2.(*Segment).getDocStoredMetaAndUnCompressed()
      /home/runner/go/pkg/mod/github.com/blugelabs/ice/v2@v2.0.1/read.go:22 +0x64
  github.com/blugelabs/ice/v2.(*Segment).visitDocument()
      /home/runner/go/pkg/mod/github.com/blugelabs/ice/v2@v2.0.1/segment.go:197 +0xca
  github.com/blugelabs/ice/v2.mergeStoredAndRemapSegment()
      /home/runner/go/pkg/mod/github.com/blugelabs/ice/v2@v2.0.1/merge.go:724 +0x36b
  github.com/blugelabs/ice/v2.mergeStoredAndRemap()
      /home/runner/go/pkg/mod/github.com/blugelabs/ice/v2@v2.0.1/merge.go:674 +0xa07
  github.com/blugelabs/ice/v2.mergeToWriter()
      /home/runner/go/pkg/mod/github.com/blugelabs/ice/v2@v2.0.1/merge.go:130 +0x204
  github.com/blugelabs/ice/v2.mergeSegmentBasesWriter()
      /home/runner/go/pkg/mod/github.com/blugelabs/ice/v2@v2.0.1/merge.go:96 +0x157
  github.com/blugelabs/ice/v2.merge()
      /home/runner/go/pkg/mod/github.com/blugelabs/ice/v2@v2.0.1/merge.go:85 +0x1d1
  github.com/blugelabs/ice/v2.(*Merger).WriteTo()
      /home/runner/go/pkg/mod/github.com/blugelabs/ice/v2@v2.0.1/merge.go:48 +0x190
  github.com/blugelabs/bluge/index.(*FileSystemDirectory).Persist()
      /home/runner/work/bluge/bluge/index/directory_fs.go:125 +0x2d4
  github.com/blugelabs/bluge/index.(*Writer).merge()
      /home/runner/work/bluge/bluge/index/merge.go:368 +0x224
  github.com/blugelabs/bluge/index.(*Writer).executeMergeTask()
      /home/runner/work/bluge/bluge/index/merge.go:144 +0x86a
  github.com/blugelabs/bluge/index.(*Writer).planMergeAtSnapshot()
      /home/runner/work/bluge/bluge/index/merge.go:118 +0x474
  github.com/blugelabs/bluge/index.(*Writer).mergerLoop()
      /home/runner/work/bluge/bluge/index/merge.go:56 +0x4c4

Previous read at 0x00c0007b8007 by goroutine 26:
  bytes.(*Reader).ReadByte()
      /opt/hostedtoolcache/go/1.15.15/x64/src/bytes/reader.go:72 +0xe9
  encoding/binary.ReadUvarint()
      /opt/hostedtoolcache/go/1.15.15/x64/src/encoding/binary/varint.go:110 +0x92
  github.com/blugelabs/ice/v2.(*Segment).visitDocument()
      /home/runner/go/pkg/mod/github.com/blugelabs/ice/v2@v2.0.1/segment.go:217 +0x324
  github.com/blugelabs/ice/v2.(*Segment).VisitStoredFields()
      /home/runner/go/pkg/mod/github.com/blugelabs/ice/v2@v2.0.1/segment.go:190 +0x104
  github.com/blugelabs/bluge/index.(*segmentSnapshot).VisitDocument()
      /home/runner/work/bluge/bluge/index/segment.go:61 +0x3d4
  github.com/blugelabs/bluge/index.(*Snapshot).VisitStoredFields()
      /home/runner/work/bluge/bluge/index/snapshot.go:307 +0x2a7
  github.com/blugelabs/bluge/search.(*DocumentMatch).VisitStoredFields()
      /home/runner/work/bluge/bluge/search/search.go:134 +0x38e
  github.com/blugelabs/bluge/test.collectHits()
      /home/runner/work/bluge/bluge/test/integration_test.go:46 +0x2ae
  github.com/blugelabs/bluge/test.TestIntegration.func1()
      /home/runner/work/bluge/bluge/test/integration_test.go:150 +0x364
  testing.tRunner()
      /opt/hostedtoolcache/go/1.15.15/x64/src/testing/testing.go:1123 +0x202

Goroutine 14 (running) created at:
  github.com/blugelabs/bluge/index.OpenWriter()
      /home/runner/work/bluge/bluge/index/writer.go:131 +0xf8d
  github.com/blugelabs/bluge.OpenWriter()
      /home/runner/work/bluge/bluge/writer.go:36 +0x137
  github.com/blugelabs/bluge/test.TestIntegration()
      /home/runner/work/bluge/bluge/test/integration_test.go:127 +0x784
  testing.tRunner()
      /opt/hostedtoolcache/go/1.15.15/x64/src/testing/testing.go:1123 +0x202

Goroutine 26 (finished) created at:
  testing.(*T).Run()
      /opt/hostedtoolcache/go/1.15.15/x64/src/testing/testing.go:1168 +0x5bb
  github.com/blugelabs/bluge/test.TestIntegration()
      /home/runner/work/bluge/bluge/test/integration_test.go:141 +0x207
  testing.tRunner()
      /opt/hostedtoolcache/go/1.15.15/x64/src/testing/testing.go:1123 +0x202
==================
==================
WARNING: DATA RACE
Write at 0x00c0007b8010 by goroutine 14:
  runtime.slicecopy()
      /opt/hostedtoolcache/go/1.15.15/x64/src/runtime/slice.go:246 +0x0
  github.com/klauspost/compress/zstd.(*blockDec).decodeCompressed()
      /home/runner/go/pkg/mod/github.com/klauspost/compress@v1.15.2/zstd/blockdec.go:494 +0x207
  github.com/klauspost/compress/zstd.(*blockDec).decodeBuf()
      /home/runner/go/pkg/mod/github.com/klauspost/compress@v1.15.2/zstd/blockdec.go:262 +0x98e
  github.com/klauspost/compress/zstd.(*frameDec).runDecoder()
      /home/runner/go/pkg/mod/github.com/klauspost/compress@v1.15.2/zstd/framedec.go:351 +0x22a
  github.com/klauspost/compress/zstd.(*Decoder).DecodeAll()
      /home/runner/go/pkg/mod/github.com/klauspost/compress@v1.15.2/zstd/decoder.go:378 +0x368
  github.com/blugelabs/ice/v2.ZSTDDecompress()
      /home/runner/go/pkg/mod/github.com/blugelabs/ice/v2@v2.0.1/zstd.go:44 +0xce
  github.com/blugelabs/ice/v2.(*Segment).getDocStoredOffsets()
      /home/runner/go/pkg/mod/github.com/blugelabs/ice/v2@v2.0.1/read.go:47 +0x2bc
  github.com/blugelabs/ice/v2.(*Segment).getDocStoredMetaAndUnCompressed()
      /home/runner/go/pkg/mod/github.com/blugelabs/ice/v2@v2.0.1/read.go:22 +0x64
  github.com/blugelabs/ice/v2.(*Segment).visitDocument()
      /home/runner/go/pkg/mod/github.com/blugelabs/ice/v2@v2.0.1/segment.go:197 +0xca
  github.com/blugelabs/ice/v2.mergeStoredAndRemapSegment()
      /home/runner/go/pkg/mod/github.com/blugelabs/ice/v2@v2.0.1/merge.go:724 +0x36b
  github.com/blugelabs/ice/v2.mergeStoredAndRemap()
      /home/runner/go/pkg/mod/github.com/blugelabs/ice/v2@v2.0.1/merge.go:674 +0xa07
  github.com/blugelabs/ice/v2.mergeToWriter()
      /home/runner/go/pkg/mod/github.com/blugelabs/ice/v2@v2.0.1/merge.go:130 +0x204
  github.com/blugelabs/ice/v2.mergeSegmentBasesWriter()
      /home/runner/go/pkg/mod/github.com/blugelabs/ice/v2@v2.0.1/merge.go:96 +0x157
  github.com/blugelabs/ice/v2.merge()
      /home/runner/go/pkg/mod/github.com/blugelabs/ice/v2@v2.0.1/merge.go:85 +0x1d1
  github.com/blugelabs/ice/v2.(*Merger).WriteTo()
      /home/runner/go/pkg/mod/github.com/blugelabs/ice/v2@v2.0.1/merge.go:48 +0x190
  github.com/blugelabs/bluge/index.(*FileSystemDirectory).Persist()
      /home/runner/work/bluge/bluge/index/directory_fs.go:125 +0x2d4
  github.com/blugelabs/bluge/index.(*Writer).merge()
      /home/runner/work/bluge/bluge/index/merge.go:368 +0x224
  github.com/blugelabs/bluge/index.(*Writer).executeMergeTask()
      /home/runner/work/bluge/bluge/index/merge.go:144 +0x86a
  github.com/blugelabs/bluge/index.(*Writer).planMergeAtSnapshot()
      /home/runner/work/bluge/bluge/index/merge.go:118 +0x474
  github.com/blugelabs/bluge/index.(*Writer).mergerLoop()
      /home/runner/work/bluge/bluge/index/merge.go:56 +0x4c4

Previous write at 0x00c0007b8010 by goroutine 26:
  runtime.slicecopy()
      /opt/hostedtoolcache/go/1.15.15/x64/src/runtime/slice.go:246 +0x0
  github.com/klauspost/compress/zstd.(*blockDec).decodeCompressed()
      /home/runner/go/pkg/mod/github.com/klauspost/compress@v1.15.2/zstd/blockdec.go:494 +0x207
  github.com/klauspost/compress/zstd.(*blockDec).decodeBuf()
      /home/runner/go/pkg/mod/github.com/klauspost/compress@v1.15.2/zstd/blockdec.go:262 +0x98e
  github.com/klauspost/compress/zstd.(*frameDec).runDecoder()
      /home/runner/go/pkg/mod/github.com/klauspost/compress@v1.15.2/zstd/framedec.go:351 +0x22a
  github.com/klauspost/compress/zstd.(*Decoder).DecodeAll()
      /home/runner/go/pkg/mod/github.com/klauspost/compress@v1.15.2/zstd/decoder.go:378 +0x368
  github.com/blugelabs/ice/v2.ZSTDDecompress()
      /home/runner/go/pkg/mod/github.com/blugelabs/ice/v2@v2.0.1/zstd.go:44 +0xce
  github.com/blugelabs/ice/v2.(*Segment).getDocStoredOffsets()
      /home/runner/go/pkg/mod/github.com/blugelabs/ice/v2@v2.0.1/read.go:47 +0x2bc
  github.com/blugelabs/ice/v2.(*Segment).getDocStoredMetaAndUnCompressed()
      /home/runner/go/pkg/mod/github.com/blugelabs/ice/v2@v2.0.1/read.go:22 +0x64
  github.com/blugelabs/ice/v2.(*Segment).visitDocument()
      /home/runner/go/pkg/mod/github.com/blugelabs/ice/v2@v2.0.1/segment.go:197 +0xca
  github.com/blugelabs/ice/v2.(*Segment).VisitStoredFields()
      /home/runner/go/pkg/mod/github.com/blugelabs/ice/v2@v2.0.1/segment.go:190 +0x104
  github.com/blugelabs/bluge/index.(*segmentSnapshot).VisitDocument()
      /home/runner/work/bluge/bluge/index/segment.go:61 +0x3d4
  github.com/blugelabs/bluge/index.(*Snapshot).VisitStoredFields()
      /home/runner/work/bluge/bluge/index/snapshot.go:307 +0x2a7
  github.com/blugelabs/bluge/search.(*DocumentMatch).VisitStoredFields()
      /home/runner/work/bluge/bluge/search/search.go:134 +0x38e
  github.com/blugelabs/bluge/test.collectHits()
      /home/runner/work/bluge/bluge/test/integration_test.go:46 +0x2ae
  github.com/blugelabs/bluge/test.TestIntegration.func1()
      /home/runner/work/bluge/bluge/test/integration_test.go:150 +0x364
  testing.tRunner()
      /opt/hostedtoolcache/go/1.15.15/x64/src/testing/testing.go:1123 +0x202

Goroutine 14 (running) created at:
  github.com/blugelabs/bluge/index.OpenWriter()
      /home/runner/work/bluge/bluge/index/writer.go:131 +0xf8d
  github.com/blugelabs/bluge.OpenWriter()
      /home/runner/work/bluge/bluge/writer.go:36 +0x137
  github.com/blugelabs/bluge/test.TestIntegration()
      /home/runner/work/bluge/bluge/test/integration_test.go:127 +0x784
  testing.tRunner()
      /opt/hostedtoolcache/go/1.15.15/x64/src/testing/testing.go:1123 +0x202

Goroutine 26 (finished) created at:
  testing.(*T).Run()
      /opt/hostedtoolcache/go/1.15.15/x64/src/testing/testing.go:1168 +0x5bb
  github.com/blugelabs/bluge/test.TestIntegration()
      /home/runner/work/bluge/bluge/test/integration_test.go:141 +0x207
  testing.tRunner()
      /opt/hostedtoolcache/go/1.15.15/x64/src/testing/testing.go:1123 +0x202
==================
==================
WARNING: DATA RACE
Write at 0x00c0007b8020 by goroutine 14:
  runtime.slicecopy()
      /opt/hostedtoolcache/go/1.15.15/x64/src/runtime/slice.go:246 +0x0
  github.com/klauspost/compress/zstd.(*blockDec).decodeCompressed()
      /home/runner/go/pkg/mod/github.com/klauspost/compress@v1.15.2/zstd/blockdec.go:494 +0x207
  github.com/klauspost/compress/zstd.(*blockDec).decodeBuf()
      /home/runner/go/pkg/mod/github.com/klauspost/compress@v1.15.2/zstd/blockdec.go:262 +0x98e
  github.com/klauspost/compress/zstd.(*frameDec).runDecoder()
      /home/runner/go/pkg/mod/github.com/klauspost/compress@v1.15.2/zstd/framedec.go:351 +0x22a
  github.com/klauspost/compress/zstd.(*Decoder).DecodeAll()
      /home/runner/go/pkg/mod/github.com/klauspost/compress@v1.15.2/zstd/decoder.go:378 +0x368
  github.com/blugelabs/ice/v2.ZSTDDecompress()
      /home/runner/go/pkg/mod/github.com/blugelabs/ice/v2@v2.0.1/zstd.go:44 +0xce
  github.com/blugelabs/ice/v2.(*Segment).getDocStoredOffsets()
      /home/runner/go/pkg/mod/github.com/blugelabs/ice/v2@v2.0.1/read.go:47 +0x2bc
  github.com/blugelabs/ice/v2.(*Segment).getDocStoredMetaAndUnCompressed()
      /home/runner/go/pkg/mod/github.com/blugelabs/ice/v2@v2.0.1/read.go:22 +0x64
  github.com/blugelabs/ice/v2.(*Segment).visitDocument()
      /home/runner/go/pkg/mod/github.com/blugelabs/ice/v2@v2.0.1/segment.go:197 +0xca
  github.com/blugelabs/ice/v2.mergeStoredAndRemapSegment()
      /home/runner/go/pkg/mod/github.com/blugelabs/ice/v2@v2.0.1/merge.go:724 +0x36b
  github.com/blugelabs/ice/v2.mergeStoredAndRemap()
      /home/runner/go/pkg/mod/github.com/blugelabs/ice/v2@v2.0.1/merge.go:674 +0xa07
  github.com/blugelabs/ice/v2.mergeToWriter()
      /home/runner/go/pkg/mod/github.com/blugelabs/ice/v2@v2.0.1/merge.go:130 +0x204
  github.com/blugelabs/ice/v2.mergeSegmentBasesWriter()
      /home/runner/go/pkg/mod/github.com/blugelabs/ice/v2@v2.0.1/merge.go:96 +0x157
  github.com/blugelabs/ice/v2.merge()
      /home/runner/go/pkg/mod/github.com/blugelabs/ice/v2@v2.0.1/merge.go:85 +0x1d1
  github.com/blugelabs/ice/v2.(*Merger).WriteTo()
      /home/runner/go/pkg/mod/github.com/blugelabs/ice/v2@v2.0.1/merge.go:48 +0x190
  github.com/blugelabs/bluge/index.(*FileSystemDirectory).Persist()
      /home/runner/work/bluge/bluge/index/directory_fs.go:125 +0x2d4
  github.com/blugelabs/bluge/index.(*Writer).merge()
      /home/runner/work/bluge/bluge/index/merge.go:368 +0x224
  github.com/blugelabs/bluge/index.(*Writer).executeMergeTask()
      /home/runner/work/bluge/bluge/index/merge.go:144 +0x86a
  github.com/blugelabs/bluge/index.(*Writer).planMergeAtSnapshot()
      /home/runner/work/bluge/bluge/index/merge.go:118 +0x474
  github.com/blugelabs/bluge/index.(*Writer).mergerLoop()
      /home/runner/work/bluge/bluge/index/merge.go:56 +0x4c4

Previous read at 0x00c0007b8026 by goroutine 26:
  runtime.slicecopy()
      /opt/hostedtoolcache/go/1.15.15/x64/src/runtime/slice.go:246 +0x0
  github.com/blugelabs/bluge/test.collectHits.func1()
      /home/runner/work/bluge/bluge/test/integration_test.go:48 +0xb8
  github.com/blugelabs/bluge/index.(*Snapshot).VisitStoredFields.func1()
      /home/runner/work/bluge/bluge/index/snapshot.go:308 +0x72
  github.com/blugelabs/ice/v2.(*Segment).visitDocument()
      /home/runner/go/pkg/mod/github.com/blugelabs/ice/v2@v2.0.1/segment.go:223 +0x242
  github.com/blugelabs/ice/v2.(*Segment).VisitStoredFields()
      /home/runner/go/pkg/mod/github.com/blugelabs/ice/v2@v2.0.1/segment.go:190 +0x104
  github.com/blugelabs/bluge/index.(*segmentSnapshot).VisitDocument()
      /home/runner/work/bluge/bluge/index/segment.go:61 +0x3d4
  github.com/blugelabs/bluge/index.(*Snapshot).VisitStoredFields()
      /home/runner/work/bluge/bluge/index/snapshot.go:307 +0x2a7
  github.com/blugelabs/bluge/search.(*DocumentMatch).VisitStoredFields()
      /home/runner/work/bluge/bluge/search/search.go:134 +0x38e
  github.com/blugelabs/bluge/test.collectHits()
      /home/runner/work/bluge/bluge/test/integration_test.go:46 +0x2ae
  github.com/blugelabs/bluge/test.TestIntegration.func1()
      /home/runner/work/bluge/bluge/test/integration_test.go:150 +0x364
  testing.tRunner()
      /opt/hostedtoolcache/go/1.15.15/x64/src/testing/testing.go:1123 +0x202

Goroutine 14 (running) created at:
  github.com/blugelabs/bluge/index.OpenWriter()
      /home/runner/work/bluge/bluge/index/writer.go:131 +0xf8d
  github.com/blugelabs/bluge.OpenWriter()
      /home/runner/work/bluge/bluge/writer.go:36 +0x137
  github.com/blugelabs/bluge/test.TestIntegration()
      /home/runner/work/bluge/bluge/test/integration_test.go:127 +0x784
  testing.tRunner()
      /opt/hostedtoolcache/go/1.15.15/x64/src/testing/testing.go:1123 +0x202

Goroutine 26 (finished) created at:
  testing.(*T).Run()
      /opt/hostedtoolcache/go/1.15.15/x64/src/testing/testing.go:1168 +0x5bb
  github.com/blugelabs/bluge/test.TestIntegration()
      /home/runner/work/bluge/bluge/test/integration_test.go:141 +0x207
  testing.tRunner()
      /opt/hostedtoolcache/go/1.15.15/x64/src/testing/testing.go:1123 +0x202
==================
==================
WARNING: DATA RACE
Write at 0x00c0007b8030 by goroutine 14:
  runtime.slicecopy()
      /opt/hostedtoolcache/go/1.15.15/x64/src/runtime/slice.go:246 +0x0
  github.com/klauspost/compress/zstd.(*blockDec).decodeCompressed()
      /home/runner/go/pkg/mod/github.com/klauspost/compress@v1.15.2/zstd/blockdec.go:494 +0x207
  github.com/klauspost/compress/zstd.(*blockDec).decodeBuf()
      /home/runner/go/pkg/mod/github.com/klauspost/compress@v1.15.2/zstd/blockdec.go:262 +0x98e
  github.com/klauspost/compress/zstd.(*frameDec).runDecoder()
      /home/runner/go/pkg/mod/github.com/klauspost/compress@v1.15.2/zstd/framedec.go:351 +0x22a
  github.com/klauspost/compress/zstd.(*Decoder).DecodeAll()
      /home/runner/go/pkg/mod/github.com/klauspost/compress@v1.15.2/zstd/decoder.go:378 +0x368
  github.com/blugelabs/ice/v2.ZSTDDecompress()
      /home/runner/go/pkg/mod/github.com/blugelabs/ice/v2@v2.0.1/zstd.go:44 +0xce
  github.com/blugelabs/ice/v2.(*Segment).getDocStoredOffsets()
      /home/runner/go/pkg/mod/github.com/blugelabs/ice/v2@v2.0.1/read.go:47 +0x2bc
  github.com/blugelabs/ice/v2.(*Segment).getDocStoredMetaAndUnCompressed()
      /home/runner/go/pkg/mod/github.com/blugelabs/ice/v2@v2.0.1/read.go:22 +0x64
  github.com/blugelabs/ice/v2.(*Segment).visitDocument()
      /home/runner/go/pkg/mod/github.com/blugelabs/ice/v2@v2.0.1/segment.go:197 +0xca
  github.com/blugelabs/ice/v2.mergeStoredAndRemapSegment()
      /home/runner/go/pkg/mod/github.com/blugelabs/ice/v2@v2.0.1/merge.go:724 +0x36b
  github.com/blugelabs/ice/v2.mergeStoredAndRemap()
      /home/runner/go/pkg/mod/github.com/blugelabs/ice/v2@v2.0.1/merge.go:674 +0xa07
  github.com/blugelabs/ice/v2.mergeToWriter()
      /home/runner/go/pkg/mod/github.com/blugelabs/ice/v2@v2.0.1/merge.go:130 +0x204
  github.com/blugelabs/ice/v2.mergeSegmentBasesWriter()
      /home/runner/go/pkg/mod/github.com/blugelabs/ice/v2@v2.0.1/merge.go:96 +0x157
  github.com/blugelabs/ice/v2.merge()
      /home/runner/go/pkg/mod/github.com/blugelabs/ice/v2@v2.0.1/merge.go:85 +0x1d1
  github.com/blugelabs/ice/v2.(*Merger).WriteTo()
      /home/runner/go/pkg/mod/github.com/blugelabs/ice/v2@v2.0.1/merge.go:48 +0x190
  github.com/blugelabs/bluge/index.(*FileSystemDirectory).Persist()
      /home/runner/work/bluge/bluge/index/directory_fs.go:125 +0x2d4
  github.com/blugelabs/bluge/index.(*Writer).merge()
      /home/runner/work/bluge/bluge/index/merge.go:368 +0x224
  github.com/blugelabs/bluge/index.(*Writer).executeMergeTask()
      /home/runner/work/bluge/bluge/index/merge.go:144 +0x86a
  github.com/blugelabs/bluge/index.(*Writer).planMergeAtSnapshot()
      /home/runner/work/bluge/bluge/index/merge.go:118 +0x474
  github.com/blugelabs/bluge/index.(*Writer).mergerLoop()
      /home/runner/work/bluge/bluge/index/merge.go:56 +0x4c4

Previous read at 0x00c0007b8033 by goroutine 26:
  bytes.(*Reader).ReadByte()
      /opt/hostedtoolcache/go/1.15.15/x64/src/bytes/reader.go:72 +0xe9
  encoding/binary.ReadUvarint()
      /opt/hostedtoolcache/go/1.15.15/x64/src/encoding/binary/varint.go:110 +0x92
  github.com/blugelabs/ice/v2.(*Segment).visitDocument()
      /home/runner/go/pkg/mod/github.com/blugelabs/ice/v2@v2.0.1/segment.go:213 +0x2e5
  github.com/blugelabs/ice/v2.(*Segment).VisitStoredFields()
      /home/runner/go/pkg/mod/github.com/blugelabs/ice/v2@v2.0.1/segment.go:190 +0x104
  github.com/blugelabs/bluge/index.(*segmentSnapshot).VisitDocument()
      /home/runner/work/bluge/bluge/index/segment.go:61 +0x3d4
  github.com/blugelabs/bluge/index.(*Snapshot).VisitStoredFields()
      /home/runner/work/bluge/bluge/index/snapshot.go:307 +0x2a7
  github.com/blugelabs/bluge/search.(*DocumentMatch).VisitStoredFields()
      /home/runner/work/bluge/bluge/search/search.go:134 +0x38e
  github.com/blugelabs/bluge/test.collectHits()
      /home/runner/work/bluge/bluge/test/integration_test.go:46 +0x2ae
  github.com/blugelabs/bluge/test.TestIntegration.func1()
      /home/runner/work/bluge/bluge/test/integration_test.go:150 +0x364
  testing.tRunner()
      /opt/hostedtoolcache/go/1.15.15/x64/src/testing/testing.go:1123 +0x202

Goroutine 14 (running) created at:
  github.com/blugelabs/bluge/index.OpenWriter()
      /home/runner/work/bluge/bluge/index/writer.go:131 +0xf8d
  github.com/blugelabs/bluge.OpenWriter()
      /home/runner/work/bluge/bluge/writer.go:36 +0x137
  github.com/blugelabs/bluge/test.TestIntegration()
      /home/runner/work/bluge/bluge/test/integration_test.go:127 +0x784
  testing.tRunner()
      /opt/hostedtoolcache/go/1.15.15/x64/src/testing/testing.go:1123 +0x202

Goroutine 26 (finished) created at:
  testing.(*T).Run()
      /opt/hostedtoolcache/go/1.15.15/x64/src/testing/testing.go:1168 +0x5bb
  github.com/blugelabs/bluge/test.TestIntegration()
      /home/runner/work/bluge/bluge/test/integration_test.go:141 +0x207
  testing.tRunner()
      /opt/hostedtoolcache/go/1.15.15/x64/src/testing/testing.go:1123 +0x202
==================
--- FAIL: TestIntegration (3.05s)
Error:     integration_test.go:117: testdir: /tmp/bluge-integration-test-basic405355639
    --- FAIL: TestIntegration/basic-test_numeric_range,_upper_and_lower_bounds (0.00s)
Error:         testing.go:1038: race detected during execution of test
Error:     integration_test.go:117: testdir: /tmp/bluge-integration-test-sort966111594
Error:     integration_test.go:117: testdir: /tmp/bluge-integration-test-fosdem554906049
Error:     integration_test.go:117: testdir: /tmp/bluge-integration-test-geo813274668
Error:     integration_test.go:117: testdir: /tmp/bluge-integration-test-phrase008033179
Error:     integration_test.go:117: testdir: /tmp/bluge-integration-test-aggregations042487358
Error:     testing.go:1038: race detected during execution of test
FAIL
FAIL    github.com/blugelabs/bluge/test 3.118s
FAIL
Error: Process completed with exit code 1.

mschoch commented 2 years ago

cc @hengfeiyang

mschoch commented 2 years ago

@hengfeiyang I think the issue is that Segments are shared, so when you mutate storedFieldChunkUncompressed we have potential data races:

https://github.com/blugelabs/ice/blob/d830a812e60591ce0955fdeabd483f0ebf537ebd/read.go#L46-L47

If we must do it this way, you'll have to protect access to it with a mutex, like the fieldFSTs:

https://github.com/blugelabs/ice/blob/master/segment.go#L52-L53
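
As a rough illustration of the mutex option, here is a minimal sketch. It is not the ice code: `segment`, `chunkCopy`, and `decompress` are hypothetical stand-ins, and `decompress` just copies bytes where the real code would call zstd. The lock has to cover every use of the shared buffer, so the sketch hands each caller a private copy. Returning the shared slice itself would let a caller keep reading it after the lock is released, which looks like the pattern in the traces above, where one goroutine reads the decoded bytes while the merger overwrites them.

```go
package main

import (
	"fmt"
	"sync"
)

// segment is a hypothetical stand-in for a shared segment; only the fields
// needed to show the locking pattern are included.
type segment struct {
	chunks [][]byte // stand-in for the compressed stored-field chunks

	m           sync.Mutex
	cachedChunk int    // index of the chunk currently held in buf, -1 if none
	buf         []byte // decompression buffer shared by all callers
}

// decompress is a placeholder for the real zstd call; it only copies bytes.
func decompress(dst, src []byte) []byte {
	return append(dst[:0], src...)
}

// chunkCopy returns a private copy of chunk i. The shared buffer is only
// touched while the mutex is held.
func (s *segment) chunkCopy(i int) []byte {
	s.m.Lock()
	defer s.m.Unlock()
	if s.cachedChunk != i {
		s.buf = decompress(s.buf, s.chunks[i])
		s.cachedChunk = i
	}
	out := make([]byte, len(s.buf))
	copy(out, s.buf)
	return out
}

func main() {
	s := &segment{
		chunks:      [][]byte{[]byte("chunk-0"), []byte("chunk-1")},
		cachedChunk: -1,
	}
	var wg sync.WaitGroup
	for g := 0; g < 8; g++ {
		wg.Add(1)
		go func(g int) {
			defer wg.Done()
			i := g % 2
			if got := string(s.chunkCopy(i)); got != fmt.Sprintf("chunk-%d", i) {
				panic("unexpected chunk contents: " + got)
			}
		}(g)
	}
	wg.Wait()
	fmt.Println("all readers saw consistent chunks")
}
```

The extra copy per call is the price of sharing one buffer, and every reader of the segment serializes on the same lock.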

Alternatively I wonder, can't we arrange this so that we decompress once, and then just reuse it? I'm not sure exactly how that code would need to look to be safe from races, but it doesn't make sense that we'd ever intentionally decompress the same compressed bytes again, right?

mschoch commented 2 years ago

Seems like there are 2 choices:

  1. uncompress at open (wasteful if we never match documents in this segment), but this avoids a lock.
  2. uncompress on first use, which needs a lock (possibly too much overhead, because we always need to load _id for matches)

You could also uncompress every time without saving, but pretty sure that isn't a useful option.
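
For reference, the second choice could be sketched with `sync.Once` from the Go standard library, which synchronizes the first use and keeps later calls cheap. This is a sketch under assumptions, not the ice implementation: `lazyBlob`, `bytes`, and `decompress` are hypothetical names, `decompress` stands in for the real zstd call, and the sketch assumes a single compressed blob per segment.

```go
package main

import (
	"fmt"
	"sync"
)

// lazyBlob is a hypothetical holder for one compressed blob that is
// decompressed at most once, on first use.
type lazyBlob struct {
	compressed []byte

	once    sync.Once
	data    []byte // written exactly once, inside once.Do
	decodes int    // counts decompress calls, for illustration only
}

// decompress is a placeholder for the real zstd call; it only copies bytes.
func decompress(src []byte) []byte {
	return append([]byte(nil), src...)
}

// bytes returns the decompressed data. Callers must treat the result as
// read-only, because every caller receives the same slice.
func (b *lazyBlob) bytes() []byte {
	b.once.Do(func() {
		b.data = decompress(b.compressed)
		b.decodes++
	})
	return b.data
}

func main() {
	b := &lazyBlob{compressed: []byte("stored fields")}
	var wg sync.WaitGroup
	for g := 0; g < 16; g++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			if string(b.bytes()) != "stored fields" {
				panic("unexpected contents")
			}
		}()
	}
	wg.Wait()
	fmt.Println("decompress calls:", b.decodes)
}
```

The first choice is the same idea with the `decompress` call moved into whatever opens the segment, so no synchronization is needed afterwards.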

mschoch commented 2 years ago

Oh I see, it's actually a bigger problem. You only uncompress one chunk at a time, so the contents of that buffer actually change depending on what documents you try to access.

In that case I think we can't cache/reuse the buffer inside the segment. They are intended to be heavily shared (you could have a hundred queries all hitting that segment at one time), so I don't think it makes sense to share that buffer and coordinate access with a lock.

And that makes it seem like the current API isn't going to work very well with the stored docs compressed in chunks. I believe that was the reason we compressed docs individually in the past, even though it results in much lower compression. So, I think if you want to go with this storage format, you should also propose an API to access it efficiently.
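
One possible shape for such an API, purely as a sketch: keep the segment immutable and move the decompression buffer into a small reader object that each query or merge creates for itself. Everything here is hypothetical (`storedReader`, `newStoredReader`, `chunkFor`, and the fixed `docsPerChunk` mapping), and `decompress` is a stand-in for the real zstd call.

```go
package main

import (
	"fmt"
	"sync"
)

// segment is a hypothetical immutable segment: nothing in it is written
// after construction, so it can be shared freely across goroutines.
type segment struct {
	chunks       [][]byte // stand-in for the compressed stored-field chunks
	docsPerChunk int      // assumed fixed mapping from doc number to chunk
}

// storedReader owns its own buffer. It is not safe for concurrent use;
// each goroutine is expected to create its own.
type storedReader struct {
	seg   *segment
	chunk int    // index of the chunk currently in buf, -1 if none
	buf   []byte // private decompression buffer
}

// decompress is a placeholder for the real zstd call; it only copies bytes.
func decompress(dst, src []byte) []byte {
	return append(dst[:0], src...)
}

func (s *segment) newStoredReader() *storedReader {
	return &storedReader{seg: s, chunk: -1}
}

// chunkFor returns the decompressed chunk holding docNum, reusing the
// reader's buffer when consecutive lookups fall in the same chunk.
func (r *storedReader) chunkFor(docNum int) []byte {
	i := docNum / r.seg.docsPerChunk
	if i != r.chunk {
		r.buf = decompress(r.buf, r.seg.chunks[i])
		r.chunk = i
	}
	return r.buf
}

func main() {
	seg := &segment{
		chunks:       [][]byte{[]byte("docs 0-1"), []byte("docs 2-3")},
		docsPerChunk: 2,
	}
	want := []string{"docs 0-1", "docs 0-1", "docs 2-3", "docs 2-3"}
	var wg sync.WaitGroup
	for g := 0; g < 4; g++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			r := seg.newStoredReader() // one reader per goroutine
			for doc := 0; doc < 4; doc++ {
				if string(r.chunkFor(doc)) != want[doc] {
					panic("unexpected chunk contents")
				}
			}
		}()
	}
	wg.Wait()
	fmt.Println("no shared mutable state in the segment")
}
```

The trade-off is that concurrent readers hitting the same chunk each decompress it, in exchange for needing no lock on the shared segment.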

mschoch commented 2 years ago

I prototyped one idea here: https://github.com/blugelabs/ice/pull/15 But I don't love it.

hengfeiyang commented 2 years ago

> I prototyped one idea here: blugelabs/ice#15 But I don't love it.

I changed the Mutex to an RWMutex. The overhead should be slight, but it could cause problems when a great many segments are loaded, so we need a mechanism to release unused cache.

https://github.com/blugelabs/ice/pull/16
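
To make the RWMutex idea and the release concern concrete, here is a minimal sketch of a per-segment chunk cache. It is not taken from either pull request: `chunkCache`, `get`, and `release` are hypothetical names, and the `load` callback stands in for decompressing a chunk. Cached slices are never modified after they are stored, so readers can share them without copying.

```go
package main

import (
	"fmt"
	"sync"
)

// chunkCache is a hypothetical cache of decompressed chunks, keyed by
// chunk index and guarded by an RWMutex.
type chunkCache struct {
	mu     sync.RWMutex
	chunks map[int][]byte
}

// get returns the cached chunk i, calling load to fill the cache on a miss.
// Hits take only the read lock, so concurrent readers do not block each other.
func (c *chunkCache) get(i int, load func() []byte) []byte {
	c.mu.RLock()
	b, ok := c.chunks[i] // reading a nil map is allowed and reports a miss
	c.mu.RUnlock()
	if ok {
		return b
	}

	c.mu.Lock()
	defer c.mu.Unlock()
	// Re-check: another goroutine may have filled the entry in between.
	if b, ok := c.chunks[i]; ok {
		return b
	}
	b = load()
	if c.chunks == nil {
		c.chunks = make(map[int][]byte)
	}
	c.chunks[i] = b
	return b
}

// release drops every cached chunk so the memory can be reclaimed, for
// example when a segment has not been queried for a while.
func (c *chunkCache) release() {
	c.mu.Lock()
	c.chunks = nil
	c.mu.Unlock()
}

func main() {
	var c chunkCache
	loads := 0 // only modified inside load, which runs under the write lock
	var wg sync.WaitGroup
	for g := 0; g < 8; g++ {
		wg.Add(1)
		go func(g int) {
			defer wg.Done()
			i := g % 2
			c.get(i, func() []byte {
				loads++
				return []byte(fmt.Sprintf("chunk-%d", i))
			})
		}(g)
	}
	wg.Wait()
	fmt.Println("loads before release:", loads)
	c.release()
}
```

Without a policy for calling something like `release`, every chunk ever touched stays decompressed in memory, which is the concern with many loaded segments.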