quickwit-oss / tantivy

Tantivy is a full-text search engine library inspired by Apache Lucene and written in Rust
MIT License

Investigate empty segment merge #1189

Closed by PSeitz 2 years ago

PSeitz commented 2 years ago

An issue occurred where a merge triggered after a commit selected only empty segments.

thread 'merge_thread1' panicked at 'Unexpected error, empty readers in IndexMerger', src/indexer/merger.rs:330:16
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
thread 'merge_thread1' panicked at 'You forgot to flush "00004481000000000000000000000000.term" before its writter got Drop. Do not rely on drop. This also occurs when the indexer crashed, so you may want to check the logs for the root cause.', src/directory/ram_directory.rs:49:13
stack backtrace:

Investigate whether checking for non-empty segments before triggering a merge is sufficient. There are also several possible scenarios in which we can end up with an empty segment.
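
A minimal sketch of what such a pre-merge check could look like. This is not tantivy's actual code: `SegmentInfo`, `MergePlan`, and `plan_merge` are hypothetical names made up for illustration, and the real merge path may need to handle more cases. The idea is only that candidates with no live documents are filtered out, and the merger is never started when nothing is left.

```rust
/// Hypothetical stand-in for a segment's metadata (not tantivy's real type).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SegmentInfo {
    pub id: u32,
    /// Number of live (non-deleted) documents in the segment.
    pub num_alive_docs: u32,
}

/// Hypothetical outcome of inspecting merge candidates before merging.
#[derive(Debug, PartialEq, Eq)]
pub enum MergePlan {
    /// Every candidate is empty (or there were no candidates):
    /// skip the merge instead of starting a merger with no readers.
    AllEmpty,
    /// Merge only these non-empty segments.
    Merge(Vec<SegmentInfo>),
}

/// Drops empty segments from the candidate list and decides whether
/// a merge should be started at all.
pub fn plan_merge(candidates: &[SegmentInfo]) -> MergePlan {
    let non_empty: Vec<SegmentInfo> = candidates
        .iter()
        .filter(|segment| segment.num_alive_docs > 0)
        .cloned()
        .collect();

    if non_empty.is_empty() {
        MergePlan::AllEmpty
    } else {
        MergePlan::Merge(non_empty)
    }
}
```

Whether this check alone is sufficient is exactly the open question here: it only guards the merge entry point and says nothing about how the empty segments were produced in the first place.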

To Reproduce

Change test_functional_indexing_sorted to run with 15 threads:

let mut index_writer = index.writer_with_num_threads(15, 150_000_000)?;

The test may need to be run in a loop or several times before the failure shows up.

fulmicoton commented 2 years ago

@PSeitz I don't remember this ... Is this something you can reproduce relatively easily?

PSeitz commented 2 years ago

Yes, it's easy to reproduce, but I don't get the stack trace anymore. cargo nextest seems to behave more consistently there.

NUM_FUNCTIONAL_TEST_ITERATIONS=2000000 cargo test indexing_sorted  -- --ignored

running 1 test
error: test failed, to rerun pass '--lib'

Caused by:
  process didn't exit successfully: `/home/pascal/LinuxData/Development/tantivy/gcd_encoding/target/debug/deps/tantivy-016729acb4e831a2 indexing_sorted --ignored` (signal: 6, SIGABRT: process abort signal)
NUM_FUNCTIONAL_TEST_ITERATIONS=2000000 cargo nextest run indexing_sorted --run-ignored all
   Compiling tantivy v0.18.0 (/home/pascal/LinuxData/Development/tantivy/gcd_encoding)
    Finished test [unoptimized + debuginfo] target(s) in 14.07s
  Executable unittests src/lib.rs (target/debug/deps/tantivy-ccc74d3207250cdf)
  Executable tests/failpoints/mod.rs (target/debug/deps/failpoints-4ab25e264782755f)
  Executable tests/mod.rs (target/debug/deps/mod-49b9839018f790e3)
    Starting 1 tests across 3 binaries (672 skipped)
        SLOW [> 60.000s]             tantivy functional_test::test_functional_indexing_sorted
        FAIL [  98.674s]             tantivy functional_test::test_functional_indexing_sorted

--- STDOUT:                          tantivy functional_test::test_functional_indexing_sorted ---

running 1 test
test functional_test::test_functional_indexing_sorted has been running for over 60 seconds

--- STDERR:                          tantivy functional_test::test_functional_indexing_sorted ---
thread 'merge_thread_1' panicked at 'Unexpected error, empty readers in IndexMerger', src/indexer/merger.rs:353:14
stack backtrace: