quickwit-oss / tantivy

Tantivy is a full-text search engine library inspired by Apache Lucene and written in Rust
MIT License

Concurrent commit failed in multi-process environment #2418

Closed: cyccbxhl closed this issue 1 month ago

cyccbxhl commented 1 month ago

I implemented the code according to #2408 and ran into unexpected behavior. Two parallel PostgreSQL backend processes each add a few documents. The first process to commit succeeds, but when the second process then tries to commit, there are three possible outcomes:

  1. Most likely, it will fail with the error message "TantivyError(OpenReadError(FileDoesNotExist('.../tantivy/c09b2bdff65a4c4aa64394d180932812.fieldnorm')))".
  2. Occasionally, it will succeed and data will be written successfully.
  3. It won't throw an error, but some of the data will be invisible (indicating write failure).

The only change I made to tantivy::IndexWriter was to remove the INDEX_WRITER_LOCK logic that runs when the object is initialized; everything else is unchanged. My understanding is that a fieldnorm file belongs to the segment created by the current transaction. Why would a commit in another concurrent process affect the fieldnorm file of this transaction?

fulmicoton commented 1 month ago

This is not a bug, so I will close this. Kudos for doing complicated stuff! But that also means you are on your own to solve these problems.
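
For readers wondering how three different outcomes can come from the same setup, here is a minimal toy model in Rust. It is not tantivy's code and the thread does not confirm the mechanism. It only assumes that each writer caches the committed segment list when it is opened, rewrites the shared metadata from that cached view on commit, and then deletes files the new metadata does not reference. `SharedDir`, `ToyWriter`, and the segment file names are hypothetical and exist only for this illustration.

```rust
use std::collections::BTreeSet;

/// Hypothetical stand-in for an index directory shared by several processes.
#[derive(Debug, Default)]
struct SharedDir {
    files: BTreeSet<String>, // segment files currently on disk
    meta: BTreeSet<String>,  // segments listed in the committed metadata
}

/// Hypothetical writer that believes it is the only writer (no lock taken).
struct ToyWriter {
    known: BTreeSet<String>, // committed segments as seen when the writer opened
    pending: Vec<String>,    // segments written but not yet committed
}

impl ToyWriter {
    fn open(dir: &SharedDir) -> Self {
        ToyWriter { known: dir.meta.clone(), pending: Vec::new() }
    }

    fn add_segment(&mut self, dir: &mut SharedDir, name: &str) {
        dir.files.insert(name.to_string());
        self.pending.push(name.to_string());
    }

    fn commit(&mut self, dir: &mut SharedDir) -> Result<(), String> {
        // The commit needs its own pending files to still be on disk.
        for seg in &self.pending {
            if !dir.files.contains(seg) {
                return Err(format!("FileDoesNotExist({})", seg));
            }
        }
        // Assumption: metadata is rewritten from this writer's private view.
        self.known.extend(self.pending.drain(..));
        let live = self.known.clone();
        dir.meta = live.clone();
        // Assumption: garbage collection drops every file not in the new metadata.
        dir.files.retain(|f| live.contains(f));
        Ok(())
    }
}

/// Outcome 1: both writers have pending files; the first commit removes the other's file.
fn interleaved() -> Result<(), String> {
    let mut dir = SharedDir::default();
    let mut a = ToyWriter::open(&dir);
    let mut b = ToyWriter::open(&dir);
    a.add_segment(&mut dir, "seg_a.fieldnorm");
    b.add_segment(&mut dir, "seg_b.fieldnorm");
    a.commit(&mut dir)?;
    b.commit(&mut dir)
}

/// Outcome 3: no error, but the second commit overwrites the first one's metadata.
fn lost_update() -> BTreeSet<String> {
    let mut dir = SharedDir::default();
    let mut a = ToyWriter::open(&dir);
    let mut b = ToyWriter::open(&dir);
    a.add_segment(&mut dir, "seg_a.fieldnorm");
    a.commit(&mut dir).unwrap();
    b.add_segment(&mut dir, "seg_b.fieldnorm");
    b.commit(&mut dir).unwrap();
    dir.meta
}

/// Outcome 2: the second writer opens only after the first commit, so nothing is lost.
fn serialized() -> BTreeSet<String> {
    let mut dir = SharedDir::default();
    let mut a = ToyWriter::open(&dir);
    a.add_segment(&mut dir, "seg_a.fieldnorm");
    a.commit(&mut dir).unwrap();
    let mut b = ToyWriter::open(&dir);
    b.add_segment(&mut dir, "seg_b.fieldnorm");
    b.commit(&mut dir).unwrap();
    dir.meta
}
```

In this model, which outcome occurs depends only on how the two writers' steps interleave, which is the kind of race a single-writer lock is there to rule out. Which files the real garbage collector considers, and when, may differ from this sketch.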