Closed ManyTheFish closed 2 years ago
When a PHRASE search contains the same word several times, Meilisearch returns no results.
1) Push some documents that contain the same word several times in a row:
$ curl \
  -X POST 'http://localhost:7700/indexes/movies/documents' \
  -H 'Content-Type: application/json' \
  --data-binary '[{"id": 1, "title": "knock knock"}]'
2) Run a PHRASE search query that contains the duplicated word:
$ curl \
  -X POST 'http://localhost:7700/indexes/movies/search' \
  -H 'Content-Type: application/json' \
  --data-binary '{ "q": "\"knock knock\"" }'
3) Meilisearch should return the document, but it returns no results.
This bug comes from the indexing code that computes the word_pair_proximity_docids database in src/update/index_documents/extract/extract_word_pair_proximity_docids.rs. In document_word_positions_into_sorter, we forgot to extract the proximity between the current position of the current word and the next position of that same word.
When advancing the current word to its next position, we could extract the proximity between the current position and the next one.
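The effect of the missing same-word pair can be illustrated with a minimal sketch. It is not the code of document_word_positions_into_sorter: it assumes a simplified in-memory stand-in for word_pair_proximity_docids, and it assumes that a phrase search looks up each adjacent word pair at proximity 1. The names index_document, phrase_search, include_same_word, and MAX_DISTANCE are hypothetical and exist only for this illustration.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Key of the simplified proximity database: (left word, right word, proximity).
type PairKey = (String, String, u32);

/// Largest distance (in word positions) recorded by this sketch.
/// Illustrative value, not taken from the real code base.
const MAX_DISTANCE: usize = 8;

/// Record, for one document, every ordered pair of word occurrences that
/// lie within MAX_DISTANCE of each other.
/// When `include_same_word` is false, pairs made of the same word at two
/// different positions are skipped, which mimics the behaviour described
/// in this issue.
fn index_document(
    db: &mut BTreeMap<PairKey, BTreeSet<u32>>,
    doc_id: u32,
    text: &str,
    include_same_word: bool,
) {
    let words: Vec<&str> = text.split_whitespace().collect();
    for (i, left) in words.iter().enumerate() {
        for (j, right) in words.iter().enumerate().skip(i + 1) {
            let distance = j - i;
            if distance > MAX_DISTANCE {
                break;
            }
            if left == right && !include_same_word {
                continue;
            }
            db.entry((left.to_string(), right.to_string(), distance as u32))
                .or_default()
                .insert(doc_id);
        }
    }
}

/// Resolve a phrase by intersecting the document ids of every adjacent
/// word pair at proximity 1 (assumed lookup strategy, simplified).
/// A phrase with fewer than two words yields an empty set in this sketch.
fn phrase_search(db: &BTreeMap<PairKey, BTreeSet<u32>>, phrase: &str) -> BTreeSet<u32> {
    let words: Vec<&str> = phrase.split_whitespace().collect();
    let mut result: Option<BTreeSet<u32>> = None;
    for pair in words.windows(2) {
        let key = (pair[0].to_string(), pair[1].to_string(), 1);
        let docs = db.get(&key).cloned().unwrap_or_default();
        result = Some(match result {
            None => docs,
            Some(acc) => acc.intersection(&docs).copied().collect(),
        });
    }
    result.unwrap_or_default()
}
```

Indexing the document "knock knock" with include_same_word set to false leaves the database without a ("knock", "knock", 1) entry, so phrase_search finds nothing. With the flag set to true, the entry exists and the document is returned.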
Files expected to be modified