meilisearch / milli

Search engine library for Meilisearch ⚡️
MIT License
Fix bug in handling of soft deleted documents when updating settings #723

Closed loiclec closed 1 year ago

loiclec commented 1 year ago

Pull Request

Related issue

Fixes (partially, until merged into meilisearch) https://github.com/meilisearch/meilisearch/issues/3021

What does this PR do?

This PR fixes a bug where the internal error "missing key in documents database" could appear when indexing documents.

When updating the settings, we now remove all references to soft-deleted document ids from the FSTs of the ExternalDocumentsIds structure. This happens before the database is cleared and before the transform output is created.

Previously, updating the settings cleared the soft-deleted document ids but kept the original ExternalDocumentsIds structure. As a consequence, when processing a later document addition, we could wrongly believe that a document was being replaced when, in fact, it was a completely new document. See the tests bug_3021_first, bug_3021_second, and bug_3021 for minimal test cases that would have reproduced the issue.
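To make the failure mode concrete, here is a minimal sketch of it. It is not milli's code: `ExternalIds`, `update_settings`, and `scenario` are hypothetical names, a `HashMap` stands in for the FSTs, and a `HashSet<u32>` stands in for the set of soft-deleted internal ids. It only assumes what is described above, namely that a settings update clears the soft-deleted set, and that a stale external-id entry makes a later addition look like a replacement.

```rust
use std::collections::{HashMap, HashSet};

/// Simplified stand-in for ExternalDocumentsIds (assumption: the real
/// structure stores this mapping in FSTs, not in a HashMap).
struct ExternalIds {
    map: HashMap<String, u32>,
}

impl ExternalIds {
    /// Drop every entry that points at a soft-deleted internal id.
    fn prune_soft_deleted(&mut self, soft_deleted: &HashSet<u32>) {
        self.map.retain(|_, internal| !soft_deleted.contains(internal));
    }

    /// In this model, an addition counts as a replacement when its
    /// external id is already present in the mapping.
    fn is_replacement(&self, external_id: &str) -> bool {
        self.map.contains_key(external_id)
    }
}

/// Hypothetical model of a settings update. With `prune_first == false`
/// it mimics the old behaviour: the soft-deleted set is cleared while the
/// mapping keeps its stale entries.
fn update_settings(ids: &mut ExternalIds, soft_deleted: &mut HashSet<u32>, prune_first: bool) {
    if prune_first {
        ids.prune_soft_deleted(soft_deleted);
    }
    soft_deleted.clear();
}

/// Returns whether re-adding "doc-b" after a settings update is treated
/// as a replacement. "doc-b" (internal id 1) was soft deleted beforehand.
fn scenario(prune_first: bool) -> bool {
    let mut ids = ExternalIds {
        map: HashMap::from([("doc-a".to_string(), 0), ("doc-b".to_string(), 1)]),
    };
    let mut soft_deleted = HashSet::from([1u32]);
    update_settings(&mut ids, &mut soft_deleted, prune_first);
    ids.is_replacement("doc-b")
}
```

In this model, `scenario(false)` reports a replacement even though "doc-b" no longer exists, which is the misclassification described above. `scenario(true)` prunes the mapping before clearing the soft-deleted set, so "doc-b" is correctly treated as a new document.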

We need to take special care to:

Kerollmops commented 1 year ago

Let's merge this! bors merge

bors[bot] commented 1 year ago

Build succeeded: