stanford-futuredata / ColBERT

ColBERT: state-of-the-art neural search (SIGIR'20, TACL'21, NeurIPS'21, NAACL'22, CIKM'22, ACL'23, EMNLP'23)
MIT License
2.95k stars 377 forks

[incremental indexing] - IndexUpdater.update_searcher is successful but search result does not show newly indexed passage #256

Closed jessiejuachon closed 1 year ago

jessiejuachon commented 1 year ago

Test environment:

Steps to reproduce:

  1. Initialize the Searcher, passing the index path that points to the 12 GB index: searcher = Searcher(checkpoint=self._checkpoint, index=self._index_path, collection=Collection(path="empty.tsv"), config=config)
  2. Initialize the IndexUpdater: index_updater = IndexUpdater(config=config, searcher=searcher, checkpoint=self._checkpoint)
  3. Call IndexUpdater.update_searcher (or IndexUpdater.add), passing a new passage.
  4. Call Searcher.search querying for the newly added passage.
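
The steps above can be condensed into a small check. This is a minimal sketch, not code from the issue: `added_passage_is_retrieved` is a hypothetical helper, and it assumes that `IndexUpdater.add` returns the list of newly assigned passage ids and that `Searcher.search` returns a `(pids, ranks, scores)` tuple. The `searcher` and `index_updater` arguments are expected to be built as in steps 1 and 2.

```python
def added_passage_is_retrieved(searcher, index_updater, passage, query, k=10):
    """Add one passage through the updater, then check that a search finds it.

    Hypothetical helper. Assumes `index_updater.add` returns the new passage
    ids and `searcher.search` returns a (pids, ranks, scores) tuple.
    """
    # Step 3: add the new passage and remember the ids it was given.
    new_pids = index_updater.add([passage])

    # Step 4: query for the passage that was just added.
    pids, _ranks, _scores = searcher.search(query, k=k)

    # Was the new passage returned at all, and was it the top hit?
    found = any(pid in new_pids for pid in pids)
    ranked_first = len(pids) > 0 and pids[0] in new_pids
    return found, ranked_first
```

In the behaviour reported here, such a check would return `(False, False)`; the expected outcome is `(True, True)`.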

Result: The newly added passage does not appear in the search results.

Expected result: The newly added passage should appear in the search results with the highest score.

Notes:

jessiejuachon commented 1 year ago

The issue was with the test environment: the embeddings were being created using an index different from the one that the Searcher had loaded.
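
One way to catch this kind of mismatch early is to compare the two index locations before adding passages. The snippet below is only a sketch under assumptions: `same_index` is a hypothetical helper, and the caller is assumed to know the path the Searcher loaded and the path used when creating embeddings. It does not rely on any ColBERT API.

```python
import os


def same_index(searcher_index_path, updater_index_path):
    """Return True when both paths resolve to the same index directory.

    Hypothetical guard: realpath normalizes relative segments and symlinks,
    so two differently spelled paths to one directory compare equal.
    """
    return os.path.realpath(searcher_index_path) == os.path.realpath(updater_index_path)
```

A test setup could assert `same_index(...)` right after constructing the Searcher and the IndexUpdater, so that a misconfigured environment fails immediately instead of silently returning results without the new passage.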