stanford-futuredata / ColBERT

ColBERT: state-of-the-art neural search (SIGIR'20, TACL'21, NeurIPS'21, NAACL'22, CIKM'22, ACL'23, EMNLP'23)
MIT License

How to insert new document into the pre-built index? #335

Open pursuemoon opened 2 months ago

pursuemoon commented 2 months ago

If I have already built an index and a new document needs to be indexed, is there a way to insert that document into the existing index, producing an updated index?

jlscheerer commented 1 month ago

I believe you can use the IndexUpdater in that case @pursuemoon.

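from colbert import IndexUpdater

# config: the ColBERTConfig for the index; searcher: a Searcher loaded on that index;
# checkpoint: the model checkpoint, needed to encode the new passages.
index_updater = IndexUpdater(config, searcher, checkpoint)
index_updater.add(new_document_collection)  # a list of passage strings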

If you want to persist the changes, call index_updater.persist_to_disk().
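For anyone landing here later, the steps above can be put together roughly as follows. This is only a sketch under some assumptions: the helper name `add_passages_to_index` and its parameters are made up for illustration, it assumes `Searcher` can locate the index by name with the given config (you may need a `Run` context or an explicit `index_root` in your setup), and it assumes `add` returns the IDs assigned to the new passages, which is worth checking against your installed ColBERT version.

```python
def add_passages_to_index(config, index_name, checkpoint, new_passages, persist=True):
    """Append passages to an existing ColBERT index (hypothetical helper).

    config       -- the ColBERTConfig used for the index
    index_name   -- name of the already built index
    checkpoint   -- path or name of the ColBERT checkpoint used to encode passages
    new_passages -- iterable of passage strings to add
    persist      -- if True, write the updated index back to disk
    """
    # Imported lazily so the helper can be defined without ColBERT installed.
    from colbert import IndexUpdater, Searcher

    # Load the existing index; the updater modifies this searcher in memory.
    searcher = Searcher(index=index_name, checkpoint=checkpoint, config=config)
    updater = IndexUpdater(config, searcher, checkpoint)

    # Assumption: add() returns the passage IDs assigned to the new passages.
    new_pids = updater.add(list(new_passages))

    # Without this step the additions only live in the in-memory searcher.
    if persist:
        updater.persist_to_disk()
    return new_pids
```

Calling it would look something like `add_passages_to_index(config, "my_index", "path/to/checkpoint", ["a new passage"])`, where the index name and checkpoint path are placeholders. Passing `persist=False` lets you try the updated searcher in memory before deciding whether to write anything to disk.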