stanford-futuredata / ColBERT

ColBERT: state-of-the-art neural search (SIGIR'20, TACL'21, NeurIPS'21, NAACL'22, CIKM'22, ACL'23, EMNLP'23)
MIT License

[IndexUpdater.persist_to_disk] Incorrect values of variables for append #246

Closed jessiejuachon closed 1 year ago

jessiejuachon commented 1 year ago

The following variables must be reset after every call to persist_to_disk.

Variables that track removal / append of passages:

    self.removed_pids
    self.first_new_emb
    self.first_new_pid

Otherwise, the pid and embedding placeholders grow exponentially.
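
To illustrate why stale tracking state is a problem, here is a minimal sketch. It does not use ColBERT's actual `IndexUpdater`; `ToyIndexUpdater`, its `add` method, the `reset_after_persist` flag, and the `persisted_pids` counter are all hypothetical names made up for this example. Only the three tracked attribute names come from the issue. The toy assumes that persisting appends everything from `first_new_pid` onward, so if that marker is not cleared, passages that were already persisted get appended again on the next call.

```python
class ToyIndexUpdater:
    """Hypothetical stand-in for illustration; NOT ColBERT's IndexUpdater."""

    def __init__(self, num_pids, num_embs, reset_after_persist=True):
        self.num_pids = num_pids            # passages held in memory
        self.num_embs = num_embs            # embeddings held in memory
        self.persisted_pids = num_pids      # passages assumed to be on disk
        self.reset_after_persist = reset_after_persist
        self._reset_tracking()

    def _reset_tracking(self):
        # The three variables named in the issue.
        self.removed_pids = []
        self.first_new_emb = None
        self.first_new_pid = None

    def add(self, n_passages, embs_per_passage=1):
        # Remember where the not-yet-persisted region starts (first add only).
        if self.first_new_pid is None:
            self.first_new_pid = self.num_pids
            self.first_new_emb = self.num_embs
        self.num_pids += n_passages
        self.num_embs += n_passages * embs_per_passage

    def persist_to_disk(self):
        # Assumption: everything from first_new_pid onward is treated as new.
        if self.first_new_pid is not None:
            self.persisted_pids += self.num_pids - self.first_new_pid
        if self.reset_after_persist:
            self._reset_tracking()


def run(reset_after_persist):
    """Add two passages and persist, twice; return the on-disk passage count."""
    updater = ToyIndexUpdater(num_pids=10, num_embs=10,
                              reset_after_persist=reset_after_persist)
    for _ in range(2):
        updater.add(2)
        updater.persist_to_disk()
    return updater.persisted_pids


print(run(reset_after_persist=True))   # 14: each new passage written once
print(run(reset_after_persist=False))  # 16: first batch appended a second time
```

In this toy, the over-count gets larger with every additional add/persist cycle when the tracking variables are left untouched, which matches the symptom described above.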