[IndexUpdater] Minimizing calls to torch.Tensor.tolist() and torch.Tensor() for performance optimization

stanford-futuredata / ColBERT

ColBERT: state-of-the-art neural search (SIGIR'20, TACL'21, NeurIPS'21, NAACL'22, CIKM'22, ACL'23, EMNLP'23)

MIT License


[IndexUpdater] Minimizing calls to torch.Tensor.tolist() and torch.Tensor() for performance optimization #253

Closed: jessiejuachon closed this 1 year ago

jessiejuachon commented 1 year ago

For a huge index, `self.curr_ivf.tolist()` takes a long time because every call converts the entire tensor into a Python list. Calling it once per `IndexUpdater.update_searcher` call, instead of once per pid, improves performance significantly.
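To illustrate the idea, here is a minimal sketch of hoisting the conversion out of the per-pid loop. It is not the actual `IndexUpdater` code: `CountingTensor`, `positions_per_pid_slow`, and `positions_per_pid_fast` are hypothetical names, and the stand-in class only mimics the fact that `tolist()` copies every element on each call, so the number of conversions can be counted without needing torch installed.

```python
from typing import Dict, List


class CountingTensor:
    """Hypothetical stand-in for a 1-D torch.Tensor; counts tolist() calls."""

    def __init__(self, values: List[int]) -> None:
        self._values = list(values)
        self.tolist_calls = 0

    def tolist(self) -> List[int]:
        # Like torch.Tensor.tolist(), this copies every element: O(n) per call.
        self.tolist_calls += 1
        return list(self._values)


def positions_per_pid_slow(ivf: CountingTensor, pids: List[int]) -> Dict[int, List[int]]:
    # Converts the whole tensor once per pid: O(len(pids) * n) copying.
    return {pid: [i for i, v in enumerate(ivf.tolist()) if v == pid] for pid in pids}


def positions_per_pid_fast(ivf: CountingTensor, pids: List[int]) -> Dict[int, List[int]]:
    # Converts the tensor once, then reuses the Python list for every pid.
    values = ivf.tolist()
    wanted = set(pids)
    result: Dict[int, List[int]] = {pid: [] for pid in pids}
    for i, v in enumerate(values):
        if v in wanted:
            result[v].append(i)
    return result


# Assumed toy data: which pid each IVF entry refers to.
data = [3, 1, 3, 2, 1]
pids = [1, 3]

ivf_slow = CountingTensor(data)
ivf_fast = CountingTensor(data)

slow = positions_per_pid_slow(ivf_slow, pids)
fast = positions_per_pid_fast(ivf_fast, pids)

print(slow == fast)            # both variants find the same positions
print(ivf_slow.tolist_calls)   # one conversion per pid
print(ivf_fast.tolist_calls)   # a single conversion in total
```

With a real `torch.Tensor` the same pattern should apply: convert once at the top of the update call and pass the resulting list to the per-pid logic.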

jessiejuachon commented 1 year ago

Hi @santhnm2! Following up on this one. Thanks!