Closed svakulenk0 closed 3 years ago
Hi @svakulenk0,
Sadly, at the moment the only solution is to pickle-save the document embeddings.
Recently, I have been working on integrating faiss indexes, which would allow caching or saving corpus embeddings as a faiss index. I can't give a fixed timeline for when this will be fully integrated into the repo, but I will let you know once it is done.
Kind Regards, Nandan
Hi Nandan, thank you for the reply! I love the library :)
Hi @svakulenk0,
Update: In the latest version of the BEIR package, you can now save/load the corpus embeddings as a faiss index. Check out: https://github.com/UKPLab/beir/blob/main/examples/retrieval/evaluation/dense/evaluate_faiss_dense.py
Kind Regards, Nandan
nice!!! thank you
Is there a way to cache/load embedded documents and queries? That would help save time on embedding big datasets such as MS MARCO and NQ.
Thanks, Svitlana, for this question, and Nandan for providing this feature; it is very helpful!
Hi @svakulenk0, have you tried this?
Hi @thakur-nandan, thank you for this. Can you please give me an example of how to store and load the embeddings?