Easily use and train state-of-the-art late-interaction retrieval methods (ColBERT) in any RAG pipeline. Designed for modularity and ease of use, backed by research.
Apache License 2.0
How to index a collection using a generator function? #220
I have a large collection of 16 million passages that I want to index. It's not practical to keep all the documents and IDs in memory as lists to pass them to the index function. Is there a way to index large collections using generators?
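To illustrate the kind of streaming input meant here, below is a minimal sketch in plain Python. It assumes the passages live in a tab-separated file with one `id<TAB>text` pair per line; that layout and the helper names `iter_passages` and `batched` are hypothetical and not part of the library. It only shows the generator side, and whether the index function can consume such batches incrementally is exactly the open question.

```python
import itertools
from typing import Iterable, Iterator, List, Tuple


def iter_passages(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Lazily yield (doc_id, text) pairs from tab-separated lines.

    Assumed (hypothetical) layout: one passage per line, "id<TAB>text".
    """
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        doc_id, text = line.split("\t", 1)  # split on the first tab only
        yield doc_id, text


def batched(
    pairs: Iterable[Tuple[str, str]], batch_size: int
) -> Iterator[Tuple[List[str], List[str]]]:
    """Group (doc_id, text) pairs into (ids, docs) batches of bounded size."""
    it = iter(pairs)
    while True:
        chunk = list(itertools.islice(it, batch_size))
        if not chunk:
            return
        ids, docs = zip(*chunk)
        yield list(ids), list(docs)
```

With a file handle, `batched(iter_passages(f), 10_000)` would hold at most 10,000 passages in memory at a time, since a file object is itself a lazy iterator over lines.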