Closed by lboesen 1 year ago
Not the author, but we can report that with the MSMARCO checkpoint and a collection of ~20M passages (128-dimensional embeddings, 2-bit quantization), we get an index size of 77GB.
Thanks for your comment! I'm guessing, then, that the MSMARCO collection of ~8.8M passages with similar params will come to around 40GB.
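For reference, the guess above can be checked with a simple proportional estimate. This is a minimal sketch that assumes index size grows linearly with the number of passages and that both collections have similar passage lengths. The function name `extrapolate_index_size_gb` is hypothetical and not part of ColBERT.

```python
def extrapolate_index_size_gb(known_size_gb, known_passages, target_passages):
    """Scale a measured index size linearly with the passage count.

    Assumption: average passage length and index parameters (dim, nbits)
    are the same for both collections.
    """
    return known_size_gb * target_passages / known_passages


# Figures reported in this thread: 77GB for ~20M passages.
estimate = extrapolate_index_size_gb(77.0, 20_000_000, 8_800_000)
print(f"{estimate:.1f} GB")  # prints "33.9 GB"
```

Under these assumptions the estimate comes out a little below the ~40GB guess. Differences in average passage length between the two collections could move it either way.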
@lboesen Which branch are you using?
The v1 index sizes are in https://arxiv.org/abs/2004.12832 (MS MARCO ~140GB iirc)
The v2 (main) index sizes are in https://arxiv.org/abs/2205.09707 (MS MARCO with nbits=2 ~22GB)
Hope this helps. Closing.
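To see how `nbits` feeds into the index size, here is a rough back-of-the-envelope sketch. It assumes each token embedding is stored as a 4-byte centroid ID plus `nbits` bits per dimension for the residual, and it ignores centroid storage and other metadata. The average number of tokens per passage is left as an input because it is not stated in this thread. Both function names are hypothetical and not part of the ColBERT codebase.

```python
def bytes_per_embedding(dim=128, nbits=2, centroid_id_bytes=4):
    """Approximate storage for one compressed token embedding.

    Assumption: a centroid ID plus a residual quantized to `nbits` per dimension.
    """
    residual_bytes = dim * nbits // 8
    return residual_bytes + centroid_id_bytes


def estimate_index_gb(num_passages, avg_tokens_per_passage, dim=128, nbits=2):
    """Approximate index size in GB (10^9 bytes), embeddings only."""
    total_bytes = num_passages * avg_tokens_per_passage * bytes_per_embedding(dim, nbits)
    return total_bytes / 1e9


print(bytes_per_embedding(nbits=2))  # 36 bytes per embedding
print(bytes_per_embedding(nbits=1))  # 20 bytes per embedding
```

Calling `estimate_index_gb` with your own collection's passage count and measured average token count gives an order-of-magnitude figure to compare against the sizes reported in the papers linked above.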
Hi,
First of all, great work!
I wanted to ask whether you have an idea of the size of the resulting index when using the MSMARCO (v1) passage ranking dataset?