thibault-formal closed this issue 5 months ago
Hey Thibault! Hope you’re well.
I can check, but basically we return k documents while computing exact scores for more than k candidates.
Hey Omar! I hope you are well too! I see -- so basically
Just checking that I understood things correctly, as I have been working on related stuff :) Thanks
This sounds right, but @santhnm2 might be able to confirm too
Yes, this is correct. This function is where we choose the hyperparameters according to k: https://github.com/stanford-futuredata/ColBERT/blob/main/colbert/searcher.py#L88
And here is where the number of exact scores is computed: https://github.com/stanford-futuredata/ColBERT/blob/fc3ce55d2a6c993367b176d183e0a81c6d6de8d4/colbert/search/index_storage.py#L152
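To make the distinction between "documents scored exactly" and "documents returned" concrete, here is a minimal sketch. It is not the ColBERT/PLAID implementation: `staged_top_k`, `approx_score` and `exact_score` are hypothetical names, the single approximate-scoring pass is a simplification, and the `ndocs // 4` shortlist size is an assumption taken from the question in this thread.

```python
import heapq
from typing import Callable, List, Tuple


def staged_top_k(
    candidates: List[int],
    approx_score: Callable[[int], float],
    exact_score: Callable[[int], float],
    k: int,
    ndocs: int,
) -> Tuple[List[Tuple[int, float]], int]:
    """Return the top-k (doc_id, exact_score) pairs and the number of exact scores computed."""
    # Cheap pass: keep the ndocs best candidates by approximate score
    # (nlargest returns them sorted from best to worst).
    kept = heapq.nlargest(ndocs, candidates, key=approx_score)

    # Assumed shortlist size: a quarter of ndocs, usually larger than k.
    shortlist = kept[: ndocs // 4]

    # Expensive pass: exact scores for every shortlisted document.
    scored = [(doc_id, exact_score(doc_id)) for doc_id in shortlist]
    scored.sort(key=lambda pair: pair[1], reverse=True)

    # Only k results are returned, even though len(shortlist) were scored exactly.
    return scored[:k], len(shortlist)


# Toy example with made-up scoring functions.
top, n_exact = staged_top_k(
    candidates=list(range(100)),
    approx_score=lambda d: d,             # pretend higher ids look better
    exact_score=lambda d: -abs(d - 90),   # pretend doc 90 is the true best
    k=3,
    ndocs=40,
)
print(top, n_exact)
```

In this toy run, 10 documents receive exact scores but only 3 are returned, which mirrors the "return k, score more than k" behaviour described above.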
Perfect, thank you both for the quick answer!
Hi again,
I have another (unrelated) question regarding PLAID: did you evaluate the performance on the BEIR benchmark? Could there be a performance drop (OOD) due to the approximation?
EDIT: I saw the LoTTE results (apparently no drop), but I wonder whether that also holds on BEIR.
Thanks
Hi @okhat and all!
In your PLAID paper (very nice btw!), I cannot understand the difference between ndocs/4 (which seems to be the number of documents after Stage 3) and k (which seems to be the number of documents to exactly re-rank in Stage 4). In the end, for how many documents are true relevance scores computed? Thx in advance! Thibault