naver / splade

SPLADE: sparse neural search (SIGIR21, SIGIR22)

Evaluation on MSMARCO? #9

Closed thongnt99 closed 2 years ago

thongnt99 commented 2 years ago

Hi, thanks for your very interesting work.

Could you share how you evaluated to get the results here? Did you use inverted indexing, or this code? I am trying the latter approach, but it is very slow on MS MARCO. Thank you

thibault-formal commented 2 years ago

Hi @thongnt99, thanks for your interest!

We got the results using a custom inverted-index implementation that relies on numba for intra-query parallelization (credit to @cadurosar). That code is not released yet, but we plan to release it in the coming weeks; I am afraid it would indeed be slow for evaluating on MS MARCO. Alternatively, you can evaluate with pyserini by following this: https://github.com/castorini/anserini/blob/master/docs/experiments-msmarco-passage-splade-v2.md (you only need to create the query/document files with SPLADE weights; we also plan to release this conversion step soon, but it's not too difficult, just let us know if you need help). Best, Thibault
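The conversion step mentioned above can be sketched roughly like this: Anserini's `JsonVectorCollection` accepts one JSON record per line with an `id` and a `vector` of integer term impacts. The function name, the scale factor of 100, and the sample weights below are assumptions for illustration, not the repository's released code.

```python
import json

def to_anserini_doc(doc_id, term_weights, scale=100):
    """Convert a dict of SPLADE float term weights into an Anserini
    JsonVectorCollection record (hypothetical helper).
    Weights are quantized to integers: multiply by `scale` and round;
    terms that quantize to zero are dropped."""
    vector = {t: round(w * scale) for t, w in term_weights.items()}
    vector = {t: w for t, w in vector.items() if w > 0}
    return {"id": doc_id, "vector": vector}

# Hypothetical document with made-up weights
record = to_anserini_doc("d1", {"splade": 1.23, "the": 0.001})
print(json.dumps(record))  # -> {"id": "d1", "vector": {"splade": 123}}
```

Writing one such record per line for every document yields a file that can then be indexed with Anserini's impact indexing options.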

thongnt99 commented 2 years ago

Thanks. Looking forward to your future release.

thongnt99 commented 2 years ago

Hi @thibault-formal, I wonder how the term weights were normalized. Were they multiplied by 100 and rounded to the closest integer?

cadurosar commented 2 years ago

> I wonder how the term weights were normalized. Were they multiplied by 100 and rounded to the closest integer?

Hi @thongnt99, you are correct, we converted the term weights to integers in exactly that way.
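The quantization confirmed above is a one-liner on the model's output weights; a minimal sketch (function name is hypothetical):

```python
import torch

def quantize_weights(weights, scale=100):
    """Quantize float term weights as described in this thread:
    multiply by 100 and round to the nearest integer."""
    return torch.round(weights * scale).to(torch.int)

w = torch.tensor([0.012, 1.237, 0.499])
print(quantize_weights(w).tolist())  # -> [1, 124, 50]
```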

thongnt99 commented 2 years ago

Thanks. I ask because the weights don't match those generated by my code. There seems to be a small mistake in this code: `self.agg = torch.sum` should be `self.agg = agg`.
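To illustrate why this one line matters: with `self.agg = torch.sum` hard-coded, the constructor silently ignores its `agg` argument, so a model configured for max pooling would sum token weights instead and produce different term weights. The sketch below is a hypothetical, simplified module (class name and pooling details assumed), not the repository's notebook code.

```python
import torch

class SpladePooling(torch.nn.Module):
    """Minimal sketch of the bug discussed above."""
    def __init__(self, agg="max"):
        super().__init__()
        # Buggy version: self.agg = torch.sum  (ignores `agg`, always sums)
        # Fixed version: honor the requested aggregation
        self.agg = torch.amax if agg == "max" else torch.sum

    def forward(self, logits):
        # SPLADE-style pooling: log-saturated ReLU, then aggregate
        # over the token dimension (dim=1)
        return self.agg(torch.log1p(torch.relu(logits)), dim=1)

# (batch=2, tokens=5, vocab=30522) -> one weight per vocabulary term
out = SpladePooling(agg="max")(torch.randn(2, 5, 30522))
print(out.shape)  # -> torch.Size([2, 30522])
```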

thibault-formal commented 2 years ago

Hi @thongnt99, indeed, it was a silly mistake I made when creating the notebook! It is fixed now, thanks for spotting it. Best, Thibault

thongnt99 commented 2 years ago

Hi, just in case it is useful: I re-trained distilsplade_v2 and re-generated the term weights myself, and searching with pyserini took less than 4 minutes (MRR@10 = 0.3744). On the same machine, it took 11 minutes with the term weights provided here.

cadurosar commented 2 years ago

Sorry, it took us a while, but we have now released instructions and code for evaluating on MS MARCO with Anserini. We have also released some new models :) https://github.com/naver/splade/tree/main/anserini_evaluation