naver / splade

SPLADE: sparse neural search (SIGIR21, SIGIR22)

Training by dot product and evaluation via inverted index? #11

Closed jordane95 closed 2 years ago

jordane95 commented 2 years ago

Hey, I recently read your SPLADEv2 paper. It's very insightful! But I still have a few questions about it.

  1. Is the model trained with the dot product as the similarity function in the contrastive loss?
  2. Is evaluation on MS MARCO performed via an inverted index backed by Anserini?
  3. Is evaluation on BEIR implemented with sentence-transformers, hence also via dot product?
  4. How strongly can you guarantee the sparsity of the learned representations, since they are only softly regularized by the L1 and FLOPS losses? Do you use a tuned threshold to zero out near-zero values?
thibault-formal commented 2 years ago

Hey, thanks for your interest in SPLADE!

  1. The model is indeed trained with the dot product (see the sketch after this list).
  2. We have some internal code that relies on a custom inverted index with Numba for indexing and retrieval; we should release it soon. But as you said, you can also use Anserini, following the guidelines here. Note that the two arguments -impact -pretokenized tell Anserini to use the dot product (a toy illustration of impact-style retrieval is sketched further below).
  3. Yes; we should refactor this part too.
  4. Note that the model relies on a ReLU activation, so training with L1/FLOPS directly makes the representations sparse (hence, no need for a threshold): a single forward pass of the model outputs sparse representations.
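
To make points 1 and 4 concrete, here is a minimal sketch (not the repository's actual training code) of a SPLADE-style encoder: log(1 + ReLU(MLM logits)), max-pooled over tokens, yields exactly-sparse vocabulary-sized vectors with no thresholding; queries and documents are scored by dot product; and the FLOPS regularizer is computed over the document batch. The bert-base-uncased checkpoint and the function name are illustrative assumptions.

```python
# Minimal sketch, assuming a BERT MLM backbone; not the repo's training code.
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
mlm = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

def splade_rep(texts):
    """Return a (batch, vocab) representation; ReLU + log makes it exactly sparse."""
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    logits = mlm(**batch).logits                                   # (B, L, V)
    # log(1 + ReLU(logits)), masked over padding, then max-pooled over tokens
    weights = torch.log1p(torch.relu(logits)) * batch["attention_mask"].unsqueeze(-1)
    return weights.max(dim=1).values                               # exact zeros, no threshold

q = splade_rep(["what is sparse retrieval"])
d = splade_rep(["SPLADE learns sparse lexical representations", "unrelated text"])

scores = q @ d.T                          # dot product used for training and ranking
flops = (d.mean(dim=0) ** 2).sum()        # FLOPS regularizer on the document batch
print(scores, flops, (d != 0).float().mean())   # last value: fraction of non-zero dims
```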

Best, Thibault
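
For point 2, a toy illustration (made-up weights, not SPLADE output and not the Anserini code path) of why impact-style retrieval over an inverted index reproduces the dot product: each posting stores a term weight, and a document's score accumulates query weight times document weight over shared terms.

```python
# Toy impact-style inverted index; term weights are illustrative assumptions.
from collections import defaultdict

docs = {
    "d1": {"sparse": 2.1, "retrieval": 1.4},
    "d2": {"neural": 1.8, "retrieval": 0.7},
}

# Build the inverted index: term -> list of (doc_id, weight) postings
index = defaultdict(list)
for doc_id, weights in docs.items():
    for term, w in weights.items():
        index[term].append((doc_id, w))

def search(query_weights):
    """Score documents by accumulating weight products, i.e. the dot product."""
    scores = defaultdict(float)
    for term, qw in query_weights.items():
        for doc_id, dw in index.get(term, []):
            scores[doc_id] += qw * dw
    return sorted(scores.items(), key=lambda kv: -kv[1])

print(search({"sparse": 1.0, "retrieval": 0.5}))
# -> [('d1', 2.8), ('d2', 0.35)], i.e. the dot products of the query with d1 and d2
```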

jordane95 commented 2 years ago

That's clear. Thank you for your reply.