Closed cramraj8 closed 9 months ago
@MXueguang Thank you!
@MXueguang does the new branch (main branch) currently incorporate both mean pooling and temperature in the loss?
--pooling mean \
--temperature xxx
Yes, you can set these two arguments during training to match the Contriever setup.
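For reference, a minimal sketch of how these two flags might appear in a training invocation. Only `--pooling mean` and `--temperature` are confirmed in this thread; the other arguments and values here are illustrative placeholders, not the exact Contriever recipe:

```shell
# Hypothetical tevatron training call; only --pooling and --temperature
# are confirmed above, the remaining flags/values are assumptions.
python -m tevatron.driver.train \
  --model_name_or_path bert-base-uncased \
  --output_dir ./retriever_out \
  --pooling mean \
  --temperature 0.05 \
  --per_device_train_batch_size 128
```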
Sounds good, it's already working well for me! I found from another thread that temperature is a sensitive parameter during Contriever training, and I also found that performance substantially improves or drops with different batch size and temperature values. Any idea why, and are there optimum values you found? I am experimenting with multilingual datasets, TyDi and MIRACL.
@crystina-z can comment more here.
Is there any T5-encoder-based Contriever pre-trained/fine-tuned model available on HuggingFace? I would like to see the performance boost.
Hi @cramraj8, we actually didn't actively tune batch size and temperature. In the final config, we use a batch size of 128 and the default temperature. You'll likely find a setting that outperforms this; please open a PR if you do!
Yes, but it needs some modification.
https://github.com/texttron/tevatron/blob/2e5d00ee21d5a7db0bd2ea1463c9150a572106d4/src/tevatron/modeling/dense.py#L33 the cls token representation needs to be changed to an average pooling representation.
When computing the loss, a temperature needs to be applied to the scores.
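The two changes above can be sketched as follows. This is a minimal NumPy illustration of the idea, not the tevatron implementation itself: masked mean pooling replaces taking the cls token, and the dot-product scores are divided by a temperature before the softmax cross-entropy loss. Function names and the default temperature value are assumptions for illustration.

```python
import numpy as np

def mean_pool(token_embeddings, attention_mask):
    # Average token representations over non-padded positions,
    # instead of taking the cls (first) token representation.
    # token_embeddings: (seq_len, dim), attention_mask: (seq_len,) of 0/1
    mask = attention_mask[:, None].astype(float)
    return (token_embeddings * mask).sum(axis=0) / mask.sum()

def contrastive_loss(query_emb, passage_embs, pos_idx, temperature=0.05):
    # Dot-product scores scaled by temperature, then softmax
    # cross-entropy against the positive passage index.
    scores = passage_embs @ query_emb / temperature
    m = scores.max()  # stabilize logsumexp
    lse = m + np.log(np.exp(scores - m).sum())
    return -(scores[pos_idx] - lse)
```

A lower temperature sharpens the score distribution, which is consistent with the sensitivity discussed above: small changes in temperature can noticeably change the loss landscape.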