Hi, thanks for your interest.
How were you evaluating the cross-encoders? Since the cross-encoder uses a different formulation, the two sentences of a pair need to be concatenated into one string before being fed to the model. Specifically, you can use our script:
>> python src/eval.py \
     --model_name_or_path "cambridgeltl/trans-encoder-cross-simcse-roberta-large" \
     --mode cross \
     --task sts_sickr
as mentioned in the README (where mode specifies whether to evaluate in the bi-encoder or cross-encoder formulation).
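For reference, this is roughly what the cross formulation does, as a minimal sketch with plain transformers; it assumes the checkpoint loads as a single-logit sequence-classification model, and src/eval.py remains the authoritative evaluation path:

# Minimal sketch of cross-encoder scoring: the tokenizer joins the sentence
# pair into one input (with separator tokens), and the model emits a single
# similarity logit. Assumes the checkpoint carries a one-logit regression
# head; see src/eval.py for the exact pipeline.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

name = "cambridgeltl/trans-encoder-cross-simcse-roberta-large"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name, num_labels=1)
model.eval()

# The tokenizer concatenates the pair into one string-like input, which is
# the formulation a cross-encoder expects.
inputs = tokenizer(
    "A man is playing a guitar.",
    "Someone plays an instrument.",
    return_tensors="pt",
    truncation=True,
)
with torch.no_grad():
    score = model(**inputs).logits.squeeze().item()
print(f"predicted similarity: {score:.4f}")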
Hope this is helpful.
That was absolutely the problem; thank you!
Hi there, I find this work very interesting, and I was trying to replicate your results using the models you've shared on Hugging Face. The bi-encoder models behave as expected; however, the cross-encoders score much lower than I expect on STS (in the 30s-40s rather than the 70s-80s), which makes me think I'm missing a step.
Should the Hugging Face pretrained models work on STS out of the box, or do I need to fine-tune them on the train set of each STS dataset?
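For context, here is a simplified sketch of how I'm scoring sentence pairs; I'm running the same loop for both the bi- and cross-encoder checkpoints (the pooling choice and other details below are approximations of my actual script):

# Simplified sketch of my evaluation loop: encode each sentence separately,
# then take the cosine similarity of the two embeddings.
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModel

name = "cambridgeltl/trans-encoder-cross-simcse-roberta-large"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModel.from_pretrained(name)
model.eval()

def embed(sentence: str) -> torch.Tensor:
    inputs = tokenizer(sentence, return_tensors="pt", truncation=True)
    with torch.no_grad():
        # [CLS] pooling; my real script may differ in this detail.
        return model(**inputs).last_hidden_state[:, 0]

a = embed("A man is playing a guitar.")
b = embed("Someone plays an instrument.")
print(F.cosine_similarity(a, b).item())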
The models at issue are:
Thanks for any advice you can give!