UKPLab / sentence-transformers

State-of-the-Art Text Embeddings
https://www.sbert.net
Apache License 2.0

Is sentence-transformers' distilbert-base-nli-stsb-mean-tokens model the same as stsb-distilbert-base or stsb-distilroberta-base-v2? #981


MyBruso commented 3 years ago

Hello @nreimers, I am wondering whether the three models below are exactly the same, and whether distilbert-base-nli-stsb-mean-tokens was simply renamed:

  1. distilbert-base-nli-stsb-mean-tokens
  2. stsb-distilbert-base
  3. stsb-distilroberta-base-v2

I see that README.md was updated on 11 January 2021 to remove distilbert-base-nli-stsb-mean-tokens from the Performance list, and stsb-distilbert-base was added with the same accuracy. Later, stsb-distilbert-base was also removed from this list (on 1 May 2021).

So what is the latest name of distilbert-base-nli-stsb-mean-tokens, and is there any specific reason for its removal? Which other model would you suggest for deriving sentence embeddings in its place?

nreimers commented 3 years ago

All three models can still be used. Models 1 and 2 are identical; only the name changed.

Model 3 used a different, improved training procedure, and it is based on a different network architecture.

I can recommend switching to the latest paraphrase models, which you can find here: https://sbert.net/docs/pretrained_models.html

They are way better than the stsb-distilbert-base model.
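
For illustration, a minimal check that models 1 and 2 really are the same checkpoint (both names still resolve; whether the embeddings match exactly may depend on the library version):

```python
import numpy as np
from sentence_transformers import SentenceTransformer

sentences = ["A man is eating food."]

# Model 1 (old name) and model 2 (new name) point to the same weights,
# so the embeddings should match.
emb_old = SentenceTransformer("distilbert-base-nli-stsb-mean-tokens").encode(sentences)
emb_new = SentenceTransformer("stsb-distilbert-base").encode(sentences)

print(np.allclose(emb_old, emb_new))  # expected: True
```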

MyBruso commented 3 years ago

Thank you @nreimers for clarifying this change; I will check the list you shared.

I revisited this README Performance list for two things:

  1. Lightweight model (small size)
  2. Multilingual support with good accuracy in deriving embeddings

Earlier I was using distilbert-base-nli-stsb-mean-tokens, as it was lightweight and had high accuracy compared to the other models in the list.

Considering these criteria, which model would you recommend?

nreimers commented 3 years ago

For English I can recommend the MiniLM models; they are available with 3, 6, and 12 layers.
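
For example, the three sizes can be swapped in by name alone (the paraphrase-MiniLM-*-v2 names below are taken from the pretrained-models page; check there for the current naming):

```python
from sentence_transformers import SentenceTransformer

sentences = ["The cat sits on the mat.", "A feline rests on a rug."]

# Fewer layers -> smaller and faster, usually at a small cost in accuracy.
for name in ("paraphrase-MiniLM-L3-v2",
             "paraphrase-MiniLM-L6-v2",
             "paraphrase-MiniLM-L12-v2"):
    model = SentenceTransformer(name)
    embeddings = model.encode(sentences)
    print(name, embeddings.shape)
```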

For multilingual use there are not as many choices, because models must be larger when they support more languages.

There I would recommend the paraphrase-multilingual-mpnet-base-v2 model.
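
For example, a quick sketch of cross-lingual similarity with that model (the sentences are illustrative; on older releases util.cos_sim is called util.pytorch_cos_sim):

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("paraphrase-multilingual-mpnet-base-v2")

# The same meaning in English, German, and Hindi should map to nearby vectors.
sentences = [
    "The weather is lovely today.",
    "Das Wetter ist heute schön.",
    "आज मौसम बहुत अच्छा है।",
]
embeddings = model.encode(sentences, convert_to_tensor=True)

# Pairwise cosine similarities; the off-diagonal (cross-lingual) scores should be high.
print(util.cos_sim(embeddings, embeddings))
```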

MyBruso commented 3 years ago

Thank you @nreimers. Would you be able to point me to a link that lists the languages supported by paraphrase-multilingual-mpnet-base-v2?

nreimers commented 3 years ago

You can find the list on the page I linked above; the language information is at the bottom.