Adapted multilingual model for symmetric semantic search

UKPLab / sentence-transformers

State-of-the-Art Text Embeddings

https://www.sbert.net

Apache License 2.0

Adapted multilingual model for symmetric semantic search #1040

Open Matthieu-Tinycoaching opened 3 years ago

Matthieu-Tinycoaching commented 3 years ago

Hi,

Would the paraphrase-multilingual-MiniLM-L12-v2 model be suitable for symmetric semantic search?

Thanks!

nreimers commented 3 years ago

You can already use the model for that.

Matthieu-Tinycoaching commented 3 years ago

Thanks @nreimers!

Could you estimate the compute time of util.semantic_search for one query embedding against thousands of corpus embeddings, relative to the compute time of encoding that one query?

I have already load-tested sentence-transformers for computing embeddings only, but I can't figure out whether adding util.semantic_search on top of that would still be viable for a cloud deployment.

nreimers commented 3 years ago

That depends on too many factors. Simply try it with some random embeddings, which you can generate with numpy (e.g. an n x 768 random matrix).
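
A minimal sketch of that experiment, using only NumPy. The helper `cosine_top_k` is hypothetical and written for illustration; it stands in for the cosine-similarity top-k lookup that a semantic search performs and is not the library's implementation. The corpus size (10,000) and `top_k` value are arbitrary assumptions, and the 768 dimension just follows the example above, so substitute your model's actual embedding size and your real corpus size before drawing conclusions.

```python
import time

import numpy as np


def cosine_top_k(query, corpus, top_k=10):
    """Hypothetical helper: indices and scores of the top_k corpus rows
    most similar to the query by cosine similarity."""
    # Normalize so that a dot product equals cosine similarity.
    q = query / np.linalg.norm(query)
    c = corpus / np.linalg.norm(corpus, axis=1, keepdims=True)
    scores = c @ q
    top_k = min(top_k, len(scores))
    # argpartition finds the top_k candidates without sorting everything.
    idx = np.argpartition(-scores, top_k - 1)[:top_k]
    # Sort only the selected candidates, best first.
    idx = idx[np.argsort(-scores[idx])]
    return idx, scores[idx]


rng = np.random.default_rng(0)
n, dim = 10_000, 768  # assumed sizes; adjust to your corpus and model
corpus = rng.standard_normal((n, dim)).astype(np.float32)
query = rng.standard_normal(dim).astype(np.float32)

start = time.perf_counter()
idx, scores = cosine_top_k(query, corpus, top_k=5)
elapsed = time.perf_counter() - start
print(f"top-5 search over {n} x {dim} embeddings: {elapsed * 1000:.2f} ms")
```

To answer the original question for a specific setup, time the model's encoding of one query the same way and compare the two numbers on the target hardware.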