UKPLab / sentence-transformers

State-of-the-Art Text Embeddings
https://www.sbert.net
Apache License 2.0

Model.encode() processing time is high for larger lists #989

Open sivachaitanya opened 3 years ago

sivachaitanya commented 3 years ago

Thank you for the great library. In my application, I use model.encode([apilist]) for each API request and compute its cosine similarity with model.encode([unique_list]). This unique list differs from request to request and can be 200-500 values long, while apilist contains only a single value. During internal testing I found that model.encode([unique_list]) takes significant processing power: CPU usage peaks at 100%, essentially slowing down request processing. Any suggestions on how to speed up model.encode() for unique_list?
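
For reference, a minimal sketch of the pattern described above (the model name and example data are placeholders, not from the post; `util.cos_sim` is the library's cosine-similarity helper):

```python
from sentence_transformers import SentenceTransformer, util

# Placeholder model; any pretrained sentence-transformers model works here.
model = SentenceTransformer("paraphrase-distilroberta-base-v1")

apilist = ["example query from the api request"]                # 1 value per request
unique_list = [f"candidate sentence {i}" for i in range(300)]   # 200-500 values

query_emb = model.encode(apilist, convert_to_tensor=True)       # cheap: 1 sentence
cand_emb = model.encode(unique_list, convert_to_tensor=True)    # the expensive call
scores = util.cos_sim(query_emb, cand_emb)                      # shape: (1, 300)
```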

nreimers commented 3 years ago

Hi @sivachaitanya

Yes, the models are quite compute intensive. Have a look here for an overview of the different models and how fast they are: https://www.sbert.net/docs/pretrained_models.html#sentence-embedding-models

If you need a quick model, I recommend using the 'paraphrase-MiniLM-L6-v2' model.
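
Swapping in the smaller model is a one-line change. A rough timing sketch (assuming the paraphrase-MiniLM-L6-v2 model named above; the numbers will vary with hardware and batch_size):

```python
import time
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("paraphrase-MiniLM-L6-v2")  # 6-layer model, faster on CPU

sentences = ["some example sentence"] * 300
start = time.perf_counter()
model.encode(sentences, batch_size=64)  # batch_size is an encode() knob worth tuning
print(f"encoded {len(sentences)} sentences in {time.perf_counter() - start:.2f}s")
```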

sivachaitanya commented 3 years ago

Thanks Nils, will give it a try. Is there a way we can use multiprocessing to speed up the model.encode() call for better performance?

nreimers commented 3 years ago

Torch already uses multiple cores for the computation, if this is supported by your host system and CPU. In that case, running multi-process encoding will not really help.
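
To see (or cap) how many CPU threads torch uses, a small check (standard torch calls, not specific to sentence-transformers):

```python
import torch

print(torch.get_num_threads())  # threads torch uses for intra-op CPU parallelism
torch.set_num_threads(4)        # optionally cap it, e.g. to leave cores for the web server
```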

Otherwise you can have a look here:
https://www.sbert.net/examples/applications/computing-embeddings/README.html#multi-process-multi-gpu-encoding
https://github.com/UKPLab/sentence-transformers/blob/master/examples/applications/computing-embeddings/computing_embeddings_mutli_gpu.py

You can pass a device list such as ['cpu', 'cpu', 'cpu'] to encode with e.g. three CPU processes, as sketched below.
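
A minimal sketch of multi-process CPU encoding along the lines of the linked example, using the pool helpers from the docs above (the sentence list is a placeholder; the `if __name__ == "__main__"` guard is needed because the workers are spawned as separate processes):

```python
from sentence_transformers import SentenceTransformer

if __name__ == "__main__":
    model = SentenceTransformer("paraphrase-MiniLM-L6-v2")
    sentences = [f"sentence {i}" for i in range(10000)]  # placeholder data

    # One worker process per list entry: three 'cpu' entries -> 3 CPU processes.
    pool = model.start_multi_process_pool(["cpu", "cpu", "cpu"])
    embeddings = model.encode_multi_process(sentences, pool)
    model.stop_multi_process_pool(pool)

    print(embeddings.shape)  # (10000, embedding_dim)
```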