sivachaitanya opened 3 years ago
Hi @sivachaitanya
Yes, the models are quite compute intensive. Have a look here for the different models and how fast they are: https://www.sbert.net/docs/pretrained_models.html#sentence-embedding-models
If you need a quick model, I can recommend the 'paraphrase-MiniLM-L6-v2' model.
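For reference, a minimal sketch of loading the recommended model (the input sentence is just a placeholder):

```python
from sentence_transformers import SentenceTransformer

# Load the smaller, faster model recommended above
model = SentenceTransformer('paraphrase-MiniLM-L6-v2')

# encode() accepts a list of sentences and returns a numpy array of embeddings
embeddings = model.encode(['A quick placeholder sentence.'])
print(embeddings.shape)  # (1, 384) -- MiniLM-L6 produces 384-dim vectors
```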
Thanks Nils, will give it a try. Is there a way we can use multiprocessing for the model.encode() call for better performance?
Torch already uses multiple cores for the computation, if your host system and CPU support it. In that case, running multi-process encoding will not really help.
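If useful, the torch thread count can be inspected or capped directly (a small sketch; the cap of 4 is an arbitrary example):

```python
import torch

# Torch parallelizes CPU inference across threads by default
print(torch.get_num_threads())

# Optionally cap the thread count, e.g. to leave cores free for other work
torch.set_num_threads(4)
```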
Otherwise you can have a look here: https://www.sbert.net/examples/applications/computing-embeddings/README.html#multi-process-multi-gpu-encoding https://github.com/UKPLab/sentence-transformers/blob/master/examples/applications/computing-embeddings/computing_embeddings_mutli_gpu.py
You can pass a list like ['cpu', 'cpu', 'cpu'] to encode with, e.g., 3 CPU processes.
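A minimal sketch of the multi-process encoding described in the linked example, assuming 3 CPU workers (the sentence list is a placeholder):

```python
from sentence_transformers import SentenceTransformer

# The __main__ guard is required because the pool spawns worker processes
if __name__ == '__main__':
    model = SentenceTransformer('paraphrase-MiniLM-L6-v2')

    # Start 3 CPU worker processes
    pool = model.start_multi_process_pool(['cpu', 'cpu', 'cpu'])

    sentences = ['First sentence.', 'Second sentence.', 'Third sentence.']
    embeddings = model.encode_multi_process(sentences, pool)
    print(embeddings.shape)

    # Shut the workers down when done
    model.stop_multi_process_pool(pool)
```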
Thank you for the great library. In my application, I call model.encode([apilist]) for each API request and compute its cosine similarity with model.encode([unique_list]). The unique list differs from request to request and can be 200-500 values long, while apilist contains only a single value. During internal testing I found that model.encode([unique_list]) takes significant processing power: CPU usage peaks at 100%, essentially slowing down the request processing time. Any suggestions on how to deal with this situation, i.e. how I can speed up model.encode() for unique_list?
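For context, a sketch of the request pattern described above, assuming a recent sentence-transformers version (the values of apilist and unique_list are placeholders; util.cos_sim is the library's cosine-similarity helper):

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer('paraphrase-MiniLM-L6-v2')

# Per-request inputs as described above (placeholder values)
apilist = ['single incoming api value']
unique_list = ['candidate 1', 'candidate 2']  # 200-500 values per request in practice

# Encode both sides and score the single query against every candidate
query_emb = model.encode(apilist, convert_to_tensor=True)
corpus_emb = model.encode(unique_list, convert_to_tensor=True)
scores = util.cos_sim(query_emb, corpus_emb)  # shape: (1, len(unique_list))
```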