Open · Matthieu-Tinycoaching opened this issue 3 years ago

Hi,

I would like to use sentence-transformers for semantic similarity. On the one hand, I have query sentences whose embeddings will be computed on the fly. On the other hand, I have corpus sentences whose embeddings will be pre-computed before the application launches. What are the best options for storing the IDs + pre-computed embeddings of the corpus sentences?

1) Is pickle a good solution, and up to how many sentences does it work well?
2) When is it necessary to use FAISS or a dedicated database?

Thanks!
Yes, pickle is fine.
The question is at what point exact search becomes too slow and you need ANN. On CPU, you can do exact search for up to roughly 100k - 500k entries. If you keep your corpus embeddings on a GPU, you can handle up to roughly 5M.
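A minimal sketch of that exact-search path, with pickle used to store IDs + embeddings together (the model name, file name, and sentences below are illustrative, not from this thread):

```python
import pickle

from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

# Pre-compute once: store corpus IDs and embeddings side by side.
corpus_ids = ["doc-1", "doc-2", "doc-3"]
corpus_sentences = [
    "A man is eating food.",
    "A monkey is playing drums.",
    "A woman is riding a horse.",
]
corpus_embeddings = model.encode(corpus_sentences, convert_to_tensor=True)

with open("corpus_embeddings.pkl", "wb") as f:
    pickle.dump({"ids": corpus_ids, "embeddings": corpus_embeddings}, f)

# At query time: load the stored embeddings, then run exact (brute-force) search.
with open("corpus_embeddings.pkl", "rb") as f:
    stored = pickle.load(f)

query_embedding = model.encode("What is the man doing?", convert_to_tensor=True)
hits = util.semantic_search(query_embedding, stored["embeddings"], top_k=2)[0]
for hit in hits:
    print(stored["ids"][hit["corpus_id"]], hit["score"])
```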
After that, using ANN makes sense: https://www.sbert.net/examples/applications/semantic-search/README.html#approximate-nearest-neighbor
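The linked example covers hnswlib among other libraries; here is a rough sketch of that ANN route, with the index parameters chosen as plausible defaults rather than tuned values:

```python
import hnswlib
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")
corpus_sentences = ["A man is eating food.", "A monkey is playing drums."]
corpus_embeddings = model.encode(corpus_sentences)  # numpy array, shape (n, dim)

# Build the HNSW index once, offline; "cosine" matches the similarity used above.
index = hnswlib.Index(space="cosine", dim=corpus_embeddings.shape[1])
index.init_index(max_elements=len(corpus_sentences), ef_construction=400, M=64)
index.add_items(corpus_embeddings, np.arange(len(corpus_sentences)))
index.save_index("corpus_hnswlib.index")

# At query time: ef trades recall against speed (must be >= k).
index.set_ef(50)
query_embedding = model.encode("What is the man doing?")
neighbor_ids, distances = index.knn_query(query_embedding, k=2)
```

The index can be built offline and shipped alongside the pickled IDs, so at serving time you only load it and run `knn_query`.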
Thanks @nreimers for your feedback.
@nreimers I have a follow-up question about running inference from a containerized Docker image on cloud services. What would be the most efficient way to make the pre-computed corpus embeddings available on each query request: calling a cloud database on each request, or loading the pickled embeddings?
Thanks for your feedback. This was easy to handle on a local computer, but for deployment to cloud services it is less obvious which way of loading the embeddings is fastest.
Hi @Matthieu-Tinycoaching, I think loading the pickle file would be the most efficient.
You could also think about deploying a vector search database like Elasticsearch, OpenSearch/OpenDistro, Vespa.ai, etc.
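For the containerized setup, the usual pattern is to load the pickle once when the process starts, not on each request. A sketch assuming a FastAPI service (the framework and the pickle layout from the sketch above are illustrative assumptions):

```python
import pickle

from fastapi import FastAPI
from sentence_transformers import SentenceTransformer, util

app = FastAPI()

# Loaded once per container start, not once per request: the model load and
# pickle read happen at import time, so each request only pays for encode + search.
model = SentenceTransformer("all-MiniLM-L6-v2")
with open("corpus_embeddings.pkl", "rb") as f:
    stored = pickle.load(f)  # {"ids": [...], "embeddings": tensor}

@app.get("/search")
def search(q: str, top_k: int = 5):
    query_embedding = model.encode(q, convert_to_tensor=True)
    hits = util.semantic_search(query_embedding, stored["embeddings"], top_k=top_k)[0]
    return [{"id": stored["ids"][h["corpus_id"]], "score": h["score"]} for h in hits]
```

With this layout, per-request latency is dominated by the query encoding and the search itself, which is exactly where exact search vs. ANN (discussed above) matters.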