ThomasDelteil / VisualSearch_MXNet

Visual Search using Apache MXNet and gluon

HNSW model #2

Closed: ChenmiaoYu closed this issue 5 years ago

ChenmiaoYu commented 6 years ago

Hello, thanks very much for your contribution. I have a question. Do I have to reload the HNSW model every time I search for an image? I have 20 million images, and loading the model is very slow. Is there any way to turn the search into a service? Also, how do you get the results so quickly? That is so cool, thank you.

ThomasDelteil commented 6 years ago

@ChenmiaoYu, no, you don't have to do that. If you look in the mms folder, you will see the elements you need to deploy the service using MXNet Model Server. I am planning a more detailed write-up that covers this.

In short, you load the index once at start-up time in your service, then query it each time you get the features for an image. Have a look at the service.py file.
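The load-once pattern described above can be sketched roughly as follows. This is a minimal illustration, not the repository's service.py: `BruteForceIndex` is a hypothetical stand-in for the HNSW index (it does an exact search over a tiny list), and `SearchService`, `handle`, and `load_count` are names invented for this example.

```python
import math
from typing import Callable, List, Sequence, Tuple


class BruteForceIndex:
    """Hypothetical stand-in for an HNSW index: exact search over a small list."""

    def __init__(self, vectors: Sequence[Sequence[float]]):
        self.vectors = [list(v) for v in vectors]

    def query(self, vector: Sequence[float], k: int = 1) -> List[Tuple[int, float]]:
        # Return the k nearest (id, distance) pairs by Euclidean distance.
        scored = [(i, math.dist(vector, v)) for i, v in enumerate(self.vectors)]
        return sorted(scored, key=lambda pair: pair[1])[:k]


class SearchService:
    """Loads the index once at start-up, then reuses it for every request."""

    def __init__(self, load_index: Callable[[], BruteForceIndex]):
        self.load_count = 0
        self._index = self._load(load_index)  # expensive step, done exactly once

    def _load(self, load_index: Callable[[], BruteForceIndex]) -> BruteForceIndex:
        self.load_count += 1
        return load_index()

    def handle(self, features: Sequence[float], k: int = 1) -> List[int]:
        # Per-request work is only a query against the in-memory index.
        return [idx for idx, _ in self._index.query(features, k)]


# Start-up: build the service (and load the index) once.
service = SearchService(lambda: BruteForceIndex([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]))

# Requests: each call reuses the index that is already in memory.
first = service.handle([0.9, 1.2], k=2)
second = service.handle([4.0, 4.0], k=1)
```

In a real deployment the loader would read the saved index from disk, and the serving framework would keep the `SearchService` object alive between requests, so the slow load is paid once per process rather than once per search.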

ChenmiaoYu commented 6 years ago

Thank you very much for your answer. I will mainly study how to deploy the server. However, I found that the accuracy of the model on my dataset is not very high. With millions of vectors in the index, I ran a top-1 query for each of the first 10,000 vectors, and only half of them returned themselves (accuracy is only 50%). I use 128-dimensional vectors. Is the vector dimension too small to store enough information?
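The check described above (query each indexed vector and see whether its top-1 result is itself) can be written as a small helper. This is a sketch under assumptions: `self_recall_at_1`, `exact_top1`, and `lossy_top1` are hypothetical names, and the two query functions are toy stand-ins for an exact search and an approximate index.

```python
import math
from typing import Callable, Sequence

Vector = Sequence[float]


def self_recall_at_1(top1: Callable[[Vector], int],
                     vectors: Sequence[Vector],
                     sample_size: int) -> float:
    """Fraction of the first `sample_size` vectors whose top-1 result is their own id."""
    n = min(sample_size, len(vectors))
    if n == 0:
        raise ValueError("need at least one vector to evaluate")
    hits = sum(1 for i in range(n) if top1(vectors[i]) == i)
    return hits / n


vectors = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 3.0)]


def exact_top1(q: Vector) -> int:
    # Exact nearest neighbour over every stored vector.
    return min(range(len(vectors)), key=lambda i: math.dist(q, vectors[i]))


def lossy_top1(q: Vector) -> int:
    # Toy "approximate" search that only ever looks at even ids.
    return min(range(0, len(vectors), 2), key=lambda i: math.dist(q, vectors[i]))


exact_score = self_recall_at_1(exact_top1, vectors, sample_size=4)
lossy_score = self_recall_at_1(lossy_top1, vectors, sample_size=4)
```

Comparing the two scores helps separate causes. If an exact search also scores low, the data itself is the likely culprit (for example, duplicate vectors). If only the approximate index scores low, the index parameters are the more likely cause.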

ThomasDelteil commented 6 years ago

@ChenmiaoYu you can also play with the ef parameter of your index, for example p.set_ef(1000). Higher values make the k-NN search less approximate, that is, more accurate.
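To see why a larger ef helps, here is a toy sketch of the best-first search that HNSW-style indexes run over a single graph layer, where ef caps the size of the running result list. It is an illustration under assumptions, not hnswlib's implementation: the graph is hand-built, there is only one layer, and `search_layer` is a name chosen for this example.

```python
import heapq
import math
from typing import Dict, List, Sequence, Tuple

Point = Sequence[float]


def search_layer(graph: Dict[int, List[int]], points: Sequence[Point],
                 query: Point, entry: int, ef: int) -> Tuple[List[int], int]:
    """Best-first search over one graph layer, keeping at most `ef` results.

    Returns (ids sorted nearest first, number of distance evaluations).
    """
    d0 = math.dist(query, points[entry])
    evaluations = 1
    visited = {entry}
    candidates = [(d0, entry)]  # min-heap: closest unexpanded node first
    results = [(-d0, entry)]    # max-heap via negation: worst kept result on top
    while candidates:
        d, node = heapq.heappop(candidates)
        if d > -results[0][0]:
            break               # no remaining candidate can improve the results
        for nb in graph[node]:
            if nb in visited:
                continue
            visited.add(nb)
            dn = math.dist(query, points[nb])
            evaluations += 1
            if len(results) < ef or dn < -results[0][0]:
                heapq.heappush(candidates, (dn, nb))
                heapq.heappush(results, (-dn, nb))
                if len(results) > ef:
                    heapq.heappop(results)  # drop the current worst result
    ids = [n for _, n in sorted((-nd, n) for nd, n in results)]
    return ids, evaluations


# Hand-built toy graph: node 3 is the true nearest neighbour of the query,
# but it is reachable only through node 2, which looks worse than node 1.
points = [(0.0, 0.0), (4.0, 0.0), (3.0, 3.0), (9.0, 0.0)]
graph = {0: [1, 2], 1: [0], 2: [0, 3], 3: [2]}
query = (10.0, 0.0)

narrow_ids, narrow_cost = search_layer(graph, points, query, entry=0, ef=1)
wide_ids, wide_cost = search_layer(graph, points, query, entry=0, ef=2)
```

With ef=1 the search discards node 2 as soon as it finds node 1 and ends in a dead end. With ef=2 it keeps node 2 as a second candidate, expands it, and reaches the true nearest neighbour at the price of one more distance evaluation. On a real index the same trade-off shows up as higher recall and slower queries.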

visuazn commented 5 years ago

Hey, sorry, correct me if I am wrong. I checked the documentation, and it says a higher ef gives more accuracy but makes the search slower.

https://github.com/nmslib/hnswlib/blob/master/ALGO_PARAMS.md