yahoo / lopq

Training of Locally Optimized Product Quantization (LOPQ) models for approximate nearest neighbor search of high dimensional data in Python and Spark.
Apache License 2.0

Dynamic index update #29

Closed by ashfaq92 4 years ago

ashfaq92 commented 4 years ago

Hi!

Do I understand correctly that LOPQ does not currently support dynamic index update / adding new data to an existing dataset?

dmllr commented 4 years ago

Hi @ashfaq92,

You're right.

And, to be clear, dynamic updates are well outside the scope of search algorithms in general, and of LOPQ in particular. From a system design point of view, dynamic updating should be a function of your service, which uses LOPQ as its backbone search engine. You may well replace LOPQ with something else later and keep the update subsystem unmodified.
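One way to picture that separation is the sketch below. It is only an illustration: `SearchBackend`, `BruteForceBackend`, and `VectorService` are hypothetical names that do not come from LOPQ, and the backend shown is an exact linear scan standing in for whatever engine the service actually wraps. The service owns ID assignment and inserts, so the backend can be swapped without touching that logic.

```python
from typing import Dict, List, Protocol, Sequence, Tuple

Vector = Sequence[float]


class SearchBackend(Protocol):
    """Hypothetical minimal interface the service expects from any search engine."""

    def add(self, item_id: int, vector: Vector) -> None: ...

    def search(self, query: Vector, limit: int) -> List[Tuple[int, float]]: ...


class BruteForceBackend:
    """Stand-in backend (exact linear scan); an LOPQ-based adapter could take its place."""

    def __init__(self) -> None:
        self._items: Dict[int, Vector] = {}

    def add(self, item_id: int, vector: Vector) -> None:
        self._items[item_id] = tuple(vector)

    def search(self, query: Vector, limit: int) -> List[Tuple[int, float]]:
        # Score every stored vector by squared Euclidean distance to the query.
        scored = [
            (item_id, sum((a - b) ** 2 for a, b in zip(query, vec)))
            for item_id, vec in self._items.items()
        ]
        scored.sort(key=lambda pair: pair[1])
        return scored[:limit]


class VectorService:
    """Owns ID assignment and updates; delegates search to the backend."""

    def __init__(self, backend: SearchBackend) -> None:
        self._backend = backend
        self._next_id = 0

    def insert(self, vector: Vector) -> int:
        item_id = self._next_id
        self._next_id += 1
        self._backend.add(item_id, vector)
        return item_id

    def nearest(self, query: Vector, limit: int = 1) -> List[Tuple[int, float]]:
        return self._backend.search(query, limit)


service = VectorService(BruteForceBackend())
service.insert([0.0, 0.0])
service.insert([1.0, 1.0])
result = service.nearest([0.9, 0.8], limit=1)
```

Replacing `BruteForceBackend` with another class that satisfies the same two methods leaves `VectorService` unchanged, which is the point being made about keeping the update subsystem independent of the engine.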

ashfaq92 commented 4 years ago

Thank you for the prompt reply. My problem involves running a batch of n nearest-neighbor queries, one for each of n randomly generated vectors, against a dataset, and then inserting one of those vectors (chosen by some criterion) into the existing dataset. The dataset therefore grows by one vector after each batch of queries.

Can you please suggest anything that would help in this case?

dmllr commented 4 years ago

@ashfaq92, I believe LOPQSearcher.add_data or LOPQSearcher.add_codes would help you.
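For illustration, the query-then-insert loop described in this thread could be sketched as follows. This is not LOPQ code: `ToySearcher` is a hypothetical exact linear-scan stand-in, and only the method name `add_data` is borrowed from the comment above; its signature here, the `nearest` helper, and `grow_dataset` are assumptions. The selection rule (keep the candidate farthest from its nearest neighbor) is a placeholder, since the actual criterion was not stated.

```python
import random


class ToySearcher:
    """Exact linear-scan stand-in; only the method name add_data comes from the thread."""

    def __init__(self):
        self.items = {}  # id -> vector

    def add_data(self, data, ids=None):
        # Assign consecutive ids when none are given (assumes no deletions).
        if ids is None:
            start = len(self.items)
            ids = range(start, start + len(data))
        for item_id, vec in zip(ids, data):
            self.items[item_id] = tuple(vec)

    def nearest(self, x):
        """Return (id, squared distance) of the closest stored vector."""
        return min(
            ((i, sum((a - b) ** 2 for a, b in zip(x, v))) for i, v in self.items.items()),
            key=lambda pair: pair[1],
        )


def grow_dataset(searcher, rounds, n, dim, rng):
    for _ in range(rounds):
        # Generate n random candidate vectors.
        candidates = [[rng.random() for _ in range(dim)] for _ in range(n)]
        # Batch query: nearest neighbor of every candidate.
        hits = [searcher.nearest(c) for c in candidates]
        # Placeholder criterion (assumption): keep the candidate farthest from its neighbor.
        best = max(range(n), key=lambda k: hits[k][1])
        # Insert exactly one candidate, so the dataset grows by one per round.
        searcher.add_data([candidates[best]])


rng = random.Random(0)
searcher = ToySearcher()
searcher.add_data([[0.5, 0.5]])
grow_dataset(searcher, rounds=5, n=10, dim=2, rng=rng)
```

After five rounds the stand-in index holds the initial vector plus five inserted ones. With a real LOPQ searcher, the same loop shape would apply, with the library's own add and search calls in place of the stand-in's methods.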