FALCONN-LIB / FALCONN

FAst Lookups of Cosine and Other Nearest Neighbors (based on fast locality-sensitive hashing)
http://falconn-lib.org/
MIT License
1.13k stars 193 forks source link

Add a dynamic LSH table #2

Open ludwigschmidt opened 8 years ago

hxwu commented 7 years ago

Excuse me, could you please show me how to add a dynamic LSH table?

hxwu commented 7 years ago

I just want to insert or remove some record from the established LSH table

ludwigschmidt commented 7 years ago

Hi hxwu,

updates are supported in some of the core classes, in particular the DynamicLinearProbingHashTable and the DynamicCompositeHashTable:

https://github.com/ludwigschmidt/fth_lsh/blob/master/lsh_table/src/probing_hash_table.h https://github.com/ludwigschmidt/fth_lsh/blob/master/lsh_table/src/composite_hash_table.h

What we need next is a dynamic version of the StaticLSHTable in

https://github.com/FALCONN-LIB/FALCONN/blob/master/src/include/falconn/core/lsh_table.h

We should be able to re-use quite a bit of code here. The main difference is that the static LSH table always assumes that the key range is between 0 and data_set_size - 1. This allows us to simply keep a large array for checking whether a certain candidate already appears in the answer for the current query. If the key range is arbitrary, we need a less specialized set data structure here (a hash table would probably be the right thing). Overall it should not be a big performance issue, but it's good to get it right.

The code is organized so that the query objects in the LSHTable are only responsible for producing a list of candidates. The actual distances are then computed in the NearestNeighborQuery class:

https://github.com/ludwigschmidt/fth_lsh/blob/master/lsh_table/src/nn_query.h

Two questions about your application:

Cheers, Ludwig

pengfu008 commented 7 years ago

Insert and remove record is very important in my application, I need to dynamic update LSH table, I use python interface, Do you have the plan to support it? @ludwigschmidt