nmslib / hnswlib

Header-only C++/python library for fast approximate nearest neighbors
https://github.com/nmslib/hnswlib
Apache License 2.0
4.36k stars 645 forks source link

Support "skipping" nodes during traversal? #294

Open sourcesync opened 3 years ago

sourcesync commented 3 years ago

Is it possible to tell the HNSW algorithm to skip a given set of nodes during search/traversal?

Note that I do not want "delete" nodes.

I just want the traversal to skip a set of pre-determined nodes when it's deciding to add any of those "skip nodes" to the list of nearest neighbors, as its traversing the graph.

I think this means that for a "Top-K" list of nearest neighbors - the kind of search I describe may have to perform a few more traversals to get to K items ( since it may have "skipped" a few nodes along the way. )

I don't want "delete" because the "skip node list" may be different between searches on the same graph.

yurymalkov commented 3 years ago

Hu @gosha1128, That sure if possible, but requires alteration of the C++ code (can point out to the place in the code).

sourcesync commented 3 years ago

Yes that would be great- thanks.

yurymalkov commented 3 years ago

In the main search cycle https://github.com/nmslib/hnswlib/blob/master/hnswlib/hnswalg.h#L249 you can replace all isMarkedDeleted(some_internal_id) to a check that some_internal_id does not belong to this node set.