nmslib / hnswlib

Header-only C++/python library for fast approximate nearest neighbors
https://github.com/nmslib/hnswlib
Apache License 2.0

Concurrency and Locking in hnswlib for Online Indexing #590

Open qscqesze opened 2 weeks ago

qscqesze commented 2 weeks ago

Description

I am using hnswlib for online indexing: several threads continuously consume vectors from a message queue and add them to the index. At the same time, other threads run queries, and I save the index each time a certain number of items has been written.

Question

As a user of hnswlib, do I need to implement locking to handle these operations effectively?

Operations:

Additional Information

Any recommendations on how to manage concurrency and ensure data integrity would be greatly appreciated.

slhuang commented 1 week ago

I was also looking into concurrent reads and writes in hnswlib. From my understanding, locking is performed during writes (addPoint/deletePoint) but not during reads (searchKnn). Can someone confirm whether this is true? And are reads left lock-free deliberately, for performance reasons? @yurymalkov Thanks.
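
Until the locking behavior is confirmed, one conservative option is to put a readers-writer lock around the index in application code, so that queries can run in parallel with each other while adds and saves run exclusively. The sketch below is only an illustration of that idea, not a pattern recommended by the library: `ReadWriteLock` and `GuardedIndex` are hypothetical names, and the wrapped object is assumed to expose the Python bindings' `add_items`, `knn_query`, and `save_index` methods. It also assumes that serializing all adds is acceptable; if hnswlib's own locking already covers some of these cases, this wrapper is stricter than necessary.

```python
import threading
from contextlib import contextmanager


class ReadWriteLock:
    """Minimal readers-writer lock: many readers or exactly one writer."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0           # number of active readers
        self._writer = False        # True while a writer holds the lock
        self._writers_waiting = 0   # writers blocked waiting for the lock

    @contextmanager
    def read(self):
        with self._cond:
            # New readers yield to waiting writers so writes are not starved.
            while self._writer or self._writers_waiting:
                self._cond.wait()
            self._readers += 1
        try:
            yield
        finally:
            with self._cond:
                self._readers -= 1
                self._cond.notify_all()

    @contextmanager
    def write(self):
        with self._cond:
            self._writers_waiting += 1
            try:
                while self._writer or self._readers:
                    self._cond.wait()
            finally:
                self._writers_waiting -= 1
            self._writer = True
        try:
            yield
        finally:
            with self._cond:
                self._writer = False
                self._cond.notify_all()


class GuardedIndex:
    """Hypothetical wrapper: shared access for queries, exclusive for add/save."""

    def __init__(self, index):
        # `index` is assumed to behave like an hnswlib.Index instance.
        self._index = index
        self._lock = ReadWriteLock()

    def add(self, vectors, ids):
        with self._lock.write():
            self._index.add_items(vectors, ids)

    def query(self, vectors, k):
        with self._lock.read():
            return self._index.knn_query(vectors, k=k)

    def save(self, path):
        # Exclusive access so the file is written from a stable index state.
        with self._lock.write():
            self._index.save_index(path)
```

In this arrangement the queue consumers would call `add`, the query threads would call `query`, and whichever thread counts written items would call `save`. The cost is that queries stall while an add or save is in progress.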