nmslib / hnswlib

Header-only C++/python library for fast approximate nearest neighbors
https://github.com/nmslib/hnswlib
Apache License 2.0

Question: Implementing disk paging for index larger than memory #544

Open sh22iyer opened 3 months ago

sh22iyer commented 3 months ago

Greetings!

Thanks for authoring the most developer-friendly library!

Is there a design note on how to extend the C++ library to indexes larger than available RAM? Would it be worthwhile to try a design centered on adding an indirection to getDataByInternalId() that fetches the raw vector from memory, from a disk cache, or from the disk itself? The links and other data structure members would remain in main memory; the focus is only on the memory used by raw vectors when the dimension is 1536, 2048, 3072 or more. Thanks in advance!
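
To make the idea concrete, here is a minimal sketch of the kind of indirection meant above. It is not hnswlib code: `PagedVectorStore`, its `get()` method, and the flat file layout (vector `id` stored at byte offset `id * dim * sizeof(float)`) are all hypothetical names and assumptions chosen for illustration. The store keeps a small LRU cache of raw vectors in RAM and reads from disk on a miss; a patched getDataByInternalId() could in principle delegate to something like it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <list>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical helper, NOT part of hnswlib: serves raw float vectors from a
// flat binary file through a small LRU cache held in RAM.
// Assumed file layout: vector `id` starts at byte id * dim * sizeof(float).
// Not thread-safe; concurrent searches would need a lock or per-thread caches.
class PagedVectorStore {
 public:
    PagedVectorStore(const std::string& path, std::size_t dim, std::size_t capacity)
        : file_(path, std::ios::binary), dim_(dim), capacity_(capacity) {
        assert(capacity_ > 0);
        if (!file_) throw std::runtime_error("cannot open " + path);
    }

    // Plays the role of getDataByInternalId() for raw vectors. The returned
    // pointer stays valid only until this entry is evicted by a later call.
    const float* get(std::uint32_t id) {
        auto hit = index_.find(id);
        if (hit != index_.end()) {
            // Cache hit: move the entry to the front (most recently used).
            lru_.splice(lru_.begin(), lru_, hit->second);
            return hit->second->second.data();
        }

        // Cache miss: read the vector from disk.
        ++misses_;
        const std::size_t bytes = dim_ * sizeof(float);
        const std::streamoff offset =
            static_cast<std::streamoff>(id) * static_cast<std::streamoff>(bytes);
        std::vector<float> buf(dim_);
        file_.clear();  // reset any EOF/fail state left by an earlier read
        file_.seekg(offset);
        file_.read(reinterpret_cast<char*>(buf.data()),
                   static_cast<std::streamsize>(bytes));
        if (!file_) throw std::out_of_range("vector id not present in file");

        // Evict the least recently used entry when the cache is full.
        if (lru_.size() == capacity_) {
            index_.erase(lru_.back().first);
            lru_.pop_back();
        }
        lru_.emplace_front(id, std::move(buf));
        index_[id] = lru_.begin();
        return lru_.front().second.data();
    }

    std::size_t misses() const { return misses_; }

 private:
    using Entry = std::pair<std::uint32_t, std::vector<float>>;
    std::ifstream file_;
    std::size_t dim_;
    std::size_t capacity_;
    std::size_t misses_ = 0;
    std::list<Entry> lru_;  // front = most recently used
    std::unordered_map<std::uint32_t, std::list<Entry>::iterator> index_;
};
```

Since a distance computation needs only the query and one stored vector at a time, a pointer that is valid until the next eviction may be enough for the search loop, but I have not verified this against every call site. Graph traversal touches vectors in a fairly random order, so the cache hit rate (and therefore latency) is the main open question with this approach.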

Regards