nmslib / hnswlib

Header-only C++/python library for fast approximate nearest neighbors
https://github.com/nmslib/hnswlib
Apache License 2.0
4.12k stars, 609 forks

How to save incremental changes to the index? #485

Open siddhsql opened 11 months ago

siddhsql commented 11 months ago

Hi Yury, when this index is used in production, users add vectors incrementally and want to save those incremental changes. Suppose the index is already 100 GB (a made-up number) and I save it. Then I add a few vectors, say 1000. I would now like to persist the changes to disk without writing the whole index again from scratch. Does the library handle that? If so, how?

yurymalkov commented 11 months ago

Hi @siddhsql, the library does not handle this. During insertions, the graph links of previously added elements also change. One can save the delta, but the delta is sparse, and storing it might take even more space.
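
To make the point about the delta concrete, here is a minimal toy sketch in plain Python. It is not hnswlib code and does not use its API: it assumes a single-layer brute-force neighbor graph with at most `M` links per node, and the names `insert`, `graph_delta`, and `M` are hypothetical, chosen only for illustration. It shows that adding a few points rewrites the neighbor lists of older points, so the set of changes is not confined to the new entries.

```python
M = 4  # max links per node (toy stand-in for a per-node link limit)

def dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def insert(points, links, new_id, vec):
    """Add one point to a toy single-layer neighbor graph (brute force)."""
    points[new_id] = vec
    # Link the new node to its M nearest existing nodes.
    nearest = sorted((i for i in points if i != new_id),
                     key=lambda i: dist(points[i], vec))[:M]
    links[new_id] = list(nearest)
    # Back-links: each chosen neighbor considers the new node as well,
    # then keeps only its M closest candidates. This rewrites old nodes.
    for n in nearest:
        candidates = set(links[n]) | {new_id}
        links[n] = sorted(candidates,
                          key=lambda i: dist(points[i], points[n]))[:M]

def graph_delta(before, after):
    """Neighbor lists that are new or differ from the saved snapshot."""
    return {i: nbrs for i, nbrs in after.items() if before.get(i) != nbrs}

# Build a base graph: 20 points on a line, spaced 10 apart (ids 0..19).
points, links = {}, {}
for i in range(20):
    insert(points, links, i, (10.0 * i,))

# Pretend this is the state that was saved to disk.
snapshot = {i: list(nbrs) for i, nbrs in links.items()}

# Add three new points between existing ones (ids 20..22).
for new_id, x in [(20, 35.0), (21, 95.0), (22, 155.0)]:
    insert(points, links, new_id, (x,))

delta = graph_delta(snapshot, links)
modified_old = sorted(i for i in delta if i in snapshot)
print("new nodes in delta:     ", sorted(i for i in delta if i not in snapshot))
print("old nodes also modified:", modified_old)
```

In this toy model the old nodes on either side of each new point get rewritten, so the changed entries are spread across the id range rather than sitting at the end. A delta file would therefore need a node id next to every stored neighbor list instead of being a simple append.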