Open godofdream opened 5 years ago
I suppose @yurymalkov can say more about it. :-)
@godofdream, Yes, there is a conceptual difference. ivf-hnsw uses hnswlib as a part and it targets billion scale scenario with low memory footprint (e.g. 1B index in 20 Gb RAM). So, it should not be as fast as hnswlib, but it is much more memory efficient. Also, @dbaranchuk is the right person to ask about it. :)
great if anyone wants to add it!
@yurymalkov : any comments on the quality of the results if the index could be possibly stored in hnswlib as well
@prateekpatelsc AFAIK it is using a fork of hnswlib, which should have a similar performance (although I haven't tested).
@yurymalkov : sorry may be my question was not clear. I meant to ask that hnsw-ivf uses quantizers (from a quick look at code), i was curious if that causes any lower recall quality for example for a dataset of size 1 million (so could have used either hnswlib or hsnw-ivf index approach). I was more curious about how much perf degrades in terms of quality and latency/throughput if i decide to use hnsw-ivf as opposed to hnswlib .
Also i do see https://arxiv.org/pdf/1812.09162.pdf , which tries to further reduce memory footprints and optimize on performance on top of hnsw-ivf indexes, wondering if you had a chance to look at this as well ?
I stumbled over https://github.com/dbaranchuk/ivf-hnsw
Did anyone tried to benchmark their implementation? Is there conzeptual difference between ivf-hnsw and hnsw from hnswlib or nswlib?