yahoojapan / NGT

Nearest Neighbor Search with Neighborhood Graph and Tree for High-dimensional Data
Apache License 2.0
1.26k stars 115 forks source link

More Detail Description about NGTQG? #110

Closed OctoberChang closed 1 year ago

OctoberChang commented 2 years ago

Thanks again for open-sourcing the NGT library. From your benchmark results [link], I found the performance of NGT-qg very competitive, on par with the state-of-the-art quantization method scann.

I would like to learn more about the details of NGT-qg method. It would be great if you can point me to any technical report about the algorithm design.

In particular, I am interested in the details such as (0) is the main idea of NGT-qg about replacing exact distance computation with approximated distance via Product Quantization? For example, first graph search with approximate distances to get candidates, then re-rank by exact distance computations? (1) how you quantize node feature? Is it product quantization on the residual vectors or just on the original data vectors? (2) to compute distance between a query and a quantized data vector, do you use the IVF-ADC trick with distance lookup tables? (3) do you use 4-bit FastScan technique [link] as used in faiss-IVFPQfs or ScaNN?

SAKUrie commented 2 years ago

I am also very interested in these details ! I found that ngt-QG uses something like PQ quantization, but product quantization is not on the original data but residual data like edges, between two points ,which confused me in understanding ngt-QG. I wonder if you have solved the confusion at that time now.

masajiro commented 1 year ago

Hopefully this will help you to understand QG.