cgtuebingen / ggnn

GGNN: State of the Art Graph-based GPU Nearest Neighbor Search
https://uni-tuebingen.de/fakultaeten/mathematisch-naturwissenschaftliche-fakultaet/fachbereiche/informatik/lehrstuehle/computergrafik/lehrstuhl/veroeffentlichungen/ggnn-graph-based-gpu-nearest-neighbor-search/
MIT License
139 stars 22 forks source link

How to get the high-quality knn graph without refinement as mentioned in the paper? #5

Open RayWang96 opened 3 years ago

RayWang96 commented 3 years ago

Hello! Thanks for your excellent work. I'm trying your demo. I am confused that I can't get the high-quality knn graph without refinement as mentioned in the paper. For SIFT1M, I can only get C@10 of 0.5022 when KBuild=62 (Any larger KBuild will lead to an illegal memory error).

It is mentioned in the paper that your algorithm achieves C@10 of 0.987 without refinement. Please kindly tell me that how do I get that result. Besides, I used the getGraph function to export the graph, please let me know if I get something wrong. Thank you again!

LukasRuppert commented 3 years ago

Hi, I've looked up the corresponding configuration. Use K = 24, tau_build = 0.7, refinements = 0, tau_query = 1.08. At the end of demo.cpp add query_function(1.08f); and run the program using -tau 0.7 -refinement_iterations 0.

e.g. ./demo -base_filename sift_base.fvecs -query_filename sift_query.fvecs -groundtruth_filename sift_groundtruth.ivecs -tau 0.7 -refinement_iterations 0 -gpu_id=0 --v=4 -logtostderr

With these parameters, I get the following result:

I20210206 12:57:57.716079 10807 cuda_knn_query_layer.cuh:53] QueryKernel -- KQuery: 10 MAX_ITERATIONS: 400 CACHE_SIZE: 512
ms: 222.001144 for 10000 queries -> 0.022200 ms/query 
I20210206 12:57:57.939872 10807 cuda_knn_ggnn.cuh:137] c@1: 0.9871
I20210206 12:57:57.939880 10807 cuda_knn_ggnn.cuh:138] c@10: 0.98721
I20210206 12:57:57.939883 10807 cuda_knn_ggnn.cuh:140] mean calculated dists: 6310.51

Note that always two versions of the query are being executed per tau_query, so this will be the second last result.

The getGraph function will just give you the neighborhoods in the graph. For our query, some more metadata is necessary (such as the distances to the nearest neighbor per point in the graph and the mean and max over these distances). There is some dedicated serialization code, but we also noticed a bug in the currently published version. It does not properly export the upper layers of the graph.

We will soon release an updated version of the code which also addresses the issues you experienced with KBuild > 62. Part of it is the requirement that S > KBuild/2, but there are quite a few more changes necessary.

RayWang96 commented 3 years ago

Thank you very much for your reply! I mistakenly thought the results in the paper were the quality of the graph. Now I get the same result.