yahoojapan / NGT

Nearest Neighbor Search with Neighborhood Graph and Tree for High-dimensional Data
Apache License 2.0
1.24k stars 114 forks source link

NGT - ONNG Speed Issue #90

Closed ShikharJ closed 3 years ago

ShikharJ commented 3 years ago

Hey everyone,

I was trying to replicate some of the results from the ann-benchmarks on my system, and gave NGT-ONNG a test run on the SIFT1M dataset using the following commands, which I deduced from here and here:

# NGT ONNG Creation
ngt create -i t -p 8 -b 500 -g a -o f -D 2 -d 128 -E 100 -S 0 -e 0.1 -P 0 -B 30 -T 4 anng-index /mnt/SIFT1M/sift_base.tsv

# NGT ONNG Reconstruction
ngt reconstruct-graph -m S -o 10 -i 80 anng-index onng-index

# epsilon: [[0.6, 0.9, 0.995, 0.998, 1.0, 1.01, 1.02, 1.05, 1.07]]
# search_edge: 60
# tree_disabled: True
# NGT ONNG Searching
ngt search -e 0.6 -n 1 -E 60 onng-index /mnt/SIFT1M/sift_query.tsv

Everything works so far, so I doubt there is an installation issue with my system. However, the output latency is still pretty low compared to what I was expecting. Here's a sample output run:

...
Query Time= 0.00785451 (sec), 7.85451 (msec)
Query No.9997
Rank    ID  Distance
1   998427  124.992
Query Time= 3.9201e-05 (sec), 0.039201 (msec)
Query No.9998
Rank    ID  Distance
1   123856  211
Query Time= 0.0171489 (sec), 17.1489 (msec)
Query No.9999
Rank    ID  Distance
1   755328  208.485
Query Time= 0.00219046 (sec), 2.19046 (msec)
Query No.10000
Rank    ID  Distance
1   874344  199.439
Query Time= 0.00814252 (sec), 8.14252 (msec)
Average Query Time= 0.0292421 (sec), 29.2421 (msec), (292.421/10000)

This only results in a throughput of 10000/292.421 = 34 queries per second, which is far from the 10^4 numbers being observed on SIFT1M. Can anyone help me in speeding up the search process please?

masajiro commented 3 years ago

How to specify epsilon for ann benchmarks is a little different from that for the ngt command. Epsilon 0.6 in the line means 1.0 - 0.6 = -0.4 for the ngt command. Please see the line. Since 0.6 is too high for the ngt command, its latency is low.

ShikharJ commented 3 years ago

@masajiro Thanks for the insight! I'm able to see good latency numbers now. Is there a way of printing the recall value for the whole query set directly from the search command?

masajiro commented 3 years ago

The ngt has the eval command that computes the total recall for all of the specified queries. First, make a ground truth. It takes a little long.

ngt search -i s -n 10 -e 0.1 -o e onng-index query.tsv > gt

Make a target result.

ngt search -n 10 -e 0.1 -o e onng-index query.tsv > result

Evaluate the result with the ground truth.

> ngt eval gt result
# # of evaluated resultant objects per query=10
# Factor (Epsilon)      # of Queries    Precision       Time(msec)      # of computations       # of visted nodes
0.1     500     0.9982  0.513087        0       0

You can also specify the range of epsilon.

ngt search -n 10 -e 0:0.1:0.01 -o e onng-index query.tsv > result
> ngt eval gt result
# # of evaluated resultant objects per query=10
# Factor (Epsilon)      # of Queries    Precision       Time(msec)      # of computations       # of visted nodes
0       500     0.6994  0.0724443       0       0
0.01    500     0.7464  0.0431784       0       0
0.02    500     0.7964  0.0527388       0       0
0.03    500     0.8466  0.0605951       0       0
0.04    500     0.8938  0.0775902       0       0
0.05    500     0.9352  0.103419        0       0
0.06    500     0.9696  0.151118        0       0
0.07    500     0.9854  0.219897        0       0
0.08    500     0.9926  0.303011        0       0
0.09    500     0.9958  0.395865        0       0
0.1     500     0.9982  0.513087        0       0
ShikharJ commented 3 years ago

@masajiro Thanks for the help, I really appreciate it. I can see that the plots mentioned in the NGT repo depict ONNG to be the SOTA when it comes to a variety of datasets (even in my experiments on SIFT1M, I can see ONNG performing a lot better than other algorithms like HNSW), while the ann-benchmarks plots seem to depict an entirely different story. In a number of plots, ONNG doesn't even seem to be benchmarked by ann-benchmarks. Any idea why is that the case?

masajiro commented 3 years ago

Thank you for asking about the matter that I am concerned. ONNG can build the indexes that bring almost the best performance by using the setting in ann-benchmarks. On the other hand, it takes long time to build them compared to others. I think that as for the current results of ann-benchmarks, the default timeout 2 hours was used, then some executions of NGT probably were not able to finish. In addition, since ann-benchmarks uses just one core even for building indexes, it makes the build times longer.

ShikharJ commented 3 years ago

@masajiro Thanks for the insight, once again. Please feel free to close this issue as you see fit.

ShikharJ commented 3 years ago

@masajiro I have another question if you don't mind. Does the regular ONNG graphs make use of AVX-512, whenever applicable?

masajiro commented 3 years ago

If you build NGT on the computer with avx512, NGT uses avx512 for some distances. However, the search times with avx512 were almost the same as those with avx2 from my experiments.