ann-benchmarks? - GitHub issues

Hillier98 commented 3 weeks ago

Hi, do you have any plans to add falconnPP to ANN benchmarks? I'm asking because in your paper you compare against FAISS using some of the datasets from ANN benchmarks, so maybe you have already done it locally. If not, any hints on how to reproduce the plots in the paper? Thank you.

NinhPham commented 1 week ago

I do not plan to add FalconnPP to ANN benchmarks, for two reasons. First, ANN benchmarks runs on a single thread, while Falconn++ needs multiple threads to show its advantage. Second, making FalconnPP run faster on a single thread would likely take extra engineering effort.

You can reproduce the results by varying qProbes over the values suggested in the paper.

For example, on Glove 200:
Index setting: n_tables = 350, n_proj = 256, bucketLimit = 20, iProbes = 3, alpha = 0.01
Query setting: qProbes: {1000 ... 10,000}

alpha = 0.01: we scale the bucket size by alpha.
bucketLimit = 20: however, this is the minimum bucket size after scaling.
These two parameters control the bucket size in dense and sparse regions, which affects the index size and the query time.
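To make the interaction between alpha and bucketLimit concrete, here is a minimal sketch of one reading of the two lines above: a bucket is scaled down by alpha, but never below bucketLimit (and never above what it originally held). The function name `effective_bucket_size`, the `raw_size` argument, and the rounding are assumptions for illustration; this is not taken from the FalconnPP source.

```python
def effective_bucket_size(raw_size: int, alpha: float, bucket_limit: int) -> int:
    """Hypothetical helper: size of a bucket after alpha scaling.

    Assumed rule (not confirmed by the library):
      scaled = alpha * raw_size, floored at bucket_limit,
      and capped at the number of points the bucket really has.
    """
    scaled = round(alpha * raw_size)      # rounding mode is an assumption
    floored = max(bucket_limit, scaled)   # bucketLimit acts as the minimum
    return min(raw_size, floored)         # cannot keep more points than exist


# Dense region: scaling dominates. Sparse region: the minimum dominates.
for raw in (10000, 500, 10):
    print(raw, effective_bucket_size(raw, alpha=0.01, bucket_limit=20))
```

Under this reading, large buckets in dense regions shrink in proportion to alpha, while small buckets in sparse regions are held at bucketLimit, which is why the two values together drive both index size and query time.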

    import FalconnPP

    index = FalconnPP.FalconnPP(n_points, n_features)
    index.setIndexParam(n_tables, n_proj, bucketLimit, alpha, iProbes, n_threads)
    index.build(dataset_t)  # add vectors to the index; must transpose to D x N

    index.set_qProbes(qProbes)  # set multi-probes for querying
    fal_answers = index.query(queries_t, k)
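For anyone reproducing the recall/time curves, a minimal sketch of the qProbes sweep might look like the following. It is library-agnostic: `recall_at_k` and `sweep_qprobes` are hypothetical helper names, the exact neighbours are assumed to come from a brute-force search you run yourself, and the query result is assumed to be one list of neighbour ids per query (the actual return layout of `index.query` is not stated in this thread).

```python
import time
from typing import Callable, Iterable, List, Sequence, Tuple


def recall_at_k(approx: Sequence[Sequence[int]],
                exact: Sequence[Sequence[int]],
                k: int) -> float:
    """Average fraction of the true top-k neighbours found per query."""
    hits = 0
    for found, truth in zip(approx, exact):
        hits += len(set(found[:k]) & set(truth[:k]))
    return hits / (k * len(exact))


def sweep_qprobes(set_probes: Callable[[int], None],
                  run_queries: Callable[[], Sequence[Sequence[int]]],
                  exact: Sequence[Sequence[int]],
                  k: int,
                  probe_values: Iterable[int]) -> List[Tuple[int, float, float]]:
    """Return (qProbes, recall@k, seconds) for each probe value."""
    results = []
    for q in probe_values:
        set_probes(q)                       # e.g. index.set_qProbes
        start = time.perf_counter()
        answers = run_queries()             # e.g. lambda: index.query(queries_t, k)
        elapsed = time.perf_counter() - start
        results.append((q, recall_at_k(answers, exact, k), elapsed))
    return results
```

With the index from the snippet above, this would be called as `sweep_qprobes(index.set_qProbes, lambda: index.query(queries_t, k), exact, k, probe_values)`, where `probe_values` is the list of qProbes settings taken from the paper.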

Hillier98 commented 1 week ago

Hi @NinhPham, got it, thanks for getting back!

NinhPham commented 6 days ago

Anyway, thanks for your request. I might run some experiments against Faiss on some ANNS benchmark data sets (with both single and multiple threads) and post the results here in the future.

NinhPham / FalconnPP

ann-benchmarks? #5