harsha-simhadri / big-ann-benchmarks

Framework for evaluating ANNS algorithms on billion scale datasets.
https://big-ann-benchmarks.com
MIT License

Support for remote servers #293

Open wahajali opened 4 months ago

wahajali commented 4 months ago

As I understand, big-ann-benchmarks is used for benchmarking algorithms. Looking at ann-benchmarks, it also has support for databases such as Postgres (pgvector) and Redis. I have a few questions:

  1. Can something similar be done with big-ann-benchmarks? For example, could we add a Docker container for pgvector and have big-ann-benchmarks test against it?
  2. As a follow-up to [1]: if I want to test a server running remotely, would that be possible with big-ann-benchmarks? For example, suppose I have a database server running remotely that I want to benchmark. Per my understanding, all tests currently run on the same hardware as the benchmark itself.
  3. For each test/track there is a baseline algorithm defined. I'm trying to understand what is intended by the baseline. Is this the index that candidates are supposed to build on top of? For example, for the NeurIPS 23 OOD track the baseline is DiskANN, so why does the Dockerfile for pinecone-ood also include the DiskANN Python library to build the index?

maumueller commented 4 months ago

Hi @wahajali! I'm involved in both ann-benchmarks and this project, and the initial code base here was largely based on ann-benchmarks. It has diverged a bit over the years, but the core architecture is still shared.

  1. Yes, the ann-benchmarks wrappers should be straightforward to translate into wrappers for this project.
  2. You would have to set up a bit of the infrastructure yourself and provide a wrapper module that translates the calls, e.g., for building the index or carrying out a search. A pgvector integration would work here in the same way; see the sketch after this list.
  3. The framework is built around the competitions we organized at NeurIPS 2021 and NeurIPS 2023. For each challenge, we provided baselines to serve as examples and to get concrete performance/quality measurements to compare other solutions against. Some participants in these challenges built upon these baselines; others provided their own solutions.
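To make points 1 and 2 concrete, here is a minimal sketch of what such a wrapper module might look like for a remotely hosted pgvector/Postgres instance. The method names (`fit`, `set_query_arguments`, `query`, `get_results`) mirror the style of the algorithm wrappers in this repository, but the connection details, the `items` table/schema, and the SQL below are illustrative assumptions, not an official integration:

```python
# Hypothetical wrapper that translates benchmark calls into requests against a
# remote pgvector/Postgres server. Host, table name, and parameters are made up
# for illustration; loading the dataset into the remote table is elided.
import numpy as np
import psycopg2


class RemotePgVector:
    def __init__(self, metric, index_params):
        # Connection details would normally come from the algorithm's config entry.
        self.conn = psycopg2.connect(
            host="my-remote-db.example.com", dbname="ann", user="bench", password="secret"
        )
        self.metric = metric
        self.index_params = index_params
        self.res = None

    def fit(self, dataset):
        # "Building the index" becomes DDL executed on the remote server
        # (assumes the dataset has already been inserted into the `items` table).
        with self.conn.cursor() as cur:
            cur.execute(
                "CREATE INDEX IF NOT EXISTS items_embedding_idx "
                "ON items USING hnsw (embedding vector_l2_ops)"
            )
        self.conn.commit()

    def set_query_arguments(self, query_args):
        # Runtime search parameter, e.g. hnsw.ef_search for pgvector.
        self.ef_search = query_args.get("ef_search", 40)

    def query(self, X, k):
        # Each benchmark query is translated into a SQL nearest-neighbor query.
        ids = []
        with self.conn.cursor() as cur:
            cur.execute("SET hnsw.ef_search = %s", (self.ef_search,))
            for v in X:
                vec = "[" + ",".join(str(float(x)) for x in v) + "]"
                cur.execute(
                    "SELECT id FROM items ORDER BY embedding <-> %s::vector LIMIT %s",
                    (vec, k),
                )
                ids.append([row[0] for row in cur.fetchall()])
        self.res = np.array(ids)

    def get_results(self):
        return self.res
```

The key point is that the framework only sees the wrapper's methods, so whether the index is built in-process or on a remote machine is up to the wrapper; you would, however, have to account for the fact that the measured hardware is then no longer the standardized evaluation machine.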

Hope that helps!