harsha-simhadri / big-ann-benchmarks

Framework for evaluating ANNS algorithms on billion-scale datasets.
https://big-ann-benchmarks.com
MIT License
356 stars, 118 forks

Re-organize runbooks and add wiki replace runbook #312

Closed magdalendobson closed 1 month ago

magdalendobson commented 1 month ago

This PR makes three contributions:

  1. Moves the runbooks and their generator scripts into their own folder inside neurips23, since the number of both keeps growing.
  2. Replaces the two generators for the existing MSMarco and Wiki-Cohere runbooks with one generic function (runbooks/gen_expiration_time_runbook.py).
  3. Uses that same generic function to add five additional expiration-time-based runbooks for Wiki-Cohere, which use a mix of delete and replace operations.
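
As an illustration of what an expiration-time runbook generator might look like, here is a minimal sketch. It is not the code in runbooks/gen_expiration_time_runbook.py, which is not shown in this thread. The function name, its parameters, and the step field names (start, end, tags_start, tags_end, ids_start, ids_end) are all assumptions made for the example.

```python
def gen_expiration_runbook(num_points, batch_size, lifetime,
                           replace_every=0, reserve=0):
    """Build a dict of numbered runbook steps (hypothetical layout).

    Points are inserted in batches. A batch expires `lifetime` batches
    after it was inserted. An expired batch is normally deleted; every
    `replace_every`-th expired batch is instead replaced with fresh
    vectors taken from the last `reserve` points of the dataset.
    Replaced batches are not expired again in this sketch.
    """
    if batch_size <= 0 or lifetime <= 0:
        raise ValueError("batch_size and lifetime must be positive")
    if not 0 <= reserve < num_points:
        raise ValueError("reserve must be in [0, num_points)")

    insert_limit = num_points - reserve   # ids below this are inserted
    next_fresh = insert_limit             # next unused replacement id
    batches = [(s, min(s + batch_size, insert_limit))
               for s in range(0, insert_limit, batch_size)]

    steps = []
    expired = 0
    for i, (start, end) in enumerate(batches):
        steps.append({"operation": "insert", "start": start, "end": end})
        steps.append({"operation": "search"})
        if i < lifetime:
            continue  # nothing is old enough to expire yet

        old_start, old_end = batches[i - lifetime]
        size = old_end - old_start
        expired += 1
        use_replace = (replace_every > 0
                       and expired % replace_every == 0
                       and next_fresh + size <= num_points)
        if use_replace:
            steps.append({"operation": "replace",
                          "tags_start": old_start, "tags_end": old_end,
                          "ids_start": next_fresh,
                          "ids_end": next_fresh + size})
            next_fresh += size
        else:
            # Also the fallback when the reserved range is used up.
            steps.append({"operation": "delete",
                          "start": old_start, "end": old_end})

    # Number the steps from 1.
    return {n: step for n, step in enumerate(steps, start=1)}


if __name__ == "__main__":
    runbook = gen_expiration_runbook(100, 10, lifetime=2,
                                     replace_every=2, reserve=20)
    for number, step in runbook.items():
        print(number, step)
```

Keeping every replacement range inside the dataset size at generation time is a deliberate choice here: a generator that never emits an out-of-range index cannot produce a runbook that a range check would later reject.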

harsha-simhadri commented 1 month ago

Does moving these files break existing scripts or the README?

magdalendobson commented 1 month ago

I updated the neurips23 README so all commands there should work as expected. I also updated data_export.py, which depends on runbook paths. Do you think this is too disruptive a change for people who have scripts they run offline?

harsha-simhadri commented 1 month ago

Could you please publish this draft PR? Also, let's plan to upload the ground truth (GT) for the new runbooks.

harsha-simhadri commented 1 month ago

Q:\big-ann-benchmarks> python benchmark/streaming/compute_gt.py --dataset wikipedia-35M --runbook neurips23/streaming/runbooks/wikipedia-35M_expiration_time_replace_runbook.yaml --gt_cmdline_tool c:\Users\harshasi\source\DiskANN\x64\Release\compute_groundtruth.exe
Traceback (most recent call last):
  File "Q:\big-ann-benchmarks\benchmark\streaming\compute_gt.py", line 149, in <module>
    main()
  File "Q:\big-ann-benchmarks\benchmark\streaming\compute_gt.py", line 111, in main
    max_pts, runbook = load_runbook(args.dataset, ds.nb, args.runbook_file)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "Q:\big-ann-benchmarks\benchmark\streaming\load_runbook.py", line 33, in load_runbook
    raise Exception('End of indices to be replaced out of range in runbook')
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Exception: End of indices to be replaced out of range in runbook
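
The exception message suggests that a replace step refers to indices beyond the bound passed to load_runbook (ds.nb in the traceback). One way to locate the offending step before running compute_gt.py is a small pre-flight check, sketched below. The check in load_runbook.py itself is not shown in this thread, so the helper name find_out_of_range_steps, the step layout (a dict of numbered steps), and the field names are assumptions for illustration.

```python
def find_out_of_range_steps(steps, max_pts):
    """Return (step_number, field, value) for each bound outside [0, max_pts].

    `steps` maps step numbers to dicts with an "operation" key. The range
    field names per operation are assumed, not taken from the repository.
    """
    range_fields = {
        "insert": ("start", "end"),
        "delete": ("start", "end"),
        "replace": ("tags_start", "tags_end", "ids_start", "ids_end"),
    }
    problems = []
    for number in sorted(steps):
        entry = steps[number]
        for field in range_fields.get(entry.get("operation"), ()):
            value = entry.get(field)
            # A missing bound is reported as well, with value None.
            if value is None or value < 0 or value > max_pts:
                problems.append((number, field, value))
    return problems


if __name__ == "__main__":
    example = {
        1: {"operation": "insert", "start": 0, "end": 10},
        2: {"operation": "search"},
        3: {"operation": "replace", "tags_start": 0, "tags_end": 10,
            "ids_start": 95, "ids_end": 105},
    }
    for number, field, value in find_out_of_range_steps(example, max_pts=100):
        print(f"step {number}: {field}={value} is out of range")
```

Reporting the step number and field, rather than raising on the first failure, makes it easier to tell whether the runbook or the dataset size passed in is the source of the mismatch.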