cockroachdb / cockroach

CockroachDB — the cloud native, distributed SQL database designed for high availability, effortless scale, and control over data placement.
https://www.cockroachlabs.com

vecbench: add vector benchmarking CLI #135844

Closed andy-kimball closed 6 hours ago

andy-kimball commented 1 day ago

Add a CLI that makes it easy to benchmark the quality and performance of vector indexing. The CLI downloads test datasets from a GCP bucket and then builds and searches the index. It outputs results in a spreadsheet-friendly format like this:

  unsplash-512-euclidean
  1000000 train vectors, 1000 test vectors, 512 dimensions, 16/128 min/max partitions, base beam size 8

  beam  recall  leaf   all    full   partns  qps
  1     22.10%  91     247    23.61  4.00    1357.12
  2     31.35%  182    339    27.65  5.00    1867.50
  4     47.86%  362    610    31.96  8.00    1783.30
  8     67.96%  727    1220   35.70  15.00   1729.00
  16    82.00%  1450   2302   40.41  27.00   1629.65
  32    90.70%  2894   4462   44.17  51.00   1301.63
  64    95.61%  5783   8772   47.30  99.00   791.74
  128   98.32%  11559  17374  49.60  195.00  535.10
  256   99.47%  23099  34391  50.83  387.00  298.24
  512   99.83%  46150  57517  51.28  644.00  189.69
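
For readers unfamiliar with the recall column, here is a minimal Go sketch of how recall is commonly measured for an approximate vector index: find the exact nearest neighbors by brute force, then count how many of them the index search returned. This is not the code in this PR. The function names (`squaredL2`, `exactNearest`, `recall`) and the toy data are hypothetical, and the sketch assumes Euclidean distance, as the dataset name suggests.

```go
package main

import (
	"fmt"
	"sort"
)

// squaredL2 returns the squared Euclidean distance between two
// equal-length vectors. (Hypothetical helper, not from vecbench.)
func squaredL2(a, b []float32) float32 {
	var sum float32
	for i := range a {
		d := a[i] - b[i]
		sum += d * d
	}
	return sum
}

// exactNearest returns the indexes of the k train vectors closest to
// query, found by brute force. This serves as the ground truth.
func exactNearest(train [][]float32, query []float32, k int) []int {
	idx := make([]int, len(train))
	for i := range idx {
		idx[i] = i
	}
	sort.Slice(idx, func(i, j int) bool {
		return squaredL2(train[idx[i]], query) < squaredL2(train[idx[j]], query)
	})
	if k > len(idx) {
		k = len(idx)
	}
	return idx[:k]
}

// recall returns the fraction of ground-truth neighbors that also
// appear in the results returned by the (approximate) index search.
func recall(truth, found []int) float64 {
	if len(truth) == 0 {
		return 0
	}
	seen := make(map[int]struct{}, len(found))
	for _, id := range found {
		seen[id] = struct{}{}
	}
	hits := 0
	for _, id := range truth {
		if _, ok := seen[id]; ok {
			hits++
		}
	}
	return float64(hits) / float64(len(truth))
}

func main() {
	// Toy data standing in for a real train/test dataset.
	train := [][]float32{{0, 0}, {1, 0}, {0, 1}, {5, 5}}
	query := []float32{0.9, 0.1}
	truth := exactNearest(train, query, 2)

	// Stand-in for the result of an approximate index search.
	found := []int{1, 3}
	fmt.Printf("truth=%v recall=%.2f%%\n", truth, 100*recall(truth, found))
}
```

Averaging this value over all test vectors would give a single recall figure per beam size, which is one plausible reading of the recall column above.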

Epic: CRDB-42943

Release note: None

andy-kimball commented 6 hours ago

bors r=drewkimball

craig[bot] commented 6 hours ago

Build succeeded: