zilliztech / VectorDBBench

A Benchmark Tool for VectorDB
MIT License
519 stars, 133 forks

GIST Ground Truth Data Missing #292

Open wahajali opened 6 months ago

wahajali commented 6 months ago

I want to run the Search Performance Test on the GIST dataset. I created a new test case, since the current workloads don't include GIST in the performance tests. Currently GIST and SIFT are used only in the capacity test.

However, the dataset doesn't contain the ground truth data. Only train.parquet is downloaded; the ground truth file (I believe that would be neighbors.parquet) is not.

alwayslove2013 commented 6 months ago

Right. We are considering opening up more datasets in the next release, as well as supporting users with their own local datasets.

xinhuitian commented 2 months ago

@alwayslove2013 I also need this! Is there any update? Or is there a way to generate neighbors.parquet from the original GIST and SIFT ground truth files? Thanks.
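
One possible way to do the conversion asked about above, as a minimal sketch rather than the project's own method. It assumes the original ground truth is in the TEXMEX .ivecs layout (each record is a little-endian int32 count followed by that many int32 neighbor ids). The output column names `id` and `neighbors_id`, the function names, and the file paths are assumptions for illustration; check them against a dataset that already ships a neighbors.parquet before relying on them. It also assumes the neighbor ids in the .ivecs file are 0-based row positions that match the ids in train.parquet, and that the queries are in the same order as the ground truth rows.

```python
import struct
from pathlib import Path


def read_ivecs(path):
    """Parse a TEXMEX-style .ivecs file into a list of lists of ints.

    Each record is a little-endian int32 length d followed by d int32 values.
    """
    data = Path(path).read_bytes()
    rows, offset = [], 0
    while offset < len(data):
        (dim,) = struct.unpack_from("<i", data, offset)
        offset += 4
        if dim <= 0 or offset + 4 * dim > len(data):
            raise ValueError(f"corrupt record near byte {offset - 4}")
        rows.append(list(struct.unpack_from(f"<{dim}i", data, offset)))
        offset += 4 * dim
    return rows


def write_neighbors_parquet(rows, out_path):
    """Write one row per query: the query id and its ground truth neighbor ids.

    ASSUMPTION: the column names `id` and `neighbors_id` are a guess at the
    expected schema, not something confirmed in this thread.
    """
    import pandas as pd  # to_parquet needs pyarrow or fastparquet installed

    df = pd.DataFrame({"id": range(len(rows)), "neighbors_id": rows})
    df.to_parquet(out_path, index=False)


# Example usage (hypothetical paths):
# rows = read_ivecs("gist/gist_groundtruth.ivecs")
# write_neighbors_parquet(rows, "neighbors.parquet")
```

A search performance test would presumably also need the query vectors in a matching file, which this sketch does not produce.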