erikbern / ann-benchmarks

Benchmarks of approximate nearest neighbor libraries in Python
http://ann-benchmarks.com
MIT License
4.84k stars 726 forks

Use angular datasets for inner product? #537

Open wdongyu opened 1 month ago

wdongyu commented 1 month ago

Question

Can I use the "ground truth" part of the angular datasets for inner product search after normalizing the "data" part? If not, are there any datasets for inner product? I found music-100, but it seems to be unavailable now :(

maumueller commented 1 month ago

We are computing similarity by the inner product, so yes, you can do that for the datasets in this repo.
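As a rough illustration of why this works, here is a minimal sketch on synthetic data. It is not code from this repo; the helper names `top_k_angular` and `top_k_inner_product` and the random vectors are made up for the example. Once the data vectors have unit length, the inner product with a query is the cosine similarity scaled by the query's norm, so both rankings return the same neighbors.

```python
import numpy as np

# Synthetic stand-ins for the "data" and "queries" parts of a dataset (assumption).
rng = np.random.default_rng(0)
data = rng.normal(size=(1000, 32))
queries = rng.normal(size=(10, 32))


def top_k_angular(data, queries, k):
    """Indices of the k nearest neighbors by cosine similarity (brute force)."""
    d = data / np.linalg.norm(data, axis=1, keepdims=True)
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    sims = q @ d.T
    return np.argsort(-sims, axis=1)[:, :k]


def top_k_inner_product(data, queries, k):
    """Indices of the k largest inner products (brute force)."""
    sims = queries @ data.T
    return np.argsort(-sims, axis=1)[:, :k]


# Normalize only the data part, as proposed in the question.
normalized = data / np.linalg.norm(data, axis=1, keepdims=True)

angular_gt = top_k_angular(data, queries, k=10)
ip_gt = top_k_inner_product(normalized, queries, k=10)

# Scaling a query by a positive constant does not change the ranking,
# so the query vectors do not need to be normalized.
same = np.array_equal(angular_gt, ip_gt)
print("identical neighbor lists:", same)
```

Without the normalization step the two rankings generally differ, because vectors with a large norm get a higher inner product regardless of their angle to the query.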

The billion-scale ANN benchmarks project contains additional datasets for inner product similarity: https://github.com/harsha-simhadri/big-ann-benchmarks