Cecca / role-of-dimensionality

MIT License
Add dist-rmse score, addresses #5 #10

Closed Cecca closed 3 years ago

Cecca commented 3 years ago

Compute the root mean squared error between the true k-nn distances and the reported k-nn distances.

The idea is that a low value means the reported neighbors do not include any point that is much farther away than the true k nearest neighbors.
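As a rough illustration of the score described above, here is a minimal sketch. It assumes the error is computed per query over two equal-length lists of distances; the function name `dist_rmse` and its signature are hypothetical and not taken from the PR's code, which may aggregate across queries differently.

```python
import math


def dist_rmse(true_dists, reported_dists):
    """Root mean squared error between true and reported k-NN distances.

    Hypothetical helper: both arguments are sequences holding the k
    distances for a single query, in matching rank order.
    """
    if len(true_dists) != len(reported_dists):
        raise ValueError("both sequences must contain the same number of distances")
    if len(true_dists) == 0:
        raise ValueError("at least one distance is required")
    # Squared gap between each reported distance and the true one at the same rank.
    squared_errors = [(r - t) ** 2 for t, r in zip(true_dists, reported_dists)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# An exact answer scores 0; a reported neighbor that is too far raises the score.
exact = dist_rmse([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
one_far_point = dist_rmse([1.0, 2.0], [1.0, 4.0])
```

With this definition, a single reported point that lies far beyond the true k-th neighbor dominates the score, because the gaps are squared before averaging.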

maumueller commented 3 years ago

I totally forgot that we also have "relative distances" already in ann-benchmarks. Its value is the ratio between the summed distances of the reported NNs and the summed distances of the true NNs.

[Plot: glove-rmse]

[Plot: glove-rel]

I think the relative distance also has the advantage of normalizing by the distances typically found in the dataset, which the dist-rmse score doesn't do as far as I can see. (E.g., for MNIST we usually have distances in the thousands.)
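The normalization point can be checked with a small sketch. It assumes the relative distance is the total reported distance divided by the total true distance, which is my reading of the ann-benchmarks metric rather than a copy of its code; the names `relative_distance` and `dist_rmse` and the toy distances are all hypothetical.

```python
import math


def relative_distance(true_dists, reported_dists):
    """Total reported distance divided by total true distance (assumed definition)."""
    total_true = sum(true_dists)
    if total_true == 0:
        # Guard against dividing by zero when every true distance is 0.
        return math.inf
    return sum(reported_dists) / total_true


def dist_rmse(true_dists, reported_dists):
    """Root mean squared error between true and reported distances."""
    squared_errors = [(r - t) ** 2 for t, r in zip(true_dists, reported_dists)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Toy data: the same answer quality on a small scale and on a 1000x larger scale.
true_small, reported_small = [1.0, 2.0], [1.0, 3.0]
true_large = [1000.0 * d for d in true_small]
reported_large = [1000.0 * d for d in reported_small]

rel_small = relative_distance(true_small, reported_small)
rel_large = relative_distance(true_large, reported_large)
rmse_small = dist_rmse(true_small, reported_small)
rmse_large = dist_rmse(true_large, reported_large)
```

Under these assumptions the relative distance is the same for both scales, while the RMSE grows by the same factor as the distances, so RMSE values are not directly comparable across datasets with different distance ranges.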

Cecca commented 3 years ago

Very nice! Then I think we can just use the relative distance, though they seem to have a very similar behaviour.

Shall we just close this PR?