rapidsai / cuml

cuML - RAPIDS Machine Learning Library
https://docs.rapids.ai/api/cuml/stable/
Apache License 2.0

[BUG] pairwise_distances calculate the wrong answer #6075

Closed QuotoAstro closed 2 months ago

QuotoAstro commented 2 months ago

Describe the bug

When using cuml.metrics.pairwise_distances to calculate cosine similarity, the program returns the wrong answer.

Steps/Code to reproduce bug

As shown in the doc. [Image]

Expected behavior

For example, the expected result for [2, 3] and [1, 0] is 0.554. It seems every wrong result equals 1 minus the correct result.

cjnolet commented 2 months ago

@QuotoAstro,

pairwise_distances calculates a distance, not a similarity. This is consistent with scikit-learn as well. As you’ve noted, you can turn the cosine distance into a similarity by subtracting each value from 1.
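
To make the relationship concrete, here is a minimal sketch using the [2, 3] and [1, 0] pair from the report. It uses plain NumPy instead of cuML so it runs without a GPU, and the helper names `cosine_similarity` and `cosine_distance` are hypothetical, written only for this illustration. It assumes the usual definition of cosine distance as 1 minus cosine similarity.

```python
import numpy as np


def cosine_similarity(x, y):
    # Hypothetical helper: dot product divided by the product of the norms.
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(x @ y / (np.linalg.norm(x) * np.linalg.norm(y)))


def cosine_distance(x, y):
    # Hypothetical helper: cosine distance defined as 1 - cosine similarity.
    return 1.0 - cosine_similarity(x, y)


sim = cosine_similarity([2, 3], [1, 0])
dist = cosine_distance([2, 3], [1, 0])

# The similarity is the value the report expected; the distance is its complement.
print(round(sim, 4), round(dist, 4))  # 0.5547 0.4453
```

With cuML itself, the same conversion would presumably be `1 - pairwise_distances(X, Y, metric="cosine")`, applied elementwise to the returned matrix.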