koheiw / proxyC

R package for large-scale similarity/distance computation
GNU General Public License v3.0

Make it possible to apply both min_simil and rank #1

Closed by koheiw 5 years ago

koheiw commented 5 years ago

If a vector has only a few non-zero values, its top-n values include zeros. For example, the fifth largest value in c(0, 0, 0.5, 0.9, 0, 0, 0, 0.1) is zero. We should be able to use min_simil = 0.1 together with rank = 5 to exclude zeros.
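
To make the example concrete, here is a minimal base R sketch of the intended behaviour. The helper `top_n()` is hypothetical and not part of proxyC, and treating `min_simil` as an inclusive lower bound (`>=`) is an assumption made for illustration.

```r
x <- c(0, 0, 0.5, 0.9, 0, 0, 0, 0.1)

# Hypothetical helper (not proxyC code): keep the n largest values,
# then optionally drop anything below min_simil (assumed inclusive).
top_n <- function(x, n, min_simil = NULL) {
  top <- head(sort(x, decreasing = TRUE), n)
  if (!is.null(min_simil)) {
    top <- top[top >= min_simil]
  }
  top
}

top_n(x, n = 5)                   # rank only: zeros fill the last two slots
top_n(x, n = 5, min_simil = 0.1)  # rank and threshold: only non-zero values remain
```

With the rank alone, two of the five returned values are zeros; adding the threshold leaves just the three non-zero values.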

koheiw commented 5 years ago

We can also add an argument drop0 that works like Matrix::drop0().

kasperwelbers commented 5 years ago

A drop0 argument could work. The only thing to consider is what to do if values can be negative (or aren't there any similarity measures in use with negative values?)
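
A small base R sketch of why negative values matter. Measures such as correlation can be negative, and then the order of "drop zeros" and "take the top n" changes the result. The vector and both variants are illustrative assumptions, not proxyC's implementation.

```r
x <- c(-0.4, 0, 0.7, 0, -0.1)
n <- 3

# Variant A (assumed): rank first, then drop zeros.
# Zeros outrank the negative values, so the negatives never reach the top n.
top_a <- head(sort(x, decreasing = TRUE), n)
top_a <- top_a[top_a != 0]

# Variant B (assumed): drop zeros first, then rank.
# The negative values now fill the slots that the zeros would have taken.
top_b <- head(sort(x[x != 0], decreasing = TRUE), n)

top_a  # a single positive value
top_b  # the positive value followed by the two negative values
```

Which variant is preferable depends on whether a zero should count as a real similarity score or as "no value", which is the open question here.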