koheiw / proxyC

R package for large-scale similarity/distance computation
GNU General Public License v3.0

proxyC incorrectly assumes symmetry #3

Closed rcannood closed 5 years ago

rcannood commented 5 years ago

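@zouter kindly pointed out this bug to me. When x and y have the same number of rows (nrow(x) == nrow(y)), proxyC assumes that the resulting matrix is symmetric. That assumption only holds when x and y are the same matrix; when x differs from y, the distance from row i of x to row j of y is generally not equal to the distance from row j of x to row i of y.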

Here is an example:

set.seed(1)
x <- Matrix::rsparsematrix(3, 10, .2)
y <- Matrix::rsparsematrix(3, 10, .2)
rownames(x) <- letters[1:3]
rownames(y) <- LETTERS[1:3]

dis <- proxyC::dist(x, y, method = "euclidean")
dis2 <- proxy::dist(as.matrix(x), as.matrix(y), method = "euclidean")
> dis
3 x 3 sparse Matrix of class "dsTMatrix"
          A        B         C
a 1.0600472 2.253965 0.8709334
b 2.2539645 3.323209 2.3839725
c 0.8709334 2.383973 1.4868171
> dis2
          A         B         C
a 1.0600472 2.2539645 0.8709334
b 2.1357200 3.3232087 2.3839725
c 0.2000000 2.3211756 1.4868171
> dis2 - dis
3 x 3 sparse Matrix of class "dgCMatrix"
           A           B C
a  .          .          .
b -0.1182445  .          .
c -0.6709334 -0.06279696 .
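
Note that `dis` is returned as a "dsTMatrix", which is a symmetric sparse class in Matrix: its upper triangle matches `dis2`, and the lower triangle is just a mirror of it. To check which result is right without relying on either package, the cross-distances can be computed directly in base R. The following is only a minimal sketch: `cross_euclid` is a hypothetical helper written for illustration, and the small dense matrices are made-up data rather than the sparse matrices from the example.

```r
# Hypothetical helper (not part of proxyC or proxy): Euclidean distance
# between every row of x and every row of y, computed the slow, obvious way.
cross_euclid <- function(x, y) {
  stopifnot(ncol(x) == ncol(y))
  out <- matrix(0, nrow(x), nrow(y),
                dimnames = list(rownames(x), rownames(y)))
  for (i in seq_len(nrow(x))) {
    for (j in seq_len(nrow(y))) {
      out[i, j] <- sqrt(sum((x[i, ] - y[j, ])^2))
    }
  }
  out
}

# Made-up data: same number of rows in x and y, but different contents.
x <- matrix(c(0, 0,
              1, 0,
              0, 2), nrow = 3, byrow = TRUE,
            dimnames = list(letters[1:3], NULL))
y <- matrix(c(3, 0,
              0, 0,
              1, 1), nrow = 3, byrow = TRUE,
            dimnames = list(LETTERS[1:3], NULL))

d_xy <- cross_euclid(x, y)  # x against a different matrix
d_xx <- cross_euclid(x, x)  # x against itself

# isSymmetric() also compares dimnames, so drop them to test values only.
isSymmetric(unname(d_xy))  # FALSE: d(a, B) is 0 but d(b, A) is 2
isSymmetric(unname(d_xx))  # TRUE: symmetry holds only when y is x
```

The square shape of the result says nothing about symmetry, so a check on nrow(x) == nrow(y) alone cannot justify storing only one triangle.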