bwlewis / irlba

Fast truncated singular value decompositions
127 stars 17 forks source link

Specifying both center= and scale= does not yield consistent results #42

Closed LTLA closed 5 years ago

LTLA commented 6 years ago

Consider the following code snippet:

set.seed(1234)
a <- matrix(rnorm(10000), ncol=20)
center <- runif(ncol(a))
scale <- runif(ncol(a))

library(irlba)
set.seed(100)
out <- irlba(a, center=center, scale=scale, k=5, fastpath=FALSE)
out$d
## [1] 489.44223 316.87816 124.50882 121.74289  97.49367
set.seed(100)
ref <- irlba(t((t(a) - center)/scale), k=5, fastpath=FALSE)
ref$d
## [1] 531.5186 349.5748 126.9928 124.3211 106.4896

Interestingly, specifying only one of center and scale yields consistent results, which suggests that the above problem is caused by an interaction between these two. Some testing indicates that lines 542-543 are the offenders. In particular, it seems like line 543 should have dv / scale instead if scale is defined. This seems to fix the problem for fastpath=FALSE, though note that the same issue is present in the C code.

Session information ``` R version 3.5.0 Patched (2018-04-30 r74679) Platform: x86_64-pc-linux-gnu (64-bit) Running under: Ubuntu 16.04.5 LTS Matrix products: default BLAS: /home/cri.camres.org/lun01/Software/R/R-3-5-branch/lib/libRblas.so LAPACK: /home/cri.camres.org/lun01/Software/R/R-3-5-branch/lib/libRlapack.so locale: [1] LC_CTYPE=en_GB.UTF-8 LC_NUMERIC=C [3] LC_TIME=en_GB.UTF-8 LC_COLLATE=en_GB.UTF-8 [5] LC_MONETARY=en_GB.UTF-8 LC_MESSAGES=en_GB.UTF-8 [7] LC_PAPER=en_GB.UTF-8 LC_NAME=C [9] LC_ADDRESS=C LC_TELEPHONE=C [11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] irlba_2.3.3 Matrix_1.2-14 loaded via a namespace (and not attached): [1] compiler_3.5.0 grid_3.5.0 lattice_0.20-35 ```
bwlewis commented 5 years ago

I apologize for the very long latency, reviewing now...

bwlewis commented 5 years ago

re-opening until fastpath=TRUE is fixed.

bwlewis commented 5 years ago

OK, your fix is now merged into the C implementation too, and a new test covering this problem was added. Thanks again for your help and fix!

This is an imporatant bug so I'll try to get a new CRAN release ready as soon as possible. Please let me know if you find anything wrong with the fix...

Best,

Bryan