bwlewis / irlba

Fast truncated singular value decompositions
127 stars 17 forks source link

difficulty finding invariant subspace #26

Closed bwlewis closed 7 years ago

bwlewis commented 7 years ago

The issue is that the tolerance for detecting an invariant subspace, internally set to 2 * eps, is too small. In some cases, irlba moves on past the subspace and blows up. Here is a rank 10 example:

library(irlba)

set.seed(1)
r = 10
n = 1000
X1 = matrix(rnorm(n * r), n) 
X2 = matrix(rnorm(n * r), n)
X = X1 %*% t(X2)

irlba(X, 20, fastpath=FALSE, verbose=TRUE, svtol=Inf)$d
#Working dimension size 27
#iter= 1, mprod= 54, sv[20]=9.69e-10, %change=, k=23
#iter= 2, mprod= 62, sv[20]=7.59e-07, %change=9.99e-01, k=23
# [1] 5.274810e+06 3.966244e+04 1.140974e+03 1.059975e+03 1.044177e+03
# [6] 1.023285e+03 9.910259e+02 9.744555e+02 9.668851e+02 9.537244e+02
#[11] 9.332256e+02 9.053668e+02 3.587707e+02 3.964667e+00 5.473509e-02
#[16] 4.585135e-03 9.709451e-04 2.294908e-05 2.173387e-05 7.588123e-07

# even worse!
irlba(X, 20, work=35, fastpath=FALSE, verbose=TRUE, svtol=Inf)$d
#Working dimension size 35
#iter= 1, mprod= 70, sv[20]=1.53e+11, %change=, k=23
#iter= 2, mprod= 94, sv[20]=3.20e+12, %change=9.52e-01, k=23
# [1] 3.209668e+28 7.737559e+25 2.065238e+23 6.129716e+20 2.035176e+18
# [6] 6.530794e+17 7.609339e+15 1.343803e+15 3.212394e+13 3.199367e+12
#[11] 3.199367e+12 3.199367e+12 3.199367e+12 3.199367e+12 3.199367e+12
#[16] 3.199367e+12 3.199367e+12 3.199367e+12 3.199367e+12 3.199367e+12

head(svd(X)$d, 5)
#[1] 1140.9737 1059.9747 1044.1772 1023.2846  991.0259

# This is OK:
irlba(X, 5)$d
#[1] 1140.9737 1059.9747 1044.1772 1023.2846  991.0259

# And this is OK:
irlba(X, 20, work=25, fastpath=FALSE, verbose=TRUE, svtol=Inf)$d
#Working dimension size 25
#iter= 1, mprod= 50, sv[20]=1.16e-11, %change=, k=23
#iter= 2, mprod= 54, sv[20]=1.15e-10, %change=9.00e-01, k=23
# [1] 1.140974e+03 1.059975e+03 1.044177e+03 1.023285e+03 9.910259e+02
# [6] 9.744555e+02 9.668851e+02 9.537244e+02 9.332256e+02 9.053668e+02
#[11] 8.502320e+01 9.502886e-01 1.334770e-02 2.414586e-04 5.837561e-06
#[16] 1.982668e-07 2.297901e-08 1.013727e-08 8.461105e-10 1.154962e-10

One possible solution is to adjust the tolerance for detection of invariant subspace.

bwlewis commented 7 years ago

fixed in commit 6904f11a62b36125f89ef415f5861d104f84457d