bwlewis / irlba

Fast truncated singular value decompositions
126 stars 17 forks source link

Inconsistent results with complex matrices #55

Open eliocamp opened 3 years ago

eliocamp commented 3 years ago

I'm using {irlba} to perform svd on complex matrices and found that the results where inconsistent. In my real code, I had an issue that the resulting principal values depended on the order of the rows. Trying to create a reproducible example, I realised that the solution was inconsistent between different calls even if the matrix remained the same

set.seed(42)
N <- 10
M <- 40

m <- matrix(complex(real = rnorm(M*N), imaginary = rnorm(M*N)),
             ncol = N, nrow = M)

first_pc <- function(matrix, package) {
  if (package == "svd") {
    s <- svd(matrix, nu = 2)  
  } else {
    s <- suppressWarnings(irlba::irlba(matrix, nu = 2, rng = runif) )
  }

  s$v[, 1]
}

Plotting the first of several calls to irlba::irlba, the results change considerably!

plot(first_pc(m, "irlba"), type = "l")
lines(first_pc(m, "irlba"))
lines(first_pc(m, "irlba"))
lines(first_pc(m, "irlba"))

Using base::svd, the results are always the same:

plot(-first_pc(m, "svd"), type = "l")
lines(-first_pc(m, "svd"))
lines(-first_pc(m, "svd"))
lines(-first_pc(m, "svd"))

With real values, this doesn't happen:

m <- matrix(rnorm(M*N),
            ncol = N, nrow = M)

plot(first_pc(m, "irlba"), type = "l")
lines(first_pc(m, "irlba"))
lines(first_pc(m, "irlba"))
lines(first_pc(m, "irlba"))

Created on 2020-11-02 by the reprex package (v0.3.0)

bwlewis commented 3 years ago

Hi, sorry about this behavior! The good news is it should be easy to diagnose. I'm investigating...

On Mon, Nov 2, 2020 at 6:02 PM Elio Campitelli notifications@github.com wrote:

I'm using {irlba} to perform svd on complex matrices and found that the results where inconsistent. In my real code, I had an issue that the resulting principal values depended on the order of the rows. Trying to create a reproducible example, I realised that the solution was inconsistent between different calls even if the matrix remained the same

set.seed(42)N <- 10M <- 40 m <- matrix(complex(real = rnorm(MN), imaginary = rnorm(MN)), ncol = N, nrow = M) first_pc <- function(matrix, package) { if (package == "svd") { s <- svd(matrix, nu = 2) } else { s <- suppressWarnings(irlba::irlba(matrix, nu = 2, rng = runif) ) }

s$v[, 1] }

Plotting the first of several calls to irlba::irlba, the results change considerably!

plot(first_pc(m, "irlba"), type = "l") lines(first_pc(m, "irlba")) lines(first_pc(m, "irlba")) lines(first_pc(m, "irlba"))

https://camo.githubusercontent.com/defe184c03bc39ae2d56d90bbcf73456c75c903e/68747470733a2f2f692e696d6775722e636f6d2f743476314552422e706e67

Using base::svd, the results are always the same:

plot(-first_pc(m, "svd"), type = "l") lines(-first_pc(m, "svd")) lines(-first_pc(m, "svd")) lines(-first_pc(m, "svd"))

https://camo.githubusercontent.com/2f44ec162af0e6f8e548d90ac197412870c799b2/68747470733a2f2f692e696d6775722e636f6d2f5959515037575a2e706e67

With real values, this doesn't happen:

m <- matrix(rnorm(M*N), ncol = N, nrow = M)

plot(first_pc(m, "irlba"), type = "l") lines(first_pc(m, "irlba")) lines(first_pc(m, "irlba")) lines(first_pc(m, "irlba"))

https://camo.githubusercontent.com/a140cab5daad655dc490536d1f8998c8b336c8e3/68747470733a2f2f692e696d6775722e636f6d2f326f56357957302e706e67

Created on 2020-11-02 by the reprex package https://reprex.tidyverse.org (v0.3.0)

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/bwlewis/irlba/issues/55, or unsubscribe https://github.com/notifications/unsubscribe-auth/AABCZHXUIV5GJCY2GPTRBODSN4277ANCNFSM4TIBPJCA .

eliocamp commented 2 years ago

Hi, I've been having issues with this for a while and I've just now realised that this is not necessarily a bug. It's just that complex SVDs are undefined up to a rotation angle!

set.seed(42)
N <- 10
M <- 40

m <- matrix(complex(real = rnorm(M*N), imaginary = rnorm(M*N)),
            ncol = N, nrow = M)

# Baseline svd
base <- irlba::irlba(m, 2)

rotate <- function(z, angle = 0) {
   complex(real = cos(angle), imaginary = sin(angle)) * z
}

first_pc <- function(matrix, package) {
   if (package == "svd") {
      s <- svd(matrix, nu = 2)  
   } else {
      s <- suppressWarnings(irlba::irlba(matrix, nu = 2))
   }

   # Rotate v so that is has the maximum corraltion with 
   # the baseline value. 
   angles <- seq(0, 2*pi, length.out = 80)

   cors <- vapply(angles, function(angle) {
      v <- rotate(s$v[, 1], angle)
      cor(Re(v), Re(base$v[, 1]))
   }, FUN.VALUE = 1)

   rotate(s$v[, 1], angles[which.max(cors)])
}

plot(first_pc(m, "irlba"), type = "l")
lines(first_pc(m, "irlba"))
lines(first_pc(m, "irlba"))
lines(first_pc(m, "irlba"))

Created on 2021-12-07 by the reprex package (v2.0.1)

I don't know if there's anything that can make the algorithm more deterministic, in that it returns always the same svd decomposition for the same input (like base::svd() does), but in any case, this is just a property of complex SVD; it's just not unique. ¯_(ツ)_/¯