pneuvial / c3co

Inferring cancer cell clonality from copy-number data
5 stars 1 forks source link

Negative PVE #19

Open mpierrejean opened 7 years ago

mpierrejean commented 7 years ago

Sometimes algorithm produces negative PVE. Try to understand why? May be a problem of convergence. This is an example which produces a negative PVE.

dataAnnotTP <- acnr::loadCnRegionData(dataSet="GSE11976", tumorFrac=1)
dataAnnotN <- acnr::loadCnRegionData(dataSet="GSE11976", tumorFrac=0)
len <- 500*10
nbClones <- 3
set.seed(88)
bkps <- list(c(100, 250)*10, c(150, 400)*10, c(150, 400)*10)
regions <- list(c("(1,2)", "(0,2)", "(1,2)"),
c("(0,3)", "(0,1)", "(1,1)"), c("(0,2)", "(0,1)", "(1,1)"))
datSubClone <- buildSubclones(len, dataAnnotTP, dataAnnotN,
                               nbClones, bkps, regions)
M <- rSparseWeightMatrix(15, 3, 0.7)
dat <- mixSubclones(subClones=datSubClone, M)
l2=0.1
parameters.grid.2 <- list(lambda=l2, nb.arch=2:6)
resC <- c3co(dat, stat="C1C2", parameters.grid.2)
resC@config$best
pneuvial commented 7 years ago

I can reproduce the problem.

First, note that the problem occurs also for total CN:

res <- c3co(dat, stat="C", parameters.grid.2)
res@config$best
seg <- res@segDat
Y <- t(seg$Y)
fit <- fitC3co(Y, parameters.grid=parameters.grid.2)
fit$config$best  ## PVE already <0
#   nb.feat lambda1     PVE      BIC Likelihood
# 1       2     0.1 -0.0001 258.6185  -97.81898
# 2       3     0.1 -0.0082 258.0129  -98.12180
# 3       4     0.1 -0.0029 258.4085  -97.92398
# 4       5     0.1 -0.0019 258.4824  -97.88705
# 5       6     0.1 -0.0056 258.2068  -98.02483

so we look at a single run of the core function positiveFusedLasso:

lambda <- parameters.grid.2$lambda
nb.arch <- 3
Z0 <- initializeZ(Y, nb.arch=nb.arch)
pfl <- positiveFusedLasso(Y1=Y, Y2=NULL, Z1=Z0$Z, Z2=NULL, lambda1=lambda, verbose=TRUE)
# delta:Inf
# delta:1.8011
# delta:0.5771
# (...)
# delta:0.0346
# delta:0.0458
# ERROR to solve inv(WtW):  Error in solve.default(t(W) %*% W): system is computationally singular: 
# reciprocal condition number = 5.40383e-17

# delta:0
# Stopped after 35 iterations
# delta:0

Next question is: why do we get a non invertible t(W) %*% W after 35 iterations? Perhaps easier to understand with purely simulated data ?

By the way why don't we throw an error in such a situation instead of only showing a message (if verbose is on?)

HenrikBengtsson commented 6 years ago

I'm trying to reproduce this (with the intercept branch), and I can reproduce the negative PVE for the (C1,C2) fit as well as the TCN file:

library("c3co")

dataAnnotTP <- acnr::loadCnRegionData(dataSet = "GSE11976", tumorFrac = 1)
dataAnnotN <- acnr::loadCnRegionData(dataSet = "GSE11976", tumorFrac = 0)
len <- 500 * 10
nbClones <- 3L
set.seed(88)
bkps <- list(c(100, 250) * 10, c(150, 400) * 10, c(150, 400) * 10)
regions <- list(c("(1,2)", "(0,2)", "(1,2)"),
        c("(0,3)", "(0,1)", "(1,1)"),
        c("(0,2)", "(0,1)", "(1,1)"))

datSubClone <- buildSubclones(len, nbClones = nbClones, bkps = bkps, regions = regions, dataAnnotTP = dataAnnotTP, dataAnnotN = dataAnnotN)
M <- rSparseWeightMatrix(15, nb.arch = 3L, sparse.coeff = 0.7)
dat <- mixSubclones(subClones = datSubClone, W = M)
parameters.grid.2 <- list(lambda = 0.1, nb.arch = 2:6)

## Fit (C1,C2)
resC <- c3co(dat, stat = "C1C2", parameters.grid = parameters.grid.2)

## Negative PVE
best <- resC@config$best
print(best)
#   nb.feat lambda1 lambda2           PVE        BIC   logLik       loss
# 1       2     0.1     0.1 -0.1461185539  -42.66732 17.77751 0.03644525
# 2       3     0.1     0.1 -0.0021044825  -67.85428 22.81297 0.03186577
# 3       4     0.1     0.1 -0.0004699732  -87.22176 22.87418 0.03181379
# 4       5     0.1     0.1 -0.0001594374  -95.84510 22.88582 0.03180392
# 5       6     0.1     0.1  0.0000000000 -149.80772 22.89180 0.03179885

## Fit TCN
res <- c3co(dat, stat = "C", parameters.grid = parameters.grid.2)
fit <- res@config$best
print(fit)
seg <- res@segDat
print(seg)

Y <- t(seg$Y)
fit <- fitC3co(Y, parameters.grid = parameters.grid.2)
best <- fit$config$best
print(best)
#   nb.feat lambda1           PVE        BIC   logLik       loss
# 1       2     0.1 -6.700515e-04  -39.73689 22.86668 0.03182016
# 2       3     0.1 -5.290286e-05  -74.25366 22.88982 0.03180053
# 3       4     0.1 -7.447136e-07 -106.63287 22.89178 0.03179887
# 4       5     0.1 -7.115587e-07 -134.69654 22.89178 0.03179887
# 5       6     0.1 -7.443347e-09 -156.28395 22.89180 0.03179885

However, I cannot get beyond the next part because code / arguments have changed:

lambda <- parameters.grid.2$lambda
nb.arch <- 3L

## Argument 'nb.arch' does not exist
Z0 <- initializeZ(Y, nb.arch = nb.arch)

## Arguments changed
pfl <- positiveFusedLasso(Y1=Y, Y2=NULL, Z1=Z0$Z, Z2=NULL, lambda1=lambda, verbose=TRUE)
HenrikBengtsson commented 5 years ago

UPDATE 2019-02-20: The following is how to do it with the current intercept branch:

res <- c3co(dat, stat="C", parameters.grid.2)
res@config$best
##   nb.feat lambda1 lambda2           PVE        BIC   logLik       loss
## 1       2     0.1     0.1 -0.1461185539  -42.66732 17.77751 0.03644525
## 2       3     0.1     0.1 -0.0021044825  -67.85428 22.81297 0.03186577
## 3       4     0.1     0.1 -0.0004699735  -87.22176 22.87418 0.03181379
## 4       5     0.1     0.1 -0.0001594374  -95.84510 22.88582 0.03180392
## 5       6     0.1     0.1  0.0000000000 -149.80772 22.89180 0.03179885

seg <- res@segDat
Y <- t(seg$Y)
fit <- fitC3co(Y, parameters.grid=parameters.grid.2)
fit$config$best  ## PVE already <0
##   nb.feat lambda1           PVE        BIC   logLik       loss
## 1       2     0.1 -6.700515e-04  -39.73689 22.86668 0.03182016
## 2       3     0.1 -1.299169e-06  -74.25173 22.89175 0.03179889
## 3       4     0.1 -4.418704e-06 -106.63301 22.89164 0.03179899
## 4       5     0.1 -1.390449e-05 -134.69703 22.89128 0.03179929
## 5       6     0.1 -4.536239e-08 -156.28396 22.89180 0.03179885

lambda <- parameters.grid.2$lambda
K <- 3L
Z0t <- initializeZt(Y, K = K)
pfl <- positiveFusedLasso(Y=list(Y1=Y), Zt=list(Z1=Z0t$Z), lambda=lambda, verbose=TRUE)

## delta:Inf
## delta:1.7986
## delta:2.1423
## delta:1.0154
## delta:1.2691
## delta:6e-04
## Stopped after 7 iterations
## delta:6e-04

In my case, the algorithm converges.

HenrikBengtsson commented 5 years ago

PVE is calculated in modelFitStatistics(Y, Yhat, What, Zhatt):

https://github.com/pneuvial/c3co/blob/4b93efcb466093d7181683717decf4cb30feae68/R/modelFitStats.R#L26-L29

HenrikBengtsson commented 5 years ago

Now, this is interesting. It looks like @jchiquet's fix of the lambda penality (Issue #57) "solved" the negative PVE scores. The BIC, logLik, and loss scores are also significantly different now. Looking at the git diff, it's not clear to me why this solve this. Another way to put it, it's still not clear to me how it was possible to obtain negative PVEs - could it be that we calculated on the transposed matrix? (don't think so though).

Q. What does "best" mean in resC@config$best? The presented fives scores are for nb.arch = 2:6), correct? There is no "best" selection among those, correct? In other words, can you confirm there is nothing strange in observing (PVE, BIC, logLik, loss) scores getting worse in the last step (nb.feat = 6).

library("c3co")

dataAnnotTP <- acnr::loadCnRegionData(dataSet = "GSE11976", tumorFrac = 1)
dataAnnotN <- acnr::loadCnRegionData(dataSet = "GSE11976", tumorFrac = 0)
len <- 500 * 10
nbClones <- 3L
set.seed(88)
bkps <- list(c(100, 250) * 10, c(150, 400) * 10, c(150, 400) * 10)
regions <- list(c("(1,2)", "(0,2)", "(1,2)"),
        c("(0,3)", "(0,1)", "(1,1)"),
        c("(0,2)", "(0,1)", "(1,1)"))

datSubClone <- buildSubclones(len, nbClones = nbClones, bkps = bkps, regions = regions, dataAnnotTP = dataAnnotTP, dataAnnotN = dataAnnotN)
M <- rSparseWeightMatrix(15, nb.arch = 3L, sparse.coeff = 0.7)
dat <- mixSubclones(subClones = datSubClone, W = M)
parameters.grid.2 <- list(lambda = 0.1, nb.arch = 2:6)

## Fit (C1,C2)
resC <- c3co(dat, stat = "C1C2", parameters.grid = parameters.grid.2)

## Negative PVE
best <- resC@config$best
print(best)

Negative PVE:s

## remotes::install_github("pneuvial/c3co@a65b3a38")  ## 2019-02-20 @ 12:39

resC <- c3co(dat, stat = "C1C2", parameters.grid = parameters.grid.2, verbose = TRUE)
## chr 1
## Joint segmentation
## fitC3co() ...
##  - Only one regularization parameter is provided. Using the same value for lambda[1] and lambda[2]: 0.1
##  - Iteration #1 (2 latent features) of 5 ...
##    + Parameter configuration: (lambda1, lambda2)
##    + configuration #1 (lambda1 = 0.1, lambda2 = 0.1) of 1: fused lasso, BIC = -42.6673*
##  - Iteration #1 (2 latent features) of 5 ... DONE
##  - Iteration #2 (3 latent features) of 5 ...
##    + Parameter configuration: (lambda1, lambda2)
##    + configuration #1 (lambda1 = 0.1, lambda2 = 0.1) of 1: fused lasso, BIC = -67.8543*
##  - Iteration #2 (3 latent features) of 5 ... DONE
##  - Iteration #3 (4 latent features) of 5 ...
##    + Parameter configuration: (lambda1, lambda2)
##    + configuration #1 (lambda1 = 0.1, lambda2 = 0.1) of 1: fused lasso, BIC = -87.2218*
##  - Iteration #3 (4 latent features) of 5 ... DONE
##  - Iteration #4 (5 latent features) of 5 ...
##    + Parameter configuration: (lambda1, lambda2)
##    + configuration #1 (lambda1 = 0.1, lambda2 = 0.1) of 1: fused lasso, BIC = -95.8451*
##  - Iteration #4 (5 latent features) of 5 ... DONE
##  - Iteration #5 (6 latent features) of 5 ...
##    + Parameter configuration: (lambda1, lambda2)
##    + configuration #1 (lambda1 = 0.1, lambda2 = 0.1) of 1: fused lasso, BIC = -149.808*
##  - Iteration #5 (6 latent features) of 5 ... DONE
## fitC3co() ... DONE

print(best)
##   nb.feat lambda1 lambda2           PVE        BIC   logLik       loss
## 1       2     0.1     0.1 -0.1461185539  -42.66732 17.77751 0.03644525
## 2       3     0.1     0.1 -0.0021044825  -67.85428 22.81297 0.03186577
## 3       4     0.1     0.1 -0.0004699735  -87.22176 22.87418 0.03181379
## 4       5     0.1     0.1 -0.0001594374  -95.84510 22.88582 0.03180392
## 5       6     0.1     0.1  0.0000000000 -149.80772 22.89180 0.03179885

No negative PVE:s

## remotes::install_github("pneuvial/c3co@410178f4")  ## 2019-02-20 @ 12:40

res <- c3co(dat, stat="C", parameters.grid.2, verbose = TRUE)
## chr 1
## Joint segmentation
## fitC3co() ...
##  - Only one regularization parameter is provided. Using the same value for lambda[1] and lambda[2]: 0.1
##  - Iteration #1 (2 latent features) of 5 ...
##    + Parameter configuration: (lambda1, lambda2)
##    + configuration #1 (lambda1 = 0.1, lambda2 = 0.1) of 1: fused lasso, BIC = 15.9957*
##  - Iteration #1 (2 latent features) of 5 ... DONE
##  - Iteration #2 (3 latent features) of 5 ...
##    + Parameter configuration: (lambda1, lambda2)
##    + configuration #1 (lambda1 = 0.1, lambda2 = 0.1) of 1: fused lasso, BIC = 16.3023*
##  - Iteration #2 (3 latent features) of 5 ... DONE
##  - Iteration #3 (4 latent features) of 5 ...
##    + Parameter configuration: (lambda1, lambda2)
##    + configuration #1 (lambda1 = 0.1, lambda2 = 0.1) of 1: fused lasso, BIC = 96.1693*
##  - Iteration #3 (4 latent features) of 5 ... DONE
##  - Iteration #4 (5 latent features) of 5 ...
##    + Parameter configuration: (lambda1, lambda2)
##    + configuration #1 (lambda1 = 0.1, lambda2 = 0.1) of 1: fused lasso, BIC = 90.1516*
##  - Iteration #4 (5 latent features) of 5 ... DONE
##  - Iteration #5 (6 latent features) of 5 ...
##    + Parameter configuration: (lambda1, lambda2)
##    + configuration #1 (lambda1 = 0.1, lambda2 = 0.1) of 1: fused lassoW is rank deficient. Removing a latent feature
## W is rank deficient. Removing a latent feature
## , BIC = 7.51021*
##  - Iteration #5 (6 latent features) of 5 ... DONE
## fitC3co() ... DONE

print(best)
##   nb.feat lambda1 lambda2       PVE      BIC    logLik         loss
## 1       2     0.1     0.1 0.8302391 15.99566  89.39296 0.0053982021
## 2       3     0.1     0.1 0.8874668 16.30232 104.81083 0.0035784250
## 3       4     0.1     0.1 0.9924783 96.16931 206.26526 0.0002391824
## 4       5     0.1     0.1 0.9937482 90.15165 213.20006 0.0001987999
## 5       6     0.1     0.1 0.9049181  7.51021 111.12992 0.0030234953

What was updated in the code

These two commits are immediately after each other:

$ git log
[...]
$ commit 410178f45274b836ddc8f042d81acfed680fdcd1
Merge: a65b3a3 027210e
Author: Pierre Neuvial <pneuvial@gmail.com>
Date:   Wed Feb 20 12:40:37 2019 +0100

    Merge branch 'intercept' of https://github.com/pneuvial/c3co into intercept

commit a65b3a38faa77104a5500a0e5e36021e4fc392b4
Author: Pierre Neuvial <pneuvial@gmail.com>
Date:   Wed Feb 20 12:39:32 2019 +0100

    Add first tests for positiveFusedLasso().
[...]
$ git diff a65b3a38 410178f4
diff --git a/R/get.Zt.R b/R/get.Zt.R
index b4d8bd5..cb08b00 100644
--- a/R/get.Zt.R
+++ b/R/get.Zt.R
@@ -38,6 +38,8 @@
 #' @importFrom matrixStats colCumsums
 #' @importFrom methods as
 get.Zt <- function(Y, lambda, W, WtWm1) {
+  
+  n <- nrow(Y)
   J <- ncol(Y)
   K <- ncol(W)

@@ -51,18 +53,14 @@ get.Zt <- function(Y, lambda, W, WtWm1) {
   W.WtWm1 <- W %*% WtWm1
   Pw <- W.WtWm1 %*% t(W)

-  ## FIXME: include the penalty factor in get.Zt
-  ## FIXME to be consistent with what is written,
-  ## FIXME lambda should be lambda / (2*n*J) when calling glmnet
-  
   ## Lasso regression 
   X.tilde <- kronecker(scale(X1, center = TRUE, scale = FALSE), as(W, "sparseMatrix"))
   ## Why as.numeric() below? Is it of a different type? /HB 2018-02-27
   y.tilde <- as.numeric(sweep(Y, MARGIN = 1L, STATS = rowMeans(Pw %*% Y), FUN = `-`)) 
-  z.tilde <- glmnet(X.tilde, y.tilde, lambda = lambda, intercept = FALSE, standardize = FALSE)$beta
+  z.tilde <- glmnet(X.tilde, y.tilde, lambda = lambda / (2*n*J), intercept = FALSE, standardize = FALSE)$beta

   ## Go back to Z 
-  Z <- matrix(z.tilde, nrow=J-1L, ncol=K, byrow=TRUE)
+  Z <- matrix(z.tilde, nrow = J-1L, ncol = K, byrow = TRUE)
   X1.Z <- rbind(0, colCumsums(Z))
HenrikBengtsson commented 5 years ago

Remaining task: