rwehrens / BatchCorrMetabolomics

Supplementary material for the paper "Improved batch correction in untargeted MS-based metabolomics" by R. Wehrens, et al. Metabolomics, 2016.
15 stars 9 forks source link

optimization failed to converge with censored regression #4

Open sneumann opened 6 years ago

sneumann commented 6 years ago

Hi Ron & Jos,

we're running into a numeric error with:

> data_qcCorr_cens <- t(apply(data_raw,1,doBC, 
+                      ref.idx = ref.idx,
+                      batch.idx = batch.idx, method = "tobit",
+                      seq.idx = seq.idx, imputeVal= 0))
optimization failed to convergeError in solve.default(hessfun(par)) : 
  system is computationally singular: reciprocal condition number = 1.16232e-16

(see below for the full traceback). least-square and robust regression compute without error, but censored regression fails as above. We unsure how to debug which parts of our input might be the cause, and whether it is really in the data or we passed the wrong inputs. Any hints appreciated.

Yours, Steffen and Pauline

> data_qcCorr_cens <- t(apply(data_raw,1,doBC, 
+                      ref.idx = ref.idx,
+                      batch.idx = batch.idx, method = "tobit",
+                      seq.idx = seq.idx, imputeVal= 0))
optimization failed to convergeError in solve.default(hessfun(par)) : 
  system is computationally singular: reciprocal condition number = 1.16232e-16
> traceback()
8: solve.default(hessfun(par))
7: solve(hessfun(par))
6: crch.fit(x = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 
   1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 
   1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 
   1, 1, 1, 1, 1, 47, 56, 65, 75, 84, 95, 46, 57, 79, 9, 19, 28, 
   38, 47, 56, 65, 75, 84, 94, 35, 46, 57, 79, 9, 19, 28, 38, 54, 
   64, 73, 82, 91, 72, 83, 3, 15, 25, 35, 45, 53, 63, 72, 81, 90, 
   100, 93, 3, 8, 17, 34, 34, 44, 45, 55, 63, 73, 75, 86, 3, 11, 
   19, 26, 36, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 
   1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
   0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
   0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
   0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 
   1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
   0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
   0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
   0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 
   0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
   0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
   0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 
   1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 47, 56, 65, 
   75, 84, 94, 35, 46, 57, 79, 9, 19, 28, 38, 0, 0, 0, 0, 0, 0, 
   0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
   0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
   0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 54, 64, 73, 82, 
   91, 72, 83, 3, 15, 25, 35, 45, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
   0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
   0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
   0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 53, 63, 72, 81, 90, 100, 93, 
   3, 8, 17, 34, 34, 44, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
   0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
   0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
   0, 0, 0, 0, 0, 0, 0, 0, 45, 55, 63, 73, 75, 86, 3, 11, 19, 26, 
   36), y = c(`1` = 47767020.4625704, `2` = 47048430.1618492, `3` = 40174681.3284797, 
   `4` = 49768185.7993261, `5` = 52431812.6162573, `6` = 47598204.1115953, 
   `7` = 40232022.5181411, `8` = 41205575.1400985, `9` = 42474991.0899089, 
   `10` = 43324587.9140365, `11` = 34989161.9624934, `12` = 41144650.4617951, 
   `13` = 46487104.2827993, `14` = 67859790.0254081, `15` = 78260939.4374133, 
   `16` = 81870648.8167032, `17` = 53791872.0641073, `18` = 70952025.58833, 
   `19` = 81852470.3883439, `20` = 79975725.5089894, `21` = 80218732.9563392, 
   `22` = 80126485.503451, `23` = 79218809.7623864, `24` = 68634803.9262845, 
   `25` = 78842632.8077717, `26` = 75752762.4316779, `27` = 58509465.4099729, 
   `28` = 25164092.6989578, `29` = 27496597.3846646, `30` = 28675868.0128452, 
   `31` = 20258031.2343622, `32` = 26329894.7203263, `33` = 27425527.1856043, 
   `34` = 30284281.3805623, `35` = 32164261.9391609, `36` = 29123543.6772444, 
   `37` = 34317272.5614507, `38` = 33044588.0283511, `39` = 17572236.9234543, 
   `40` = 30671674.5017474, `41` = 29900006.652654, `42` = 29287002.7399043, 
   `43` = 30045543.3257243, `44` = 29388676.5817263, `45` = 29264053.1575898, 
   `46` = 32617439.0095218, `47` = 32214584.6102147, `48` = 33099076.0129338, 
   `49` = 32326551.7094354, `50` = 26777104.0497773, `51` = 31379858.6288193, 
   `52` = 31364786.3012565, `53` = 23248095.2654832, `54` = 23091217.9212516, 
   `55` = 22894425.6221877, `56` = 9322719.37892362, `57` = 24661747.4065606, 
   `58` = 25146951.0974302, `59` = 24826694.0235686, `60` = 24336889.2461576, 
   `61` = 24061861.9372801, `62` = 23913228.2959278, `63` = 23235197.4833093
   ), z = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 
   1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 
   1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 
   1, 1, 1), left = 0, right = Inf, link.scale = "log", dist = "gaussian", 
       df = NULL, weights = c(`1` = 1, `2` = 1, `3` = 1, `4` = 1, 
       `5` = 1, `6` = 1, `7` = 1, `8` = 1, `9` = 1, `10` = 1, `11` = 1, 
       `12` = 1, `13` = 1, `14` = 1, `15` = 1, `16` = 1, `17` = 1, 
       `18` = 1, `19` = 1, `20` = 1, `21` = 1, `22` = 1, `23` = 1, 
       `24` = 1, `25` = 1, `26` = 1, `27` = 1, `28` = 1, `29` = 1, 
       `30` = 1, `31` = 1, `32` = 1, `33` = 1, `34` = 1, `35` = 1, 
       `36` = 1, `37` = 1, `38` = 1, `39` = 1, `40` = 1, `41` = 1, 
       `42` = 1, `43` = 1, `44` = 1, `45` = 1, `46` = 1, `47` = 1, 
       `48` = 1, `49` = 1, `50` = 1, `51` = 1, `52` = 1, `53` = 1, 
       `54` = 1, `55` = 1, `56` = 1, `57` = 1, `58` = 1, `59` = 1, 
       `60` = 1, `61` = 1, `62` = 1, `63` = 1), offset = list(location = c(0, 
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
       0, 0, 0, 0, 0), scale = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)), control = list(
           method = "BFGS", maxit = 5000, hessian = NULL, trace = FALSE, 
           start = NULL, dot = "separate", lower = -Inf, upper = Inf, 
           fnscale = 1, reltol = 9.02333028971155e-14), truncated = FALSE, 
       type = "ml")
5: do.call(fit, list(x = X, y = Y, z = Z, left = left, right = right, 
       link.scale = link.scale, dist = dist, df = df, weights = weights, 
       offset = offset, control = control, truncated = truncated, 
       type = type))
4: crch::crch(correctionFormula, data = fitdf, left = imputeVal)
3: FUN(newX[, i], ...)
2: apply(data_raw, 1, doBC, ref.idx = ref.idx, batch.idx = batch.idx, 
       method = "tobit", seq.idx = seq.idx, imputeVal = 0)
1: t(apply(data_raw, 1, doBC, ref.idx = ref.idx, batch.idx = batch.idx, 
       method = "tobit", seq.idx = seq.idx, imputeVal = 0))
rwehrens commented 6 years ago

Hi Steffen,

could you provide me a minimal example? One thing I notice is that you seem to be applying the batch correction row-wise, but that may have to do with the data (if your mass peaks are in the rows then your approach is correct). Second thing is that you seem to try to correct raw counts. I would take logs, definitely. Finally: it may happen that tobit regression fails to converge. Who knows you hit one of these cases. Again, happy to have a closer look.

Cheers, Ron