Closed lauzikaite closed 6 years ago
I have just noticed that this error only occurs if files have no peaks detected with the specified parameters. Below is the output for optimizeXcmsSet()
with 5 parameters to optimize for 15 files. First DoE was performed succesfully, but the second DoE crashed:
starting new DoE with: min_peakwidth: c(3, 7) max_peakwidth: c(7.5, 12.5) ppm: c(22.5, 27.5) mzdiff: c(-0.0065, 0.0045) noise: c(150, 450) snthresh: 10 prefilter: 8 value_of_prefilter: 6000 mzCenterFun: wMean integrate: 2 fitgauss: FALSE verbose.columns: FALSE
1
...
22 Detecting mass traces at 27.5 ppm ... OK Detecting chromatographic peaks in 345 regions of interest ... FAIL: none found! Detecting mass traces at 27.5 ppm ... OK Detecting chromatographic peaks in 374 regions of interest ... FAIL: none found! Detecting mass traces at 27.5 ppm ... OK Detecting chromatographic peaks in 383 regions of interest ... FAIL: none found! Detecting mass traces at 27.5 ppm ... OK Detecting chromatographic peaks in 389 regions of interest ... FAIL: none found! Detecting mass traces at 27.5 ppm ... OK Detecting chromatographic peaks in 376 regions of interest ... FAIL: none found! Detecting mass traces at 27.5 ppm ... OK Detecting chromatographic peaks in 363 regions of interest ... FAIL: none found! Detecting mass traces at 27.5 ppm ... OK Detecting chromatographic peaks in 399 regions of interest ... FAIL: none found! Detecting mass traces at 27.5 ppm ... OK Detecting chromatographic peaks in 390 regions of interest ... FAIL: none found! Detecting mass traces at 27.5 ppm ... OK Detecting chromatographic peaks in 381 regions of interest ... FAIL: none found! Detecting mass traces at 27.5 ppm ... OK Detecting chromatographic peaks in 369 regions of interest ... FAIL: none found! Detecting mass traces at 27.5 ppm ... OK Detecting chromatographic peaks in 389 regions of interest ... FAIL: none found! Detecting mass traces at 27.5 ppm ... OK Detecting chromatographic peaks in 382 regions of interest ... FAIL: none found! Detecting mass traces at 27.5 ppm ... OK Detecting chromatographic peaks in 348 regions of interest ... OK: 1 found. Detecting mass traces at 27.5 ppm ... OK Detecting chromatographic peaks in 434 regions of interest ... FAIL: none found! Detecting mass traces at 27.5 ppm ... OK Detecting chromatographic peaks in 427 regions of interest ... FAIL: none found! Error in peaks(xset)[, c("mz", "rt", "sample", "into", "mzmin", "mzmax", : subscript out of bounds Calls: system.time ... xcmsSetExperimentsCluster -> sapply -> lapply -> FUN -> calcPPS
Hi!
Thank you for reporting your error. Sorry also for the late response, as I was out of office.
Can you also please report your session sessionInfo()
, so that I know the used package versions. Then I'll look into it.
Can you also share the data with me or otherwise reproduce the error with public available data?
Hi!
Thank you for your reply.
Data is located on Dropbox. Same error has been produced with multiple datasets from my study and I uploaded just ten QC samples here.
sessionInfo()
R version 3.4.0 (2017-04-21) Platform: x86_64-pc-linux-gnu (64-bit) Running under: SUSE Linux Enterprise Server 12 SP1
Matrix products: default BLAS: /apps/R/3.4.0/lib64/R/lib/libRblas.so LAPACK: /apps/R/3.4.0/lib64/R/lib/libRlapack.so
locale: [1] LC_CTYPE=en_GB.UTF-8 LC_NUMERIC=C [3] LC_TIME=en_GB.UTF-8 LC_COLLATE=en_GB.UTF-8 [5] LC_MONETARY=en_GB.UTF-8 LC_MESSAGES=en_GB.UTF-8 [7] LC_PAPER=en_GB.UTF-8 LC_NAME=C [9] LC_ADDRESS=C LC_TELEPHONE=C [11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C
attached base packages: [1] parallel stats graphics grDevices utils datasets methods [8] base
other attached packages: [1] IPO_1.2.0 CAMERA_1.32.0 rsm_2.8 [4] xcms_1.52.0 MSnbase_2.2.0 ProtGenerics_1.8.0 [7] mzR_2.10.0 Rcpp_0.12.11 BiocParallel_1.10.1 [10] Biobase_2.36.2 BiocGenerics_0.22.0
loaded via a namespace (and not attached): [1] vsn_3.44.0 splines_3.4.0 foreach_1.4.3 [4] Formula_1.2-1 affy_1.54.0 stats4_3.4.0 [7] latticeExtra_0.6-28 RBGL_1.52.0 impute_1.50.1 [10] backports_1.1.0 lattice_0.20-35 limma_3.32.2 [13] digest_0.6.12 RColorBrewer_1.1-2 checkmate_1.8.2 [16] colorspace_1.3-2 sandwich_2.3-4 htmltools_0.3.6 [19] preprocessCore_1.38.1 Matrix_1.2-10 plyr_1.8.4 [22] MALDIquant_1.16.2 XML_3.98-1.9 zlibbioc_1.22.0 [25] xtable_1.8-2 mvtnorm_1.0-6 scales_0.4.1 [28] RANN_2.5.1 affyio_1.46.0 lsmeans_2.26-3 [31] htmlTable_1.9 tibble_1.3.3 IRanges_2.10.2 [34] ggplot2_2.2.1 TH.data_1.0-8 nnet_7.3-12 [37] lazyeval_0.2.0 MassSpecWavelet_1.42.0 survival_2.41-3 [40] magrittr_1.5 estimability_1.2 doParallel_1.0.10 [43] nlme_3.1-131 MASS_7.3-47 foreign_0.8-69 [46] graph_1.54.0 BiocInstaller_1.26.0 tools_3.4.0 [49] data.table_1.10.4 multcomp_1.4-6 stringr_1.2.0 [52] S4Vectors_0.14.3 munsell_0.4.3 cluster_2.0.6 [55] pcaMethods_1.68.0 compiler_3.4.0 mzID_1.14.0 [58] rlang_0.1.1 grid_3.4.0 iterators_1.0.8 [61] htmlwidgets_0.8 igraph_1.0.1 base64enc_0.1-3 [64] gtable_0.2.0 codetools_0.2-15 multtest_2.32.0 [67] gridExtra_2.2.1 zoo_1.8-0 knitr_1.16 [70] Hmisc_4.0-3 stringi_1.1.5 rpart_4.1-11 [73] acepack_1.4.1 coda_0.19-1
I do use a modified plotContours() funtion to prevent jpeg generation, since the Linux cluster does not support graphical display. It shouldn't have any effect on the overall process though.
source("/.../plotContours_1.R")
tmpfun <- get("plotContours", envir = asNamespace("IPO"))
environment(plotContours_1) <- environment(tmpfun)
attributes(plotContours_1) <- attributes(tmpfun)
assignInNamespace("plotContours", plotContours_1, ns="IPO")
plotContours_1.R looks like that:
plotContours_1 <- function(model, maximum_slice, plot_name = NULL) {
print('Modified plotContours') # To check whether v2 was re-assigned to the package
plots <- c()
for(i in 1:(length(maximum_slice)-1)) {
for(j in (i+1):length(maximum_slice)) {
plots <- c(plots, as.formula(paste("~ x", i, "* x", j, sep="")))
}
}
# determine number of plot rows and column on single device
plot_rows <- round(sqrt(length(plots)))
plot_cols <-
if(plot_rows==1){
length(plots)
} else {
ceiling(sqrt(length(plots)))
}
# save as jpeg, if plot_name is given
if(!is.null(plot_name)) {
plot_name = paste(plot_name, ".jpg", sep="")
# jpeg(plot_name, width=4*plot_cols, height=2*plot_rows+2,
# units="in", res=c(200,200))
} # otherwise plot on device
# op <- par(mfrow = c(plot_rows, plot_cols), oma = c(0,0,0,0))
# contour.lm is called
# contour(model, plots, image = TRUE, at = maximum_slice)
if (!is.null(plot_name)) {
# dev.off()
}
# par(op)
}
Thank you for the data and the sessionInfo()
.
The issue seems to arise in the rare case, when there is only one (or just few) peak(s) found. In this case the xcmsSet
-object does not hold a sample
-argument, which is expected by IPO. This might be a bug in the xcms-package, which I'm not sure yet. I'll have to have a deeper look into it to find out.
So the problem was in the xcms
-package (see sneumann/xcms#220). An update to the newest xcms3 branch from github should resolve the problem (see also sneumann/xcms#220). However I'll add a workaround in IPO
to also work with older xcms
-versions.
Thank you for looking into this, I will try running IPO with newest xcms3 branch.
According to my tests the develop-branch (see 2abc39edae7d7572e5369957a59dccd9715946ab) should now handle this issue correctly also for older xcms
-versions.
Brilliant, the develop-branch works just fine with the old xcms. xcms3 with the IPO Bioconductor-version works too.
Also, my suggestion for a new version is to provide an option to disable jpeg generation, which other users might find useful too :)
Thanks for your help!
Thank you very much for testing. The bug-fixes are on the way to Bioconductor as well.
Regarding the jpeg generateion: Are you aware of setting subdir = NULL
to plot to the graphics device? Or are you asking for an option to completely disable plotting?
Yes, an option to completely disable plotting would be useful for those that are using linux clusters, which do not support graphical display.
Thanks for the suggestion. I guess there's no reason against providing an option to disable plotting.
I'm trying to optimize xcmsSet parameters for a large study by taking subsets of the data (10 - 20 mzML files) as representatives of different LC-MS modes and running them separately.
optimizeXcmsSet()
does work with some of the data subsets just fine. But with some datasets, function stops after a few DoE with an error:I tried running the code with the same mzMLs on two systems:
Code below is for running on x86_64-pc-linux-gnu cluster:
Have you encountered such issue before? I don't understand why this issue occurs, since function works with the same, but fewer files (e.g., 2 files, 5 files or 8 files out of 10 files). Furthermore, these mzML files are processed succesfully with
xcmsSet()
and are not corrupted in any way.Thank you.