Closed lee-t closed 6 years ago
The error message points towards a memory problem. Could you please provide the output of sessionInfo() on Windows and Linux so we can check which versions of R, xcms, mzR and MSnbase you are using? Also, how much memory do you have available on Windows and on Linux?
Windows: 192GB physical memory
> memory.size()
[1] 3414.02
> gc()
            used   (Mb) gc trigger   (Mb)  max used   (Mb)
Ncells   6610205  353.1   14442815  771.4  14442815  771.4
Vcells 180772895 1379.2  520945268 3974.5 565049686 4311.0
> sessionInfo()
R version 3.4.3 (2017-11-30)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)
Matrix products: default
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] parallel tools stats graphics grDevices utils datasets methods base
other attached packages:
[1] snow_0.4-2 rsm_2.9 CAMERA_1.34.0 xcms_3.0.0
[5] MSnbase_2.4.2 ProtGenerics_1.10.0 mzR_2.12.0 Rcpp_0.12.15
[9] BiocParallel_1.12.0 Biobase_2.38.0 BiocGenerics_0.24.0
loaded via a namespace (and not attached):
[1] lattice_0.20-35 digest_0.6.15 foreach_1.4.4 plyr_1.8.4
[5] backports_1.1.2 acepack_1.4.1 mzID_1.16.0 stats4_3.4.3
[9] ggplot2_2.2.1 BiocInstaller_1.28.0 pillar_1.2.1 zlibbioc_1.24.0
[13] rlang_0.2.0 lazyeval_0.2.1 rstudioapi_0.7 data.table_1.10.4-3
[17] S4Vectors_0.16.0 rpart_4.1-13 Matrix_1.2-12 checkmate_1.8.5
[21] preprocessCore_1.40.0 splines_3.4.3 stringr_1.3.0 foreign_0.8-69
[25] htmlwidgets_1.0 igraph_1.2.1 munsell_0.4.3 compiler_3.4.3
[29] pkgconfig_2.0.1 base64enc_0.1-3 multtest_2.34.0 pcaMethods_1.70.0
[33] htmltools_0.3.6 nnet_7.3-12 tibble_1.4.2 gridExtra_2.3
[37] htmlTable_1.11.2 RANN_2.5.1 Hmisc_4.1-1 IRanges_2.12.0
[41] codetools_0.2-15 XML_3.98-1.10 MASS_7.3-49 grid_3.4.3
[45] MassSpecWavelet_1.44.0 RBGL_1.54.0 gtable_0.2.0 affy_1.56.0
[49] magrittr_1.5 scales_0.5.0 graph_1.56.0 stringi_1.1.6
[53] impute_1.52.0 affyio_1.48.0 doParallel_1.0.11 limma_3.34.9
[57] latticeExtra_0.6-28 Formula_1.2-2 RColorBrewer_1.1-2 iterators_1.0.9
[61] LOBSTAHS_1.4.0 survival_2.41-3 colorspace_1.3-2 cluster_2.0.6
[65] vsn_3.46.0 MALDIquant_1.17 knitr_1.20
Linux: 64GB physical memory
> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  4097561 218.9    6861544 366.5  6861544 366.5
Vcells 14170591 108.2   32215028 245.8 54292152 414.3
> sessionInfo()
R version 3.4.3 (2017-11-30)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.4 LTS
Matrix products: default
BLAS: /usr/lib/libblas/libblas.so.3.6.0
LAPACK: /usr/lib/lapack/liblapack.so.3.6.0
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8
[6] LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] parallel tools stats graphics grDevices utils datasets methods base
other attached packages:
[1] shiny_1.0.5 snow_0.4-2 rsm_2.9 CAMERA_1.34.0 xcms_3.0.0 MSnbase_2.4.1
[7] ProtGenerics_1.10.0 mzR_2.12.0 Rcpp_0.12.14 BiocParallel_1.12.0 Biobase_2.38.0 BiocGenerics_0.24.0
loaded via a namespace (and not attached):
[1] vsn_3.46.0 tidyr_0.7.2 splines_3.4.3 foreach_1.4.4 Formula_1.2-2
[6] assertthat_0.2.0 affy_1.56.0 stats4_3.4.3 latticeExtra_0.6-28 RBGL_1.54.0
[11] impute_1.52.0 backports_1.1.2 lattice_0.20-35 glue_1.2.0 limma_3.34.4
[16] digest_0.6.13 RColorBrewer_1.1-2 checkmate_1.8.5 colorspace_1.3-2 httpuv_1.3.5
[21] htmltools_0.3.6 preprocessCore_1.40.0 Matrix_1.2-11 plyr_1.8.4 MALDIquant_1.17
[26] XML_3.98-1.9 pkgconfig_2.0.1 zlibbioc_1.24.0 xtable_1.8-2 purrr_0.2.4
[31] scales_0.5.0 RANN_2.5.1 affyio_1.48.0 htmlTable_1.11.0 tibble_1.3.4
[36] IRanges_2.12.0 ggplot2_2.2.1 nnet_7.3-12 lazyeval_0.2.1 MassSpecWavelet_1.44.0
[41] mime_0.5 survival_2.41-3 magrittr_1.5 doParallel_1.0.11 MASS_7.3-49
[46] foreign_0.8-69 graph_1.56.0 BiocInstaller_1.28.0 data.table_1.10.4-3 stringr_1.2.0
[51] S4Vectors_0.16.0 munsell_0.4.3 cluster_2.0.6 bindrcpp_0.2 pcaMethods_1.70.0
[56] compiler_3.4.3 mzID_1.16.0 rlang_0.1.4 grid_3.4.3 iterators_1.0.9
[61] rstudioapi_0.7 htmlwidgets_0.9 igraph_1.1.2 base64enc_0.1-3 gtable_0.2.0
[66] codetools_0.2-15 multtest_2.34.0 R6_2.2.2 gridExtra_2.3 knitr_1.17
[71] dplyr_0.7.4 bindr_0.1 Hmisc_4.1-0 stringi_1.1.6 rpart_4.1-13
[76] acepack_1.4.1
Could you please update the xcms and MSnbase packages on both Linux and Windows? We recently fixed a memory problem in xcms, so it could well be that this also fixes your problem.
Hi, I have the exact same error. I am running XCMS on mzML files centroided using ProteoWizard's msconvert.
This is my code:
raw_data <- readMSData(files = our_files,
                       pdata = new("NAnnotatedDataFrame", meta_data),
                       mode = "onDisk")
cwp <- CentWaveParam(peakwidth = c(30, 80), noise = 1000)
xdata <- findChromPeaks(raw_data, param = cwp)
xdata <- adjustRtime(xdata, param = ObiwarpParam(gapInit = 2.86,
                                                 gapExtend = 2.268))
pdp <- PeakDensityParam(sampleGroups = xdata$sample_group,
                        minFraction = 0.1,
                        bw = 0.25,
                        minSamples = 1)
xdata <- groupChromPeaks(xdata, param = pdp)
xdata <- fillChromPeaks(xdata)
This is the error I get:
Requesting 3983 missing peaks from QT_170404_43.mzML ... got 3945.
Requesting 4020 missing peaks from QT_170404_46.mzML ...
Error: BiocParallel errors
element index: 2, 3, 4, 5
first error: result would be too long a vector
In addition: Warning message:
stop worker failed:
'clear_cluster' receive data failed:
reached elapsed time limit
I get the same error when explicitly calling SnowParam() and MulticoreParam(). When using SerialParam(), I get the following error:
Requesting 3952 missing peaks from QT_170404_15.mzML ... got 3893.
Requesting 4028 missing peaks from QT_170404_16.mzML ... got 3968.
Requesting 4001 missing peaks from QT_170404_17.mzML ... Error in 1:(scanrange[1] - 1) : result would be too long a vector
In addition: Warning messages:
1: In min(x) : no non-missing arguments to min; returning Inf
2: In max(x) : no non-missing arguments to max; returning -Inf
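The two warnings and the error in the serial run are consistent with a scan range being computed from an empty set of scans. The following base-R sketch reproduces that chain under stated assumptions: `rtime_file` and `peak_rt` are invented values, and the code is not taken from the xcms internals.

```r
# Hypothetical retention times (seconds) of the scans in one file.
rtime_file <- seq(20, 600, by = 0.5)

# Hypothetical peak region that lies completely before the first scan.
peak_rt <- c(2.5, 12.4)

# No scan falls inside the region, so the index vector is empty.
idx <- which(rtime_file >= peak_rt[1] & rtime_file <= peak_rt[2])

# range() on an empty vector warns ("no non-missing arguments to min/max")
# and returns c(Inf, -Inf); the warnings are suppressed here for brevity.
scanrange <- suppressWarnings(range(idx))

# 1:(Inf - 1) cannot be allocated, so R stops with an error.
msg <- tryCatch({
    1:(scanrange[1] - 1)
    "no error"
}, error = function(e) conditionMessage(e))
print(msg)
```

If this is what happens inside fillChromPeaks, the message would point to a retention time mismatch rather than to a lack of memory.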
sessionInfo()
R version 3.5.0 (2018-04-23)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: OS X El Capitan 10.11.6
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib
locale:
[1] en_CA.UTF-8/en_CA.UTF-8/en_CA.UTF-8/C/en_CA.UTF-8/en_CA.UTF-8
attached base packages:
[1] parallel stats graphics grDevices utils datasets methods base
other attached packages:
[1] xcms_3.2.0 MSnbase_2.6.0 ProtGenerics_1.12.0 mzR_2.14.0 Rcpp_0.12.16
[6] BiocParallel_1.14.1 Biobase_2.40.0 BiocGenerics_0.26.0 reshape2_1.4.3 XML_3.98-1.11
[11] BiocInstaller_1.30.0 norm_1.0-9.5
loaded via a namespace (and not attached):
[1] RColorBrewer_1.1-2 compiler_3.5.0 pillar_1.2.2 plyr_1.8.4 iterators_1.0.9
[6] tools_3.5.0 zlibbioc_1.26.0 digest_0.6.15 MALDIquant_1.17 tibble_1.4.2
[11] preprocessCore_1.42.0 gtable_0.2.0 lattice_0.20-35 rlang_0.2.0 Matrix_1.2-14
[16] foreach_1.4.4 yaml_2.1.19 stringr_1.3.0 IRanges_2.14.6 S4Vectors_0.18.1
[21] multtest_2.36.0 stats4_3.5.0 grid_3.5.0 impute_1.54.0 survival_2.42-3
[26] RANN_2.5.1 limma_3.36.1 ggplot2_2.2.1 magrittr_1.5 splines_3.5.0
[31] scales_0.5.0 pcaMethods_1.72.0 codetools_0.2-15 MASS_7.3-50 MassSpecWavelet_1.46.0
[36] mzID_1.18.0 colorspace_1.3-2 stringi_1.2.2 affy_1.58.0 doParallel_1.0.11
[41] lazyeval_0.2.1 munsell_0.4.3 vsn_3.48.0 affyio_1.50.0
gc()
           used  (Mb) gc trigger  (Mb) limit (Mb)  max used  (Mb)
Ncells  4948588 264.3    8881278 474.4         NA   8881278 474.4
Vcells 35158500 268.3  106410195 811.9      32768 106409897 811.9
Could it be a memory problem? Unfortunately, I don't have easy access to a Windows PC, so I would love to figure out a fix for OSX.
Thank you @emmagraham for your detailed error description! I will have a look at it.
@emmagraham, could you please do some tests for me?
1) Before you run fillChromPeaks, could you please run any(is.na(chromPeaks(xdata))) and post the result?
2) Could you please install the latest xcms (using devtools::install_github("sneumann/xcms", ref = "master")) and test with that? This will hopefully help narrow down where the error comes from.
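A slightly more detailed variant of the first test could count problematic entries per column instead of returning a single TRUE/FALSE. The sketch below uses a small invented matrix as a stand-in for the result of chromPeaks(xdata); only the column names mirror the ones xcms reports, and the values are made up.

```r
# Toy stand-in for the matrix returned by chromPeaks(xdata); values are invented.
pks <- cbind(
    mz     = c(107.07, 205.10, 301.14),
    rt     = c(7.4, 120.2, NA),
    rtmin  = c(2.5, 115.0, 298.0),
    rtmax  = c(12.4, 126.0, Inf),
    sample = c(1, 1, 2)
)

# any(is.na()) gives a single yes/no answer ...
has_na <- any(is.na(pks))

# ... while a per-column count of non-finite entries (NA, NaN, Inf, -Inf)
# shows which column to look at.
bad_per_column <- colSums(!is.finite(pks))
print(has_na)
print(bad_per_column)
```

On a real object the same two lines would be applied to chromPeaks(xdata) directly. Checking with is.finite() also catches Inf values, which is.na() does not report.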
@jotsetung Thanks for your help! I installed the latest version of XCMS from your repo, as you suggested. Here is my session info:
sessionInfo()
R version 3.5.0 (2018-04-23)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: OS X El Capitan 10.11.6
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib
locale:
[1] en_CA.UTF-8/en_CA.UTF-8/en_CA.UTF-8/C/en_CA.UTF-8/en_CA.UTF-8
attached base packages:
[1] parallel stats graphics grDevices utils datasets methods base
other attached packages:
[1] xcms_3.3.1 devtools_1.13.5 MSnbase_2.6.0 ProtGenerics_1.12.0 mzR_2.14.0
[6] Rcpp_0.12.16 BiocParallel_1.14.1 Biobase_2.40.0 BiocGenerics_0.26.0 reshape2_1.4.3
[11] XML_3.98-1.11 BiocInstaller_1.30.0 norm_1.0-9.5
loaded via a namespace (and not attached):
[1] splines_3.5.0 lattice_0.20-35 colorspace_1.3-2 snow_0.4-2 stats4_3.5.0
[6] yaml_2.1.19 vsn_3.48.0 survival_2.42-3 rlang_0.2.0 pillar_1.2.2
[11] withr_2.1.2 affy_1.58.0 RColorBrewer_1.1-2 affyio_1.50.0 foreach_1.4.4
[16] plyr_1.8.4 mzID_1.18.0 stringr_1.3.0 zlibbioc_1.26.0 munsell_0.4.3
[21] pcaMethods_1.72.0 gtable_0.2.0 codetools_0.2-15 memoise_1.1.0 knitr_1.20
[26] IRanges_2.14.6 doParallel_1.0.11 curl_3.2 MassSpecWavelet_1.46.0 preprocessCore_1.42.0
[31] scales_0.5.0 limma_3.36.1 S4Vectors_0.18.1 RANN_2.5.1 impute_1.54.0
[36] ggplot2_2.2.1 digest_0.6.15 stringi_1.2.2 grid_3.5.0 tools_3.5.0
[41] magrittr_1.5 lazyeval_0.2.1 tibble_1.4.2 MASS_7.3-50 Matrix_1.2-14
[46] httr_1.3.1 iterators_1.0.9 R6_2.2.2 MALDIquant_1.17 multtest_2.36.0
[51] compiler_3.5.0 git2r_0.21.0
test 1
any(is.na(chromPeaks(xdata)))
[1] FALSE
I also tried using the fillChromPeaks function, and got the following error:
xdata <- fillChromPeaks(xdata, BPPARAM = p)
Requesting 3983 missing peaks from QT_170404_43.mzML ... got 3945.
Requesting 4020 missing peaks from QT_170404_46.mzML ...
Error: BiocParallel errors
element index: 2, 3, 4, 5
first error: 'scanrange' does not contain finite values
In addition: Warning message:
stop worker failed:
'clear_cluster' receive data failed:
reached elapsed time limit
Thanks, now we're getting closer. I think the problem is that you have peaks with retention times that are outside the retention time range of certain files. I think I have fixed this now. Could you please install xcms again (from GitHub) and retry?
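To illustrate what "peaks outside the retention time range of certain files" means, here is a base-R sketch that flags such cases. The file names, feature names and all values are invented for the example, and the approach is only one possible way to check this, not the code used in xcms.

```r
# Hypothetical per-file retention time ranges (seconds); file names are invented.
file_rt <- data.frame(
    file  = c("a.mzML", "b.mzML"),
    rtmin = c(0, 20),
    rtmax = c(600, 580)
)

# Hypothetical feature regions from which missing peaks would be requested.
regions <- data.frame(
    feature = c("F1", "F2", "F3"),
    rtmin   = c(2.5, 100, 590),
    rtmax   = c(12.4, 110, 598)
)

# A region is problematic for a file if it does not overlap the file's
# retention time range at all.
outside <- outer(
    seq_len(nrow(regions)), seq_len(nrow(file_rt)),
    function(i, j) {
        regions$rtmax[i] < file_rt$rtmin[j] | regions$rtmin[i] > file_rt$rtmax[j]
    }
)
dimnames(outside) <- list(regions$feature, file_rt$file)
print(outside)
```

In this toy data, F1 ends before the first scan of b.mzML and F3 starts after its last scan, so both are flagged for that file only.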
Sorry for the delay - with the latest version of XCMS from github, I still get the exact same error.
So, I tried swapping out my input mzML data for centroided mzData files from another experiment (with exact same parameters, code etc), and everything ran without errors.
This made me suspicious that something may be wrong with my input files, so I ran my code for XCMS v1.48.0 (provided below) on the mzML files I am currently trying to analyze, and was able to run XCMS without errors. My code for v1.48.0:
xset <- xcmsSet(our_files,
                method = "centWave",
                ppm = 15,
                peakwidth = c(3, 35.75),
                mzdiff = 0.00325,
                prefilter = c(3, 100),
                noise = 0,
                snthresh = 2.8)
# group peaks together
xset <- group(xset)
# retention time correction
xset2 <- retcor(xset,
                method = "obiwarp",
                profStep = 1,
                gapInit = 2.86,
                gapExtend = 2.268)
# group again
xset2 <- group(xset2,
               bw = 0.25,
               mzwid = 0.02122,
               minfrac = 0.1,
               minsamp = 1)
xset3 <- fillPeaks(xset2)
gt <- xcms::groups(xset3)
intensity_matrix <- groupval(xset3, "medret", "into")
However, I do get more than 50 warnings, all about specific features being out of RT range. An example of a warning:
In .local(object, ...) :
getPeaks: Peak m/z:107.068016052246-107.069328308105, RT:2.5255-12.37250000002is out of retention time range for this sample (/Users/emmagraham/Desktop/Masters/Metabolomics project/metabolomics_repo/Controls/Files_pos/QT_170328_15.mzML), using zero intensity value.
This provides additional evidence that the RT range is where the current version is getting tripped up.
Session info:
> sessionInfo()
R version 3.4.4 (2018-03-15)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: OS X El Capitan 10.11.2
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib
locale:
[1] en_CA.UTF-8/en_CA.UTF-8/en_CA.UTF-8/C/en_CA.UTF-8/en_CA.UTF-8
attached base packages:
[1] parallel stats graphics grDevices utils datasets methods base
other attached packages:
[1] xcms_1.48.0 Biobase_2.32.0 ProtGenerics_1.4.0 BiocGenerics_0.18.0
[5] mzR_2.6.3 Rcpp_0.12.16 readr_1.1.1 XML_3.98-1.10
loaded via a namespace (and not attached):
[1] bindr_0.1.1 magrittr_1.5 hms_0.4.2 lattice_0.20-35 R6_2.2.2
[6] rlang_0.2.0 dplyr_0.7.4 tools_3.4.4 grid_3.4.4 yaml_2.1.18
[11] assertthat_0.2.0 tibble_1.4.2 bindrcpp_0.2.2 RColorBrewer_1.1-2 codetools_0.2-15
[16] glue_1.2.0 compiler_3.4.4 pillar_1.2.1 pkgconfig_2.0.1
Some info about my current files that I may not have mentioned earlier: they were converted from Agilent .d files to mzML files using msconvert. Through msconvert, the files were centroided (using the peak_picking = TRUE argument) and compressed using zlib. Have there been any changes in how the new version of XCMS deals with RT being out of the scan range in mzML files?
Thanks for testing. Sorry that it didn't work out (I was sure it would). Would it be possible for you to break the failing experiment down to, say, 2 files and share these with me? This would enable me to debug and fix the problem locally.
Thanks! I've sent an email to your EURAC account.
@emmagraham, can you please install the most recent xcms version from GitHub again and retry? Note that you will have to restart R after installing xcms.
devtools::install_github("sneumann/xcms", ref = "master")
This fixed the error. Thank you! Out of curiosity, what was the problem?
-- Emma Graham, BSc Graduate Student in Bioinformatics Mostafavi Lab Centre for Molecular Medicine and Therapeutics (CMMT) | University of British Columbia
I added a check that ensures the retention time ranges are within the boundaries. Somehow some NA values seem to have sneaked through. Thanks for testing!
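A check of this kind could look roughly like the sketch below. Note that `clamp_rt_window` is a hypothetical helper written for illustration; it is not the function that was added to xcms, and the numbers are invented.

```r
# Hypothetical helper, not part of xcms: restrict a requested retention time
# window to the range actually covered by a file.
clamp_rt_window <- function(window, file_range) {
    # Reject windows with missing or non-finite bounds outright.
    if (anyNA(window) || !all(is.finite(window)))
        return(NULL)
    # Reject windows that do not overlap the file at all.
    if (window[2] < file_range[1] || window[1] > file_range[2])
        return(NULL)
    # Otherwise cut the window back to the file's boundaries.
    c(max(window[1], file_range[1]), min(window[2], file_range[2]))
}

file_range <- c(20, 580)
print(clamp_rt_window(c(2.5, 30), file_range))   # partially overlapping window
print(clamp_rt_window(c(2.5, 12.4), file_range)) # window entirely before the first scan
print(clamp_rt_window(c(NA, 50), file_range))    # window with a missing bound
```

Returning NULL for unusable windows lets the caller decide what to report for that peak, for example a missing value or a zero intensity as in the older fillPeaks warnings.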
@lee-t, could you also test the new version when you get a chance? I guess this might also fix your problem, and then we could close this issue.
The memory issues regarding fillChromPeaks() are resolved: it has been working fine, even in parallel, on the latest master branch of xcms. Feel free to close.
Hi, I'm trying to debug a script that processes lipid samples. After peak picking and 2 rounds of grouping and RT correction, the fillChromPeaks step fails on an Ubuntu machine running MulticoreParam().
Running it with SerialParam() also seems to fail.
This seems inconsistent, since it does work on a Windows PC with SnowParam().