sneumann / xcms

This is the git repository matching the Bioconductor package xcms: LC/MS and GC/MS Data Analysis

fillpeaks error after long processing time #542

Open cbroeckl opened 3 years ago

cbroeckl commented 3 years ago

I was running a set of samples with about 800 chromatograms, and the full xcms process took about two weeks. The xcms steps are wrapped in a function, and I save the incremental xcms objects to disk in case something goes wrong. In this case the final fillPeaks step generated an error message:

Error in gzfile(file, mode) : cannot open the connection
In addition: Warning message:
In gzfile(file, mode) :
  cannot open compressed file 'C:\Users\pmflab\AppData\Local\Temp\RtmpaE3QsX\rstudio-available-packages-2d505022f51/time.rds', probable reason 'No such file or directory'

I had a saved version of the xcms object, which I reloaded. I then ran fillChromPeaks manually, and the following error message was returned:

> xdata <- fillChromPeaks(xdata, param = fpp, BPPARAM = mcpar)
Defining peak areas for filling-in .... OK
Start integrating peak areas from original files
Error in file(con, "w") : cannot open the connection
In addition: Warning message:
In file(con, "w") :
  cannot open file 'C:\Users\pmflab\AppData\Local\Temp\RtmpaE3QsX\file2d503a3a19b4': No such file or directory

The thought of starting over was not appealing, so I tried creating this directory in the Temp\ folder, and the process is now running (hopefully to completion). I am assuming that at some point a temporary file was created in a temporary directory, and that the directory was removed during the long processing time. I don't know whether this is an xcms issue or an R issue, but I thought this was a good place to report it.

`> sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 17134)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C LC_TIME=English_United States.1252

attached base packages:
[1] stats4 parallel stats graphics grDevices utils datasets methods base

other attached packages:
[1] xcms_3.10.2 MSnbase_2.14.2 ProtGenerics_1.20.0 S4Vectors_0.26.1 mzR_2.22.0 Rcpp_1.0.5
[7] BiocParallel_1.22.0 Biobase_2.50.0 BiocGenerics_0.36.0

loaded via a namespace (and not attached):
[1] SummarizedExperiment_1.18.2 tidyselect_1.1.0 purrr_0.3.4 lattice_0.20-41
[5] colorspace_2.0-0 vctrs_0.3.6 generics_0.1.0 vsn_3.56.0
[9] XML_3.99-0.5 rlang_0.4.9 pillar_1.4.7 glue_1.4.2
[13] affy_1.66.0 RColorBrewer_1.1-2 matrixStats_0.57.0 affyio_1.58.0
[17] GenomeInfoDbData_1.2.3 foreach_1.5.1 lifecycle_0.2.0 plyr_1.8.6
[21] mzID_1.26.0 robustbase_0.93-6 zlibbioc_1.34.0 munsell_0.5.0
[25] pcaMethods_1.82.0 gtable_0.3.0 codetools_0.2-18 IRanges_2.22.2
[29] doParallel_1.0.16 GenomeInfoDb_1.24.2 MassSpecWavelet_1.54.0 preprocessCore_1.52.0
[33] DEoptimR_1.0-8 scales_1.1.1 BiocManager_1.30.10 DelayedArray_0.14.1
[37] limma_3.44.3 XVector_0.28.0 RANN_2.6.1 impute_1.62.0
[41] ggplot2_3.3.2 digest_0.6.27 dplyr_1.0.2 ncdf4_1.17
[45] GenomicRanges_1.40.0 grid_4.0.2 tools_4.0.2 bitops_1.0-6
[49] magrittr_2.0.1 RCurl_1.98-1.2 tibble_3.0.4 crayon_1.3.4
[53] pkgconfig_2.0.3 Matrix_1.2-18 MASS_7.3-53 ellipsis_0.3.1
[57] rstudioapi_0.13 iterators_1.0.13 R6_2.5.0 MALDIquant_1.19.3
[61] compiler_4.0.2 `
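
The manual workaround described above (recreating the missing directory by hand) can also be done from within R. The following is a minimal sketch, not something xcms provides: `ensure_tempdir()` is a hypothetical helper name, and it relies on `tempdir(check = TRUE)`, which is available from R 3.5.0 onward and recreates the session temp directory if it no longer exists. Note that the recreated directory may have a different path than the original, and that any files previously stored in the deleted directory are not restored.

```r
# Hypothetical helper (not part of xcms): make sure the session temp
# directory exists before a step that writes temporary files.
ensure_tempdir <- function() {
  existed <- dir.exists(tempdir())
  # tempdir(check = TRUE) recreates the session temp directory if it is gone
  # (requires R >= 3.5.0); the returned path may differ from the old one.
  path <- tempdir(check = TRUE)
  if (!existed) {
    message("Session temp directory was missing; now using: ", path)
  }
  invisible(path)
}

ensure_tempdir()
# xdata <- fillChromPeaks(xdata, param = fpp, BPPARAM = mcpar)
```

Calling such a helper right before each long-running step only narrows the window; it would not help if the directory disappears while the step itself is running.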

jorainer commented 3 years ago

Thanks for reporting! xcms does not create any temporary files, so I guess it must be related to R. Could it be that R or some of the R packages got updated (automatically) over the course of these two weeks?

cbroeckl commented 3 years ago

Thanks @jorainer. There should have been no updates on this computer during this processing, but I will see if I can confirm or refute that. I deliberately avoid doing much else on the computer while a large processing job is executing. I don't think I have any automatic updates enabled for R/RStudio. There certainly could be something going on in the background with Windows, but again, I don't think our IT management touches R/RStudio. Good to know I should be able to rule out anything xcms-specific. Thanks for the clarification and suggestion.

stanstrup commented 3 years ago

Seems to be RStudio related. Google finds others with similar issues. Update RStudio?

"800 chromatograms, and the full xcms process took about two weeks." <-- that sounds really crazy to me unless you run out of memory...

cbroeckl commented 3 years ago

@stanstrup - 'crazy' is about right. This computer seems a bit sluggish these days, but I don't think it is a memory issue - I keep an eye on that. It is an older computer, probably about 5 years old, but it has been upgraded to an SSD and has 64 GB of memory, which Windows reports as being sufficient for the 4 cores it typically operates with. The data processing has gotten slower, and I suspect it is some odd combination of (1) old-ish hardware, (2) larger data files from a faster instrument (we are collecting at twice the frequency since we have increased sensitivity after upgrading our instrument), and (3) Windows and whatever comes with it. I did find a Stack Overflow thread that listed Windows antivirus as possibly playing a role, among a wealth of other options.... (lots of comments)

cbroeckl commented 3 years ago

My current best theory is here: https://stackoverflow.com/questions/51935894/local-shiny-app-crashes-after-temporary-directory-gets-deleted

Windows Disk Cleanup seems the most likely culprit. Unfortunately, this is an error that I can't troubleshoot quickly, as it takes a long processing queue to trigger. I do now know that updating R and RStudio did not resolve the problem: I got a new error message on a processing job. I have disabled Disk Cleanup and will see what happens. I am not sure that this test will be definitive, as I could not find any clear schedule that Disk Cleanup runs on, so I do not know how long a job must run before the offending temp directory gets removed improperly.
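
If a cleanup tool really is removing the session directory under AppData\Local\Temp, one thing that might be worth trying is pointing R at a different temp location. R consults the environment variables TMPDIR, TMP and TEMP, in that order, when it starts and uses the first one that names a writable directory, so the setting has to be in place before the session starts (for example in `.Renviron`). The sketch below only reports what the current session sees; the path `D:/r_tmp` in the comment is a hypothetical example, and whether such a directory escapes Disk Cleanup is an untested assumption.

```r
# Report the environment variables R consults at startup for its temp location
tmp_vars <- Sys.getenv(c("TMPDIR", "TMP", "TEMP"), unset = NA)
print(tmp_vars)

# The per-session temp directory that was actually chosen
print(tempdir())

# To change it, a line like the following (hypothetical path; the directory
# must already exist and be writable) could go into .Renviron, followed by
# a restart of R:
# TMPDIR=D:/r_tmp
```

After restarting, `tempdir()` should point to a subdirectory of the configured location if the setting was picked up.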