sneumann / xcms

This is the git repository matching the Bioconductor package xcms: LC/MS and GC/MS Data Analysis

Error: negative extents to matrix #569

Closed a1p6ca closed 3 years ago

a1p6ca commented 3 years ago

Hi Neumann,

I encountered an error while extracting peaks from a large file (about 29 GB, in profile mode rather than centroided) with findChromPeaks().

Here is my R script:

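library(xcms)      # findChromPeaks(), CentWaveParam(), chromatogram()
library(MSnbase)   # readMSData(), pickPeaks()
library(magrittr)  # provides the %>% pipe

setwd('E:/test/202106')
files <- list.files(pattern = 'mzML')

# Read only the MS1 spectra of the first file, on disk, flagged as profile mode
raw_data <- readMSData(files = files[1],
                       msLevel. = 1,
                       mode = 'onDisk',
                       centroided. = FALSE)

# Centroid with pickPeaks(), then detect chromatographic peaks with centWave
chr <- raw_data %>%
  pickPeaks() %>%
  findChromPeaks(param = CentWaveParam(snthresh = 3,
                                       ppm = 100,
                                       noise = 1e4))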

Error information:

Error in rbind(...) : negative extents to matrix

When I tried to plot the TIC, it also failed after a few minutes.

Script:

plot(chromatogram(raw_data))

Error:

Error in (function (cond)  : 
  error in evaluating the argument 'x' in selecting a method for function 'plot': negative extents to matrix

I am not sure whether this is related to the large file size or to the data being in profile mode.

Many thanks!

stanstrup commented 3 years ago

I am not sure about the cause of the error, but centWave should only be used with centroided data. So either centroid the data or use matchedFilter.

By the way, I believe the error is due to the sheer size of the data: at some point a dimension overflows. See https://stackoverflow.com/questions/51626554/merge-multiple-matrices-row-wise-and-store-it-in-a-new-matrix
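
To make the overflow idea concrete, here is a small base-R sketch of the arithmetic. The row counts are hypothetical, and whether this is really what happens inside rbind() for this file is not confirmed in the thread; the sketch only shows how a row total above the 32-bit integer limit turns negative once it is wrapped.

```r
# R's native integer type is a 32-bit signed integer.
int_max <- .Machine$integer.max   # 2147483647

# Hypothetical row counts of two blocks to be row-bound (kept as doubles).
rows_a <- 1.5e9
rows_b <- 1.5e9
total  <- rows_a + rows_b         # 3e9, larger than int_max

# Emulate 32-bit two's-complement wraparound. R's own integer arithmetic
# returns NA with a warning on overflow, so the wraparound is computed on
# doubles here purely for illustration.
wrap_int32 <- function(x) ((x + 2^31) %% 2^32) - 2^31

wrapped <- wrap_int32(total)
cat("total rows:", format(total, scientific = FALSE), "\n")
cat("as wrapped 32-bit value:", format(wrapped, scientific = FALSE), "\n")
```

A negative value like this, used as a matrix dimension, would fit the wording of the "negative extents to matrix" message.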

a1p6ca commented 3 years ago

Thanks @stanstrup. I have tried matchedFilter, but it generated an error message: Error: cannot allocate vector of size 19.1 Mb

It seems that pickPeaks() may work on profile data, because the warning messages disappeared.

Now I am trying filterMz() to split the raw data and then bind the results, but it always crashes at an unpredictable point:

Error in unserialize(socklist[[n]]) : error reading from connection
Error in serialize(data, node$con) : error writing to connection

Is it possible to convert the profile data into centroid mode?

jorainer commented 3 years ago

It seems that R runs out of memory with your data. Are you using a 64-bit version of R? On macOS and Linux R is always 64-bit, but for Windows both a 32-bit and a 64-bit binary are available.
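
A quick way to check this from within the running R session, using base R only (the variable name is_64bit is just illustrative):

```r
# A 64-bit build of R uses 8-byte pointers; a 32-bit build uses 4-byte pointers.
is_64bit <- .Machine$sizeof.pointer == 8L

cat("Architecture reported by R:", R.version$arch, "\n")
cat("Running a 64-bit build:", is_64bit, "\n")
```

If this reports FALSE on Windows, starting the 64-bit binary instead is the first thing to try.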

Regarding the centroiding (converting from profile mode to centroid mode), you can do that either with ProteoWizard's msconvert tool or directly in R with MSnbase. See here or also here.
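
For the MSnbase route, a minimal sketch could look like the following. The helper name centroid_file, the Savitzky-Golay smoothing step, and the half_window default are assumptions for illustration and not settings recommended in this thread; the smoothing parameters in particular would need tuning for the instrument.

```r
# Hypothetical helper: read a profile-mode file, centroid it, write a new mzML.
centroid_file <- function(in_file, out_file, half_window = 4L) {
  if (!requireNamespace("MSnbase", quietly = TRUE))
    stop("Package 'MSnbase' is required for centroiding")

  # On-disk mode keeps the spectra on disk instead of loading them into memory.
  prof <- MSnbase::readMSData(files = in_file,
                              msLevel. = 1,
                              mode = "onDisk",
                              centroided. = FALSE)

  # Smooth each spectrum first (assumed step), then pick peaks to centroid.
  smoothed <- MSnbase::smooth(prof,
                              method = "SavitzkyGolay",
                              halfWindowSize = half_window)
  cent <- MSnbase::pickPeaks(smoothed)

  # Export the centroided spectra; the processing is applied during export.
  MSnbase::writeMSData(cent, file = out_file)
  invisible(out_file)
}
```

It would be called as centroid_file("profile.mzML", "centroided.mzML") with placeholder file names, and the written file can then be read with centroided. = TRUE before running findChromPeaks(). For a file as large as the one in this issue, the export itself may still be slow.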

a1p6ca commented 3 years ago

@jorainer Thanks for your detailed suggestion, it was really helpful! msconvert worked well, and the peak extraction now finishes in just half a minute!