Closed HongxiangXu closed 1 year ago
RAM and CPUs aren't the limiting factor when reading data from disk - disk access is.
Please also read the section about in-memory and on-disk backends, where the former has a major impact on RAM requirements.
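As a minimal sketch of the on-disk backend mentioned above: `readMSData` accepts `mode = "onDisk"`, which loads only the spectrum metadata up front and fetches peak data from the file when needed. The example file below is one shipped with MSnbase, used here only so the snippet runs as-is. Replace it with your own mzML path. Whether this shortens the read time for a given file is an assumption to verify on your own data.

```r
library(MSnbase)

# Small example file shipped with MSnbase (stand-in for your own mzML file,
# e.g. quantFile[1] from the original post).
f <- system.file("extdata", "dummyiTRAQ.mzXML", package = "MSnbase")

# On-disk mode: spectrum metadata is read now, peak data stay on disk
# and are retrieved only when a spectrum is accessed.
msexp_disk <- readMSData(f, mode = "onDisk", verbose = FALSE)

# Inspect the object and the MS levels it contains.
msexp_disk
table(msLevel(msexp_disk))

# Accessing a single spectrum triggers reading its peaks from the file.
sp1 <- msexp_disk[[1]]
sp1
```

Note that the default in-memory mode and the on-disk mode differ in which MS levels they read by default, so check `msLevel()` on the result before quantifying.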
Finally, please do consider using Spectra for all your raw data manipulation needs. More on the R for Mass Spectrometry initiative (which Spectra is part of) here - https://rformassspectrometry.github.io/docs/
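For reference, a hedged sketch of reading a raw file with Spectra: the `MsBackendMzR` backend keeps only metadata in memory and reads peaks from the file on request. The example file is again the small one shipped with MSnbase (an assumption made so the snippet is runnable). Point `f` at your own mzML file instead.

```r
library(Spectra)

# Stand-in input file; replace with the path to your own mzML file.
f <- system.file("extdata", "dummyiTRAQ.mzXML", package = "MSnbase")

# Create a Spectra object backed by the raw file (peaks are read on demand).
sps <- Spectra(f, source = MsBackendMzR())

# Overview of the object and the MS levels present.
sps
table(msLevel(sps))

# Peak data (m/z and intensity) of the first spectrum, read from disk now.
pk <- peaksData(sps[1])[[1]]
head(pk)
```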
I successfully read the example mzML file from the MSnbase package with readMSData in just 1 second, but that example file is very small (0.18 MB).
However, it takes more than 30 minutes to read my own mzML file (around 900 MB). Other packages read the same file quickly, but they do not produce the MSnbase objects I need for further analysis such as quantification.
library(MSnbase)
quantFile <- list.files("ccms_peak", pattern = "mzML", full.names = TRUE, recursive = TRUE)
msexp <- readMSData(quantFile[1], verbose = FALSE)
My R server has 48 threads and more than 800 GB of RAM. How can I speed up this function?