rickhelmus / patRoon

Workflow solutions for mass-spectrometry based non-target analysis.
https://rickhelmus.github.io/patRoon/
GNU General Public License v3.0

generateFormulas() and generateCompounds() errors #58

Closed SivanXW closed 1 year ago

SivanXW commented 1 year ago

Hi, I'm trying to use patRoon to analyze my own data (~3 GB mzML) following the tutorial and ran into some problems.

1) When running the generateFormulas() command,

`formulas <- generateFormulas(fGroups, mslists, "genform", relMzDev = 5, adduct = "[M+H]+", elements = "CHNOPCl", oc = FALSE, calculateFeatures = TRUE, featThresholdAnn = 0.75, absAlignMzDev = 0.0001)`

the progress bar stopped at a random point (anywhere from 0 to 70%). I tried several times with the same data and command, and sometimes it ran smoothly. Is this an issue with GenForm or with patRoon, and how can I fix it?

2) When running the generateCompounds() command,

`compounds <- generateCompounds(fGroups, mslists, "metfrag", dbRelMzDev = 5, fragRelMzDev = 5, fragAbsMzDev = 0.00005, adduct = "[M+H]+", database = "pubchem", maxCandidatesToStop = 2500)`

the program crashed with "There is insufficient memory for the Java Runtime Environment to continue." and produced the following log file: hs_err_pid21620.log

My computer has 16 GB RAM. How much RAM does the program require, or how can I improve the efficiency, given that the data is not very large?

3) Besides, when I run the generateCompounds() command, an error sometimes occurs:

`Error in generateCompounds(fGroups, mslists, "metfrag", dbRelMzDev = 5, : 1 assertions failed:`

This error disappears after restarting the computer or RStudio, but it keeps coming back at random. Is this a bug in the program, and how can I fix it?

Thanks for your kind help!

rickhelmus commented 1 year ago

Hello,

Sorry for my late response, I am still catching up after a leave.

First of all, I am quite surprised about the large size of your mzML file. Could it be that the data is not centroided? How many feature groups do you get? (Or do you mean the 3GB is the sum for multiple files?)

For the formulas: unfortunately, GenForm can struggle with features with larger masses, as an excessively large number (millions) of candidates will be calculated for them. Since patRoon processes features from low to high m/z, you see this 'blockage' later during the calculations. For now, you can try to improve the situation by e.g. removing high m/z features (assuming you are not interested in them), limiting the number of elements further (again, if your study allows it), or reducing the maximum number of candidates (maxCandidates parameter, see https://rickhelmus.github.io/patRoon/reference/generateFormulasGenForm.html).
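A rough sketch of how these three suggestions might be combined is shown here. It reuses the fGroups and mslists objects from the original post. The 500 m/z cutoff, the removal of Cl from the element list and the maxCandidates value are arbitrary placeholders, not recommendations. It also assumes that filter() accepts an mzRange argument for feature groups and that your patRoon version accepts maxCandidates for GenForm, so please check the linked reference page first.

```r
# Hypothetical values: adjust these to your own study.
mzUpper <- 500              # assumed cutoff; feature groups above this m/z are dropped
elementsReduced <- "CHNOP"  # assumed: Cl removed, only valid if your study allows it

# Only runs when patRoon is installed and the objects from the post exist.
if (requireNamespace("patRoon", quietly = TRUE) &&
    exists("fGroups") && exists("mslists")) {
    library(patRoon)

    # Remove high m/z feature groups before the formula calculation
    # (assumes filter() supports the mzRange argument).
    fGroupsLow <- filter(fGroups, mzRange = c(0, mzUpper))

    formulas <- generateFormulas(fGroupsLow, mslists, "genform",
                                 relMzDev = 5, adduct = "[M+H]+",
                                 elements = elementsReduced, oc = FALSE,
                                 calculateFeatures = TRUE,
                                 featThresholdAnn = 0.75,
                                 absAlignMzDev = 0.0001,
                                 maxCandidates = 500) # placeholder value
}
```

If the progress bar still stalls, lowering mzUpper step by step may help to find the mass range where GenForm starts to struggle.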

For the compounds: it seems MetFrag is running out of memory. If your data is not centroided, your peak lists will be huge, which could be the reason. You could also check whether it works better with a lower maximum number of parallel processes (see https://rickhelmus.github.io/patRoon/handbook_bd/parallelization.html#classic-multiprocessing-interface). In general, 16 GB is enough, but maybe not if many parallel processes are running at the same time.
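As a minimal sketch, the number of parallel processes can be capped with the patRoon.MP.maxProcs option described on the linked handbook page. The value 2 is only an example, and the option should be set before calling generateCompounds().

```r
# Cap the number of processes patRoon launches in parallel
# (2 is an arbitrary example value, not a recommendation).
options(patRoon.MP.maxProcs = 2)

# Confirm the setting for the current R session
getOption("patRoon.MP.maxProcs")
```

Fewer parallel MetFrag processes means a lower peak memory use, at the cost of a longer run time.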

The error about the featureGroups class is strange... just guessing, but can it be that you are working in a restored (RStudio) session and you didn't run library(patRoon)? Does reinstalling patRoon help?

rickhelmus commented 1 year ago

Closing this due to inactivity.