Nesvilab / FragPipe

A cross-platform Graphical User Interface (GUI) for running MSFragger and Philosopher - powered pipeline for comprehensive analysis of shotgun proteomics data
http://fragpipe.nesvilab.org

Out of Memory during PASEF OpenSearch at Crystal-C #184

Closed chscho closed 4 years ago

chscho commented 4 years ago

Dear FragPipe-Team,

I'm currently trying to process PASEF data (converted to mzML with MSConvert with the "combine ion mobility" option enabled) using the OpenSearch parameters (default settings). I'm using a Xubuntu 18.04.4 LTS machine with 32 cores (x86_64) and 200 GB RAM. After a successful MSFragger search (pepXML and corresponding tsv files are generated), FragPipe crashes at the Crystal-C step.

Crystal-C [Work dir: /home/bioinf/bioinf_archive/49_scch_storage/Projects/Mtb_Proteogenomics/01_raw_data/TEST_PASEF_fragger/PASEFtest9_mzml_unfiltered_0072]
java -Dbatmass.io.libs.thermo.dir="/home/bioinf/bioinf_data/49_scch/Tools/msfragger/MSFragger-2.3/ext/thermo" -cp /tmp/fragpipe/batmass-io-1.17.2.jar:/tmp/fragpipe/grppr-0.3.23.jar:/tmp/fragpipe/original-crystalc-1.1.0.jar crystalc.Run /home/bioinf/bioinf_archive/49_scch_storage/Projects/Mtb_Proteogenomics/01_raw_data/TEST_PASEF_fragger/PASEFtest9_mzml_unfiltered_0072/crystalc-0-collinsb_T190411_Tbx007_P0072_1_ddaPASEF_1472.pepXML.params /home/bioinf/bioinf_archive/49_scch_storage/Projects/Mtb_Proteogenomics/01_raw_data/TEST_PASEF_fragger/PASEFtest9_mzml_unfiltered_0072/collinsb_T190411_Tbx007_P0072_1_ddaPASEF_1472.pepXML
Exception in thread "main" umich.ms.fileio.exceptions.FileParsingException: java.util.concurrent.ExecutionException: umich.ms.fileio.exceptions.FileParsingException: Could not allocate arrays during spectra decoding step
    at umich.ms.fileio.filetypes.xmlbased.AbstractXMLBasedDataSource.parse(AbstractXMLBasedDataSource.java:198)
    at umich.ms.datatypes.scancollection.impl.ScanCollectionDefault.loadData(ScanCollectionDefault.java:804)
    at umich.ms.datatypes.scancollection.impl.ScanCollectionDefault.loadData(ScanCollectionDefault.java:788)
    at crystalc.p_ReadData.LoadRawFile(p_ReadData.java:171)
    at crystalc.Run.ProcessPepXML(Run.java:69)
    at crystalc.Run.main(Run.java:51)
Caused by: java.util.concurrent.ExecutionException: umich.ms.fileio.exceptions.FileParsingException: Could not allocate arrays during spectra decoding step
    at java.util.concurrent.FutureTask.report(FutureTask.java:122)
    at java.util.concurrent.FutureTask.get(FutureTask.java:206)
    at umich.ms.fileio.filetypes.xmlbased.AbstractXMLBasedDataSource.parse(AbstractXMLBasedDataSource.java:191)
    ... 5 more
Caused by: umich.ms.fileio.exceptions.FileParsingException: Could not allocate arrays during spectra decoding step
    at umich.ms.fileio.filetypes.mzml.MZMLPeaksDecoder.decode(MZMLPeaksDecoder.java:211)
    at umich.ms.fileio.filetypes.mzml.MZMLMultiSpectraParser.tagBinaryDataListStart(MZMLMultiSpectraParser.java:481)
    at umich.ms.fileio.filetypes.mzml.MZMLMultiSpectraParser.call(MZMLMultiSpectraParser.java:163)
    at umich.ms.fileio.filetypes.mzml.MZMLMultiSpectraParser.call(MZMLMultiSpectraParser.java:62)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.OutOfMemoryError: Java heap space
Process 'Crystal-C' finished, exit code: 1

Process returned non-zero exit code, stopping
Crystal-C [Work dir: /home/bioinf/bioinf_archive/49_scch_storage/Projects/Mtb_Proteogenomics/01_raw_data/TEST_PASEF_fragger/PASEFtest9_mzml_unfiltered_0072]
java -Dbatmass.io.libs.thermo.dir="/home/bioinf/bioinf_data/49_scch/Tools/msfragger/MSFragger-2.3/ext/thermo" -cp /tmp/fragpipe/batmass-io-1.17.2.jar:/tmp/fragpipe/grppr-0.3.23.jar:/tmp/fragpipe/original-crystalc-1.1.0.jar crystalc.Run /home/bioinf/bioinf_archive/49_scch_storage/Projects/Mtb_Proteogenomics/01_raw_data/TEST_PASEF_fragger/PASEFtest9_mzml_unfiltered_0072/crystalc-1-collinsb_T190411_Tbx007_P0072_5_ddaPASEF_1484.pepXML.params /home/bioinf/bioinf_archive/49_scch_storage/Projects/Mtb_Proteogenomics/01_raw_data/TEST_PASEF_fragger/PASEFtest9_mzml_unfiltered_0072/collinsb_T190411_Tbx007_P0072_5_ddaPASEF_1484.pepXML

~~~~~~~~~~~~~~~~~~~~
Cancelling 9 remaining tasks
Processing interrupted, stopping Crystal-C

I suspect the problem is the allocated heap size, because the RAM usage I observed during the process never exceeded 120 GB. I have now run the Crystal-C command on the command line with all available RAM allocated (-Xmx181G), and it seems to run smoothly.
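Editor's note: the manual workaround described above can be sketched as a small shell script. This is only an illustration under stated assumptions: the helper name `xmx_flag`, the 19 GB headroom, and the placeholder file names are hypothetical and not part of FragPipe. The classpath and main class are copied from the log above, and the `-Dbatmass.io.libs.thermo.dir` option is left out for brevity. The script prints the command (a dry run) instead of launching Java.

```shell
#!/bin/sh
# Hypothetical helper: build a JVM max-heap flag from total RAM (GB) minus a reserve (GB).
xmx_flag() {
    total_gb=$1
    reserve_gb=$2
    echo "-Xmx$((total_gb - reserve_gb))G"
}

# Classpath as it appears in the FragPipe log above.
CP=/tmp/fragpipe/batmass-io-1.17.2.jar:/tmp/fragpipe/grppr-0.3.23.jar:/tmp/fragpipe/original-crystalc-1.1.0.jar

# Placeholders (assumptions): substitute the real .params and .pepXML paths.
PARAMS=crystalc-0-sample.pepXML.params
PEPXML=sample.pepXML

# Dry run: print the command rather than executing it.
echo java "$(xmx_flag 200 19)" -cp "$CP" crystalc.Run "$PARAMS" "$PEPXML"
```

With 200 GB of RAM and a 19 GB reserve, the helper yields the `-Xmx181G` value used in the report. Removing the `echo` would run the command for real, provided the jar files are still present in `/tmp/fragpipe`.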

Did I do something wrong during the setup of the FragPipe run, or is this missing parameter in the Crystal-C invocation a FragPipe bug?

By the way, when I used PASEF data converted to mzXML with the "threshold peak filter" option set to 150, this problem did not occur. Related to this: I've just read your new paper on bioRxiv, and I'm wondering what you consider the optimal "peak threshold" parameter. Is it 100 (the FragPipe default) or 150 (as in your paper), or are there reasons to go higher or lower?

Thanks again for your help. Best, Christian

fcyu commented 4 years ago

Hi Christian,

Looks like there is no -Xmx flag for Crystal-C. @chhh Can you help to add it to FragPipe?
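Editor's note: when no `-Xmx` flag is passed, the JVM chooses its own maximum heap size, which on HotSpot is typically a fraction of physical memory and can be well below what the machine has. The sketch below shows one way to check that default, assuming a HotSpot-based `java` on the PATH. `-XX:+PrintFlagsFinal` is a HotSpot option, and the helper name `max_heap_bytes` is hypothetical. On other JVMs the flag may be unsupported and the value will come back empty.

```shell
#!/bin/sh
# Extract the MaxHeapSize value (in bytes) from `java -XX:+PrintFlagsFinal` output on stdin.
# Each flag line has the form: <type> <name> <= or :=> <value> {...}
max_heap_bytes() {
    awk '$2 == "MaxHeapSize" { print $4 }'
}

# Only query the JVM if a java executable is on the PATH.
if command -v java >/dev/null 2>&1; then
    bytes=$(java -XX:+PrintFlagsFinal -version 2>/dev/null | max_heap_bytes)
    echo "Default max heap: ${bytes} bytes"
fi
```

If the reported value is smaller than what loading the mzML file requires, an `OutOfMemoryError: Java heap space` like the one in the log can occur even though plenty of physical RAM is free.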

The optimal peak threshold is decided during the parameter optimization step in MSFragger. You can see a table in the console after mass calibration.

Best,

Fengchao

anesvi commented 4 years ago

Do we even support crystal-c for TimsTOF when using .d?


fcyu commented 4 years ago

Hi @anesvi

He/She is using mzML converted from .d. I think Hui-Yin tested this a little with an mzML file from timsTOF some months ago.

Best,

Fengchao

huiyinc commented 4 years ago

Hi,

Yes, I think it's good to add -Xmx (as Fengchao suggests) to FragPipe. Thanks.

Huiyin


chhh commented 4 years ago

I don't remember why it wasn't there in the first place, but I'll add it in the next release.


fcyu commented 4 years ago

Fixed.