Nesvilab / FragPipe

A cross-platform proteomics data analysis suite
http://fragpipe.nesvilab.org

PTMProphet Syntax error parsing DIA-Umpire pseudo MS/MS mzML #611

Closed singjc closed 2 years ago

singjc commented 2 years ago

Hello FragPipe Team,

I am using FragPipe to process DIA-Umpire pseudo MS/MS mzML files (also generated by DIA-Umpire within FragPipe). The run fails at the PTMProphet validation step with the following error message:

/home/roestlab/Ext_Programs/fragpipe-17.1/tools/philosopher/philosopher ptmprophet --keepold --static --em 1 --nions b --mods STY:79.966331,M:15.9949 --minprob 0.5 --maxthreads 1 interact-yanliu_I170114_008_PhosNoco2_SW_Q1.pep.xml
time="00:01:48" level=info msg="Executing PTMProphet  v4.1.1"
Process 'PtmProphet' finished, exit code: 1
Process returned non-zero exit code, stopping
/media/roestlab/Data1/User/JustinS/phospho_enriched_u2os/20220218_fragpipe_dia_pseudo/diau/rerun_diaumpire/phospho_enriched_rerun_diaumpire/yanliu_I170114_008_PhosNoco2_SW_Q1.mzML(1) : parseOffset() 4: Syntax error parsing XML.
time="00:01:49" level=fatal msg="Cannot execute program. there was an error with PTMProphet, please check your parameters and input files"

MSFragger and PeptideProphet seem to have run successfully. I came across a similar issue, Nesvilab/philosopher/issues/110. However, I didn't run MSFragger with any calibration or optimization, so the pepXML file doesn't contain mzBIN_calibrated.
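
Since the error reports an XML syntax problem in the mzML file, one quick independent check is whether the file is well-formed XML at all. The following is a minimal sketch using Python's standard library. The function name `check_well_formed` is hypothetical, and this is not the parser PTMProphet uses, so a clean result here does not guarantee that PTMProphet's own reader will accept the file.

```python
import xml.etree.ElementTree as ET


def check_well_formed(path):
    """Stream-parse an XML file; return (True, None) or (False, error text)."""
    try:
        # iterparse reads the file incrementally instead of loading it whole.
        for _event, elem in ET.iterparse(path, events=("end",)):
            elem.clear()  # discard parsed content to reduce memory use
    except ET.ParseError as exc:
        # The message includes the line and column where parsing failed.
        return False, str(exc)
    return True, None
```

Calling `check_well_formed("yanliu_I170114_008_PhosNoco2_SW_Q1.mzML")` on the failing file would show whether a generic XML parser also rejects it, and if so, where.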

I generated these DIA-Umpire pseudo MS/MS spectra in FragPipe using the highest sensitivity settings (with the mass defect filter for phospho searches). I have also generated pseudo MS/MS spectra using the default settings (with the mass defect filter), and those ran successfully through the MSFragger and validation pipelines. So I'm not sure whether the pseudo MS/MS spectra are generated differently at different sensitivity levels, or whether the highest sensitivity settings are simply not suitable for the data I am using (PXD006056: AB Sciex 6600 TripleTOF phospho-enriched U2OS samples).

DIA-Umpire Pseudo MS/MS Spectra Generation Log File

diaumpire_log_2022-02-27_10-29-10.txt

MSFragger-Validation Log File

msfragger_validation_log_2022-02-28_00-01-49.txt

If you need one of the pseudo MS/MS mzML files, I can send it, but it's ~3-5 GB.

Thank you,

Justin

anesvi commented 2 years ago

I recall that the highest sensitivity setting (the min scan = 2 parameter is the main thing) with 6600 data sometimes takes a long time in DIA-Umpire and generates a lot of scans. It works fine with Thermo data. I am not sure what the issue is. Maybe the files are too big for PTMProphet. I suggest you use the default (min scan = 1) for now. This could be hard for us to debug at the moment, especially since older (6600) data is less of a priority. Best, Alexey
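
To gauge how much larger the highest-sensitivity output is than the default output, the file size and number of spectra can be compared directly. The following is a minimal sketch using only Python's standard library. The function name `summarize_mzml` is hypothetical, and whether size is actually what trips PTMProphet remains an open question in this thread.

```python
import os
import xml.etree.ElementTree as ET


def summarize_mzml(path):
    """Return (size_in_bytes, spectrum_count) for an mzML file."""
    count = 0
    for _event, elem in ET.iterparse(path, events=("end",)):
        # Strip any XML namespace prefix: "{uri}spectrum" -> "spectrum".
        if elem.tag.rsplit("}", 1)[-1] == "spectrum":
            count += 1
        elem.clear()  # discard parsed content to reduce memory use
    return os.path.getsize(path), count
```

Running this on one pseudo MS/MS file from each sensitivity setting gives a like-for-like comparison of size and scan count.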

singjc commented 2 years ago

Dear Alexey,

Thank you for your reply. You're right, it definitely takes a while (~6 hours) and generates ~5 GB pseudo MS/MS files, so they may be too big, as you mention. I will stick to min scan = 1 for now and see how that goes.

I have another question, regarding quantification. Once I have generated the pseudo MS/MS spectra and performed the search and validation, I would like to quantify using IonQuant. However, the MS1 precursor peak is not written to the pseudo MS/MS spectra files. I know you can export the precursor peak to a TSV file and then visualize it in BatMass, but is there any way to export the precursor peak to the pseudo MS/MS spectra files so that IonQuant can be used?

Best,

Justin

anesvi commented 2 years ago

You cannot use IonQuant for this at the moment. I wonder why you want to use it? I suggest you just use our full SpecLib workflow, which includes DIA-NN. FragPipe builds the library and passes it to DIA-NN for quantification (which reports MS1 intensity in addition to fragment-based intensity). DIA-NN works well. Alexey

singjc commented 2 years ago

I was interested in doing it this way to compare DDA and DIA pseudo MS/MS spectra, specifically to compare the two acquisition modes using the same software tools and the same spectrum-centric analysis method. Thank you for suggesting DIA-NN; I could use it to get the MS1 and MS2 intensities. Alternatively, I could compare the PSM peak library intensities that EasyPQP extracts during library generation.

Best,

Justin