Nesvilab / MSFragger

Ultrafast, comprehensive peptide identification for mass spectrometry–based proteomics
https://msfragger.nesvilab.org

openSearch issue: MSFragger mass calibration + peptideProphet #50

Closed adithirv closed 4 years ago

adithirv commented 4 years ago

Hi MSFragger team,

I am running an MSFragger open search using MSFragger-2.2 and Philosopher build 20190319. I am getting this message in the MSFragger run: "Not enough data to perform mass calibration, using the uncalibrated data". Subsequently, the mixture model quality test fails for all charge states in the PeptideProphet step. Ultimately, ProteinProphet does not find any PeptideProphet results and fails completely. Would you know what is causing this problem? My proteomics data were generated on an LTQ Orbitrap XL or an LTQ FT Ultra mass spectrometer. I also tried an open search on data generated on a Q Exactive, and the problem did not occur. I am attaching my fragger.params and log file.

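fragger_params.txt https://github.com/Nesvilab/MSFragger/files/3901813/fragger_params.txt

log-fragpipe-run-at_2019-11-28_13-40-32.log https://github.com/Nesvilab/MSFragger/files/3901824/log-fragpipe-run-at_2019-11-28_13-40-32.log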

Thanks in advance Adithi

fcyu commented 4 years ago

Hi Adithi,

Thanks for your interest in MSFragger. Is your data from high resolution MS/MS?

Thanks,

Fengchao


adithirv commented 4 years ago

Hi Fengchao,

Thanks for your reply. The data were generated from either LTQ Orbitrap XL or LTQ FT. So, I would say they are high resolution MS/MS.

Best Adithi

fcyu commented 4 years ago

Hi Adithi,

Could you please try the following for me:

  1. Set fragment_mass_tolerance to 300 ppm and re-run MSFragger.
  2. Open your mzML file in a text editor and look for the tag with accession MS:1000512 in an MS/MS scan. Your file may not contain this tag, so it would be better to try the first option first.

Thanks,

Fengchao
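For step 1, here is a minimal sketch of how the tolerance change could be scripted instead of edited by hand. The helper name `set_param` is hypothetical, and the companion `fragment_mass_units` key (with 1 meaning ppm) is an assumption about the fragger.params format that is not stated in this thread, so check the comments in your own parameter file before relying on it.

```python
import re


def set_param(params_text: str, key: str, value: str) -> str:
    """Return params_text with `key = value`, keeping any trailing '# comment'.

    Hypothetical helper for 'key = value  # comment' style parameter files.
    If the key is absent, a new 'key = value' line is appended.
    """
    pattern = re.compile(
        rf"^([ \t]*{re.escape(key)}[ \t]*=[ \t]*)([^#\n]*?)([ \t]*(?:#.*)?)$",
        re.MULTILINE,
    )
    # A function replacement avoids backslash/group surprises in `value`.
    new_text, count = pattern.subn(
        lambda m: f"{m.group(1)}{value}{m.group(3)}", params_text
    )
    if count == 0:
        new_text = params_text.rstrip("\n") + f"\n{key} = {value}\n"
    return new_text


if __name__ == "__main__":
    example = (
        "fragment_mass_tolerance = 20  # fragment tolerance\n"
        "fragment_mass_units = 1  # assumed: 0=Daltons, 1=ppm\n"
    )
    print(set_param(example, "fragment_mass_tolerance", "300"))
```

Writing the result to a new file, rather than overwriting the original, keeps the 20 ppm run reproducible for comparison.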

anesvi commented 4 years ago

No, most likely low-resolution MS/MS.


adithirv commented 4 years ago

Hi Fengchao,

I tried the first solution you suggested. It no longer shows any mass calibration errors, but I now run into memory issues. I will try to fix that and keep you posted on whether the run completes.

@Alexey: OK, apologies then. I looked it up online and asked mass spec experts, and both sources told me it is high resolution. Does that influence the mass calibration error I am getting?

Best Adithi

anesvi commented 4 years ago

If you get good ID numbers with the initial fragment tolerance set to 300 ppm (and bad numbers with 20 ppm), it means you have low mass accuracy MS/MS data.

But calibration should be fine either way, I think.

With low mass accuracy MS/MS, we do not recommend open search, though.

Alexey
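To make the 20 ppm versus 300 ppm comparison concrete, the short sketch below converts a ppm tolerance into an absolute window in Daltons at a few fragment m/z values. The function name and the example m/z values are illustrative assumptions, not values taken from the data in this thread.

```python
def ppm_to_da(mz: float, ppm: float) -> float:
    """Absolute mass window (in Da) that a relative ppm tolerance spans at m/z `mz`."""
    return mz * ppm / 1e6


if __name__ == "__main__":
    # Illustrative fragment m/z values (assumed, not from the thread).
    for mz in (300.0, 1000.0, 1500.0):
        narrow = ppm_to_da(mz, 20.0)
        wide = ppm_to_da(mz, 300.0)
        print(f"m/z {mz:7.1f}: 20 ppm = {narrow:.4f} Da, 300 ppm = {wide:.4f} Da")
```

At m/z 1000, 20 ppm corresponds to a window of 0.02 Da, while 300 ppm corresponds to 0.3 Da. Fragments measured with errors of a few tenths of a Dalton fall outside the narrow window, which would explain poor identification numbers at 20 ppm and good ones at 300 ppm.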


fcyu commented 4 years ago

Hi Adithi,

I agree with Alexey that your data has low-resolution MS/MS spectra. You can confirm this by looking for the tag with accession MS:1000512 in your mzML file.

Mass calibration works well with both high- and low-resolution data as long as you set an appropriate fragment tolerance.

Best,

Fengchao
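The MS:1000512 check can also be scripted instead of done in a text editor. The sketch below is one way to do it with the Python standard library, under these assumptions: MS:1000511 is the "ms level" term and MS:1000512 is the "filter string" term of the PSI-MS vocabulary, and the file comes from a Thermo instrument whose filter strings begin with an analyzer label such as `ITMS` (ion trap) or `FTMS` (Orbitrap/FT). The function name `ms2_analyzers` is hypothetical.

```python
import sys
import xml.etree.ElementTree as ET
from collections import Counter


def _local(tag: str) -> str:
    """Strip the XML namespace from a tag name."""
    return tag.rsplit("}", 1)[-1]


def ms2_analyzers(source, limit: int = 200) -> Counter:
    """Count the first token of the filter string over up to `limit` MS2 spectra.

    `source` is an mzML path or a binary file object.
    """
    counts = Counter()
    seen = 0
    for _, elem in ET.iterparse(source, events=("end",)):
        if _local(elem.tag) != "spectrum":
            continue
        # Collect every cvParam below this spectrum (including scanList/scan).
        params = {
            cv.get("accession"): cv.get("value", "")
            for cv in elem.iter()
            if _local(cv.tag) == "cvParam"
        }
        if params.get("MS:1000511") == "2":
            tokens = params.get("MS:1000512", "").split()
            counts[tokens[0] if tokens else "no filter string"] += 1
            seen += 1
        elem.clear()  # free the spectrum's peak data before moving on
        if seen >= limit:
            break
    return counts


if __name__ == "__main__" and len(sys.argv) > 1:
    print(ms2_analyzers(sys.argv[1]))
```

A result dominated by `ITMS` would suggest that the MS/MS scans were acquired in the ion trap even though the instrument also has a high-resolution analyzer, which is consistent with the diagnosis above.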

adithirv commented 4 years ago

Hi Fengchao and Alexey,

Thanks again for your detailed replies and suggestions. You guys provide great, fast support, which makes MSFragger more enjoyable to use.

My data now ran through successfully in both open and closed search with a 300 ppm fragment tolerance. I had the mass calibration error and the downstream errors in both search modes, but they are now solved. Clearly the fragment tolerance had to be adjusted. I previously used MS-GF+ on the same data with a precursor tolerance of 20 ppm. With MS-GF+, the fragment tolerance cannot be set, so I did not know which fragment tolerance setting would work. Next, I am going to try a closed search with a proteogenomics database, and I look forward to seeing how MSFragger performs there.

I checked for the MS:1000512 tag in my mzML files. My files do have this tag for MS/MS scans.

Thanks once again, Best Adithi

jossmi commented 3 years ago

Hi MSFragger team,

I am also running into this same issue when doing an open search: I am getting an error of "Not enough data to perform mass calibration. Using the uncalibrated data.". I had tried it at 15 and 20 ppm for the fragment tolerance, but running it at 300 ppm fragment tolerance did not change the error message.

I am running this on a Fusion Lumos, with MS1 Orbitrap resolution of 30,000 and MS2 Orbitrap resolution of 30,000. This seems well within the typical definition of high resolution, but is it not high enough for the mass calibration in MSFragger?

I am running MSFragger 3.1.1 and Philosopher 3.3.12 on FragPipe 14.0. I've attached my log and parameters from the 15 ppm run: fragger_params.txt, log_2020-12-08_00-25-12.txt

Thanks, Josh

fcyu commented 3 years ago

Hi Josh,

Does your data have some labelling or PTM enrichment?

Best,

Fengchao

jossmi commented 3 years ago

Fengchao,

Thanks for the speedy reply! No, no labeling or enrichment. No reduction or alkylation either (unusual, I know, but that is on purpose for our assay). It's an overlapping-window DIA run on a trypsin-LysC digest of human serum. The goal is to identify unknown modifications. I've specified some Cys mods as variable modifications that we know could be present.

Josh

fcyu commented 3 years ago

Open search is for DDA only.

anesvi commented 3 years ago

You can run DIA-Umpire workflow or MSFragger-DIA (if narrow window DIA). You could still do open search in that case.

Is your database a regular database? The file name suggests that it is a custom database.

Alexey


jossmi commented 3 years ago

Fengchao and Alexey,

...I did not know that. Now I feel pretty foolish for having spent so long trying to get this to work with my DIA data. Is that specified anywhere? I might have missed that detail in the papers or documentation.

We are actually only interested in albumin modifications. I've tried it both with the whole human database and with a custom database I generated with Philosopher, using only human albumin, decoys, and contaminants.

So, to clarify, DIA-Umpire can do open searching of my DIA data and identify unknown modifications? What would the pipeline look like in that case - would I still run Crystal-C, PeptideProphet, and ProteinProphet downstream from DIA-Umpire, similar to how I was attempting with MSFragger?

Thanks, Josh

fcyu commented 3 years ago

You need to check 'Enable DIA-Umpire'.

Then, load the 'DIA-Umpire_SpecLib' workflow.

Then, change the MSFragger and PeptideProphet settings to open search. You need to uncheck Crystal-C because it will crash with the mzML files output by DIA-Umpire, I think.

Best,

Fengchao

anesvi commented 3 years ago

Dear Josh, Our tutorials are here: https://fragpipe.nesvilab.org/

We do not discuss DIA data there. But as Fengchao said, you can use the DIA-Umpire workflow.

Best Alexey


jossmi commented 3 years ago

Thanks so much for your help.

Josh