Nesvilab / FragPipe

A cross-platform proteomics data analysis suite
http://fragpipe.nesvilab.org

Did not find any PeptideProphet results in input data! #152

Closed JoshEbner closed 4 years ago

JoshEbner commented 5 years ago

Hello everyone,

After ProteinProphet starts I get the following warning: no data - output file will be empty (but see the log file). The remaining tasks are then stopped.

I was wondering what I did wrong there; the interact.pep.xml file is clearly in the output directory and is not empty. What is going on?

Best regards, Joshua


log_2019-11-20_17-48-23.txt

fcyu commented 5 years ago

Hi Joshua,

Thanks for your interest in FragPipe. Your log shows that there are not many PSMs in your data:

*********************************MASS CALIBRATION**********************************
-----|---------------|---------------|---------------|---------------
     |  MS1   (Old)  |  MS1   (New)  |  MS2   (Old)  |  MS2   (New)  
-----|---------------|---------------|---------------|---------------
 Run |  Median  MAD  |  Median  MAD  |  Median  MAD  |  Median  MAD  
 001 | Not enough data to perform mass calibration. Using the uncalibrated data.
-----|---------------|---------------|---------------|---------------
Finding the optimal parameters:
-------|-------|-------|-------|-------|-------|-------
  MS2  |    7  |   10  |   15  |   20  |   25  |   30  
-------|-------|-------|-------|-------|-------|-------
 Count |     17|     40|     24| skip rest
-------|-------|-------|-------|-------|-------|-------
-------|-------|-------|-------|-------|-------
 Peaks | 200_0 | 175_0 | 150_1 | 125_1 | 100_1 
-------|-------|-------|-------|-------|-------
 Count |     40|     40|     39| skip rest
-------|-------|-------|-------|-------|-------

Could you please double-check whether your parameters match the properties of your data?

Best,

Fengchao

JoshEbner commented 5 years ago

Hi Fengchao,

Thank you for your reply. Yes, I noticed the low number of PSMs and was confused, since a MaxQuant narrow search worked quite well. I am a novice in proteomics/FragPipe, as I have a background in ecology. Could you recommend the correct parameters for conducting an open search on the following type of data: Orbitrap Fusion Lumos Tribrid MS (the .raw file was converted to .mzML via MSConvert)? Detailed information on how the data was obtained is here: https://onlinelibrary.wiley.com/doi/full/10.1111/mec.15225

Sorry if I am being naive here, but I hope you can help me out.

Best regards,

Joshua

anesvi commented 5 years ago

I see you are searching against a custom database: database_name = C:\Users\Wotuli32\Desktop\FragPipe_Trial\2019-11-20-td-OgDB_plus_Rf_Clean_CdHit.fasta

Did you format it properly? Do your decoy entries start with rev_?

https://github.com/Nesvilab/philosopher/wiki/How-to-Prepare-a-Protein-Database
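A quick way to answer the decoy question is to count FASTA headers that begin with the rev_ prefix. The following is a minimal sketch, not part of FragPipe or Philosopher: the function name count_decoys and the example sequences are hypothetical, and it only inspects header lines, so it does not validate the rest of the header format described in the wiki.

```python
import io


def count_decoys(fasta_lines, decoy_prefix="rev_"):
    """Return (total_entries, decoy_entries) for an iterable of FASTA lines.

    An entry is counted as a decoy when its header, after the leading '>',
    starts with decoy_prefix. This is a hypothetical helper for illustration.
    """
    total = 0
    decoys = 0
    for line in fasta_lines:
        if line.startswith(">"):
            total += 1
            if line[1:].startswith(decoy_prefix):
                decoys += 1
    return total, decoys


# Made-up two-entry database: one target and one reversed decoy.
example = io.StringIO(
    ">sp|P00001|TEST_HUMAN example target\n"
    "MKTAYIAKQR\n"
    ">rev_sp|P00001|TEST_HUMAN example decoy\n"
    "RQKAIYATKM\n"
)
total, decoys = count_decoys(example)
print(f"{decoys} of {total} entries are decoys")
```

To check a real file, pass an open file handle instead of the in-memory example, for instance count_decoys(open("file.fasta")). A database with no rev_ entries, or with far fewer decoys than targets, would be worth re-preparing.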


fcyu commented 5 years ago

Hi Joshua,

As Alexey pointed out, your fasta file is most likely the reason for the error. Could you please send your fasta file to us?

If possible, you can also send your mzML file to us.

Thanks,

Fengchao

prvst commented 5 years ago

It seems to be the search parameters, not the database.

JoshEbner commented 5 years ago

Hi everyone,

Like Felipe, I also suspect that I have the wrong search parameters. I followed the database guidelines in the link you attached, Alexey: "philosopher database --custom "file.fasta" --contam".

Here is a link to the database and the mzML file (hope this works): https://gofile.io/?c=q1UWWL

And here is a screenshot of the fasta file: Screenshot

Thank you for all the input, Josh

fcyu commented 5 years ago

Hi Josh,

It looks like your data is from low-resolution MS2. Your mzML contains <cvParam cvRef="MS" accession="MS:1000512" name="filter string" value="ITMS + c NSI r d Full ms2 382.18@cid35.00 [95.00-775.00]"/>, which indicates that the MS2 scans were acquired in the ion trap, while you set fragment_mass_tolerance = 20.0. I suggest changing it to fragment_mass_tolerance = 300 and trying again.
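The same check can be done over a whole file by tallying the first token of each MS2 filter string (ITMS for ion trap, FTMS for Orbitrap scans). This is a rough sketch under stated assumptions: the function name ms2_analyzers is hypothetical, and the regular expression assumes the attribute order shown in the cvParam above (accession before value), so a proper mzML parser would be more robust.

```python
import re
from collections import Counter

# Matches filter-string cvParams (MS:1000512) and captures the value attribute.
# Assumes 'accession' appears before 'value' within the same tag.
FILTER_RE = re.compile(r'accession="MS:1000512"[^>]*?value="([^"]*)"')


def ms2_analyzers(mzml_text):
    """Count the analyzer token (e.g. ITMS, FTMS) of every MS2 filter string."""
    counts = Counter()
    for value in FILTER_RE.findall(mzml_text):
        tokens = value.split()
        # Only MS2 scans matter for the fragment tolerance setting.
        if "ms2" in tokens:
            counts[tokens[0]] += 1
    return counts


sample = (
    '<cvParam cvRef="MS" accession="MS:1000512" name="filter string" '
    'value="ITMS + c NSI r d Full ms2 382.18@cid35.00 [95.00-775.00]"/>'
)
print(ms2_analyzers(sample))
```

For a real run, read the mzML as text and pass it in; if the counts are dominated by ITMS, the MS2 spectra are low resolution and a tight fragment tolerance will discard most matches.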

BTW, your database is not well formatted. You need to follow the instructions at https://github.com/Nesvilab/philosopher/wiki/How-to-Prepare-a-Protein-Database

Best,

Fengchao

JoshEbner commented 5 years ago

Hi Fengchao,

OK, thanks! I will retry the run tomorrow with fragment_mass_tolerance = 300 and reformat the database. I'll update this post once I know whether the "problem" is resolved.

Best, Josh

fcyu commented 5 years ago

Hi Josh,

Sounds good.

FYI, we do not recommend doing an open search with low-resolution MS2 data.

Best,

Fengchao

JoshEbner commented 4 years ago

Hi Fengchao,

First of all, thanks for all the help! I know that open searching with low-resolution MS2 data is not recommended, but I will have high-resolution MS2 data soon and wanted to test out the pipeline.

After reformatting the database and setting fragment_mass_tolerance = 300, I got everything up and running. The only problem is that I run into a memory issue: FragPipe slows down massively after ca. 50% of the spectra have been searched and then gives me the "java.lang.OutOfMemoryError: Java heap space" error message. I used the Split database feature, but then the "Localize delta mass" open search option has to be turned off. What exactly does Localize delta mass do, and is it vital to keep it checked during an open search?

Best, Josh

fcyu commented 4 years ago

Hi Josh,

localize_delta_mass tries to assign the delta mass to the residue(s) that give the best match. Turning it off may reduce sensitivity slightly but won't hurt the overall result much. If your goal is to find out which modifications your sample has, running without localize_delta_mass is totally fine.

Best,

Fengchao