compomics / peptide-shaker

Interpretation of proteomics identification results
http://compomics.github.io/projects/peptide-shaker.html

Variable modifications from X!tandem in parameters file. #163

Closed PratikDJagtap closed 7 years ago

PratikDJagtap commented 8 years ago

While looking at the parameters file for one of the datasets that was searched with X!Tandem, OMSSA, MyriMatch, Comet and MS-GF+, I found the following output:

'Variable Modifications: Oxidation of M, iTRAQ 4-plex of Y, Acetylation of protein N-term, Pyrolidone from E, Pyrolidone from Q, Pyrolidone from carbamidomethylated C'.

I had set only Oxidation of M and iTRAQ 4-plex of Y as variable modifications. However, PeptideShaker seems to have picked up the rest from the X!Tandem searches.

The additional variable modifications are added automatically when importing results from X!Tandem:

https://github.com/compomics/peptide-shaker/blob/master/src/main/java/eu/isas/peptideshaker/fileimport/PsmImporter.java#L1214

These modifications are used exclusively by X!Tandem, although the parameters file does not indicate that. While it can be argued that they are valid modifications, the report gives the impression that they were used by all search algorithms. Is there a way to 'turn off' these additional modifications?

Please see the summary file here:

Project Details
1: PeptideShaker Version: 1.10.0
2: Date: Tue Apr 19 15:59:44 CDT 2016
3: Experiment: Galaxy_Experiment_2016041915401461098445
4: Sample: Sample_2016041915401461098445
5: Replicate Number: 1
6: Identification Algorithms: OMSSA, X!Tandem, MS-GF+ and MyriMatch

Database Search Parameters
1: Precursor Tolerance Unit: ppm
2: Precursor Ion m/z Tolerance: 10.0
3: Fragment Ion Tolerance Unit: Da
4: Fragment Ion m/z Tolerance: 0.1
5: Enzyme: Trypsin
6: Number of Missed Cleavages: Not implemented
7: Database: input_database.fasta
8: Forward Ion: b
9: Rewind Ion: y
10: Fixed Modifications: Carbamidomethylation of C, iTRAQ 4-plex of K, iTRAQ 4-plex of peptide N-term
11: Variable Modifications: Oxidation of M, iTRAQ 4-plex of Y, Acetylation of protein N-term, Pyrolidone from E, Pyrolidone from Q, Pyrolidone from carbamidomethylated C
12: Refinement Variable Modifications:
13: Refinement Fixed Modifications:

Input Filters
1: Minimal Peptide Length: 6
2: Maximal Peptide Length: 50
3: Precursor m/z Tolerance: 10.0
4: Precursor m/z Tolerance Unit: Yes
5: Unrecognized Modifications Discarded: Yes

Validation Summary
1: Proteins: #Validated: 3841.0
2: Proteins: Total Possible TP: 4778.72
3: Proteins: FDR Limit [%]: 0.99
4: Proteins: FNR Limit [%]: 21.04
5: Proteins: Confidence Limit [%]: 73.99
6: Proteins: PEP Limit [%]: 26.01
7: Proteins: Confidence Accuracy [%]: 0.03
8: Peptides (Unmodified): #Validated: 15926.0
9: Peptides (Oxidation of M): #Validated: 2017.0
10: Peptides (Other): #Validated: 410.0
11: Peptides (Unmodified): Total Possible TP: 18611.94
12: Peptides (Oxidation of M): Total Possible TP: 2570.84
13: Peptides (Other): Total Possible TP: 640.99
14: Peptides (Unmodified): FDR Limit [%]: 0.99
15: Peptides (Oxidation of M): FDR Limit [%]: 0.99
16: Peptides (Other): FDR Limit [%]: 0.98
17: Peptides (Unmodified): FNR Limit [%]: 15.17
18: Peptides (Oxidation of M): FNR Limit [%]: 22.27
19: Peptides (Other): FNR Limit [%]: 36.33
20: Peptides (Unmodified): Confidence Limit [%]: 85.17
21: Peptides (Oxidation of M): Confidence Limit [%]: 87.6
22: Peptides (Other): Confidence Limit [%]: 85.71
23: Peptides (Unmodified): PEP Limit [%]: 14.83
24: Peptides (Oxidation of M): PEP Limit [%]: 12.4
25: Peptides (Other): PEP Limit [%]: 14.29
26: Peptides (Unmodified): Confidence Accuracy [%]: 0.1
27: Peptides (Oxidation of M): Confidence Accuracy [%]: 0.9
28: Peptides (Other): Confidence Accuracy [%]: 2.78
29: PSMs (Charge 2 of file Mascot formatted MGF of data 6.mgf): #Validated PSM: 5604.0
30: PSMs (Charge 3 of file Mascot formatted MGF of data 3.mgf): #Validated PSM: 4722.0
31: PSMs (Charge 3 of file Mascot formatted MGF of data 4.mgf): #Validated PSM: 5250.0
32: PSMs (Charge 3 of file Mascot formatted MGF of data 7.mgf): #Validated PSM: 1171.0
33: PSMs (Other Charge 2): #Validated PSM: 16539.0
34: PSMs (Other Charge 3): #Validated PSM: 4730.0
35: PSMs (Charge 4 and Charge 5, 6): #Validated PSM: 817.0
36: PSMs (Charge 2 of file Mascot formatted MGF of data 6.mgf): Total Possible TP: 7018.57
37: PSMs (Charge 3 of file Mascot formatted MGF of data 3.mgf): Total Possible TP: 5062.2
38: PSMs (Charge 3 of file Mascot formatted MGF of data 4.mgf): Total Possible TP: 5456.61
39: PSMs (Charge 3 of file Mascot formatted MGF of data 7.mgf): Total Possible TP: 1400.4
40: PSMs (Other Charge 2): Total Possible TP: 18341.58
41: PSMs (Other Charge 3): Total Possible TP: 5328.02
42: PSMs (Charge 4 and Charge 5, 6): Total Possible TP: 872.55
43: PSMs (Charge 2 of file Mascot formatted MGF of data 6.mgf): FDR Limit [%]: 1.0
44: PSMs (Charge 3 of file Mascot formatted MGF of data 3.mgf): FDR Limit [%]: 1.0
45: PSMs (Charge 3 of file Mascot formatted MGF of data 4.mgf): FDR Limit [%]: 0.99
46: PSMs (Charge 3 of file Mascot formatted MGF of data 7.mgf): FDR Limit [%]: 0.94
47: PSMs (Other Charge 2): FDR Limit [%]: 1.0
48: PSMs (Other Charge 3): FDR Limit [%]: 0.99
49: PSMs (Charge 4 and Charge 5, 6): FDR Limit [%]: 0.98
50: PSMs (Charge 2 of file Mascot formatted MGF of data 6.mgf): FNR Limit [%]: 20.97
51: PSMs (Charge 3 of file Mascot formatted MGF of data 3.mgf): FNR Limit [%]: 7.64
52: PSMs (Charge 3 of file Mascot formatted MGF of data 4.mgf): FNR Limit [%]: 4.71
53: PSMs (Charge 3 of file Mascot formatted MGF of data 7.mgf): FNR Limit [%]: 17.4
54: PSMs (Other Charge 2): FNR Limit [%]: 10.71
55: PSMs (Other Charge 3): FNR Limit [%]: 12.14
56: PSMs (Charge 4 and Charge 5, 6): FNR Limit [%]: 7.4
57: PSMs (Charge 2 of file Mascot formatted MGF of data 6.mgf): Confidence Limit [%]: 89.4
58: PSMs (Charge 3 of file Mascot formatted MGF of data 3.mgf): Confidence Limit [%]: 78.52
59: PSMs (Charge 3 of file Mascot formatted MGF of data 4.mgf): Confidence Limit [%]: 53.85
60: PSMs (Charge 3 of file Mascot formatted MGF of data 7.mgf): Confidence Limit [%]: 89.4
61: PSMs (Other Charge 2): Confidence Limit [%]: 84.68
62: PSMs (Other Charge 3): Confidence Limit [%]: 79.62
63: PSMs (Charge 4 and Charge 5, 6): Confidence Limit [%]: 75.0
64: PSMs (Charge 2 of file Mascot formatted MGF of data 6.mgf): PEP Limit [%]: 10.6
65: PSMs (Charge 3 of file Mascot formatted MGF of data 3.mgf): PEP Limit [%]: 21.48
66: PSMs (Charge 3 of file Mascot formatted MGF of data 4.mgf): PEP Limit [%]: 46.15
67: PSMs (Charge 3 of file Mascot formatted MGF of data 7.mgf): PEP Limit [%]: 10.6
68: PSMs (Other Charge 2): PEP Limit [%]: 15.32
69: PSMs (Other Charge 3): PEP Limit [%]: 20.38
70: PSMs (Charge 4 and Charge 5, 6): PEP Limit [%]: 25.0
71: PSMs (Charge 2 of file Mascot formatted MGF of data 6.mgf): Confidence Accuracy [%]: 0.05
72: PSMs (Charge 3 of file Mascot formatted MGF of data 3.mgf): Confidence Accuracy [%]: 0.68
73: PSMs (Charge 3 of file Mascot formatted MGF of data 4.mgf): Confidence Accuracy [%]: 0.86
74: PSMs (Charge 3 of file Mascot formatted MGF of data 7.mgf): Confidence Accuracy [%]: 0.46
75: PSMs (Other Charge 2): Confidence Accuracy [%]: 0.82
76: PSMs (Other Charge 3): Confidence Accuracy [%]: 0.64
77: PSMs (Charge 4 and Charge 5, 6): Confidence Accuracy [%]: 4.35

PTM Scoring Settings
1: Probabilistic Score: PhosphoRS
2: Accounting for Neutral Losses: No
3: Threshold: 95.0

Spectrum Counting Parameters
1: Method: NSAF
2: Validated Matches Only: No

Annotation Settings
1: Intensity Limit: 0.75
2: Automatic Annotation: Yes
3: Selected Ions: b, y
4: Neutral Losses: H2O, NH3, CH4OS
5: Neutral Losses Sequence Dependence: Yes
6: Fragment Ion m/z Tolerance: 0.1
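
The discrepancy reported in this comment can be checked mechanically. The following is a minimal Python sketch, not part of PeptideShaker: it parses one `N: Key: Value` line copied from the summary and lists the variable modifications that were not requested. The function names are hypothetical, and the regular expression assumes the key itself contains no colon, which holds for the `Variable Modifications` line but not for every line of the report.

```python
import re
from typing import List, Set, Tuple

# Variable modifications selected for the search, as stated in the issue.
REQUESTED: Set[str] = {"Oxidation of M", "iTRAQ 4-plex of Y"}

# Line 11 of "Database Search Parameters", copied from the summary file.
REPORT_LINE = (
    "11: Variable Modifications: Oxidation of M, iTRAQ 4-plex of Y, "
    "Acetylation of protein N-term, Pyrolidone from E, Pyrolidone from Q, "
    "Pyrolidone from carbamidomethylated C"
)


def parse_report_line(line: str) -> Tuple[str, str]:
    """Split an 'N: Key: Value' report line into (key, value).

    Hypothetical helper: only handles keys that contain no colon.
    """
    match = re.match(r"\s*\d+:\s*([^:]+):\s*(.*)$", line)
    if match is None:
        raise ValueError("not an 'N: Key: Value' line: " + repr(line))
    return match.group(1).strip(), match.group(2).strip()


def unexpected_modifications(value: str, requested: Set[str]) -> List[str]:
    """Return the reported modifications that were not requested, in report order."""
    names = [name.strip() for name in value.split(",")]
    return [name for name in names if name and name not in requested]


key, value = parse_report_line(REPORT_LINE)
extra = unexpected_modifications(value, REQUESTED)
for name in extra:
    print(key + " not requested: " + name)
```

With the values from this report, the names left over are the four modifications that the thread attributes to the X!Tandem searches.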

ulrich-eckhard commented 8 years ago

Sounds to me like you are using the "quick pyrrolidone" and "quick acetyl" options in X! Tandem. Is that possible?


PratikDJagtap commented 8 years ago

Hello Ulrich,

I am using SearchGUI within GalaxyP. I do not see those options in that interface. Can they be 'turned off' in the command-line version?

Thanks, Pratik

ulrich-eckhard commented 8 years ago

Pretty sure you can, and Marc and Harald will know, but unfortunately I don't! :-)


PratikDJagtap commented 8 years ago

Thanks Ulrich,

I am copying JJ here, who might be able to add some input.

Thanks, Pratik

Pratik Jagtap, Managing Director, Center for Mass Spectrometry and Proteomics, 43 Gortner Laboratory, 1479 Gortner Avenue, St. Paul, MN 55108 Phone: 612-624-9275 http://cbs.umn.edu/cmsp/


mvaudel commented 8 years ago

Dear Pratik,

Ulrich is correct: these modifications are added by default by X!Tandem because they are highly prevalent in proteomic datasets. You can turn them off on the command line using the X!Tandem advanced settings -xtandem_quick_acetyl and -xtandem_quick_pyro.

You will find more details here: https://github.com/compomics/compomics-utilities/wiki/IdentificationParametersCLI#xtandem-advanced-parameters
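
As an illustration only, here is a hedged Python sketch that assembles such a command line with both quick options switched off. Only the two option names, `-xtandem_quick_acetyl` and `-xtandem_quick_pyro`, come from this thread. The jar name, the main class `eu.isas.searchgui.cmd.IdentificationParametersCLI`, the `-out` option, and the `1`/`0` encoding of on/off are assumptions that should be checked against the wiki page linked above. Other options that the tool may require (database, tolerances, modifications) are omitted, and the helper name is hypothetical.

```python
from typing import List


def build_parameters_command(
    jar_path: str,
    out_file: str,
    quick_acetyl: bool = False,
    quick_pyro: bool = False,
) -> List[str]:
    """Assemble an argument list for the identification parameters CLI.

    Assumptions (not confirmed in the thread): the main class name,
    the -out option, and that the quick options take 1 (on) or 0 (off).
    """
    return [
        "java",
        "-cp",
        jar_path,
        "eu.isas.searchgui.cmd.IdentificationParametersCLI",  # assumed class
        "-out",
        out_file,  # assumed option for the parameters file to write
        "-xtandem_quick_acetyl",
        "1" if quick_acetyl else "0",
        "-xtandem_quick_pyro",
        "1" if quick_pyro else "0",
    ]


# Placeholder jar and file names; nothing is executed here.
command = build_parameters_command("SearchGUI-X.Y.Z.jar", "no_quick_mods.par")
print(" ".join(command))
```

The sketch only prints the command so that it can be reviewed before running it by hand or passing the list to `subprocess.run`.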

We have refactored all parameters to be more centralized, to avoid the need for an extra command line, and to use an open format (JSON). I will see if I can make this clearer in the certificate of analysis (COA).

Hope this helps!

Marc

PratikDJagtap commented 8 years ago

Thanks Marc,

I will ask JJ to make that change.

Regards, Pratik

jj-umn commented 8 years ago

The options -xtandem_quick_acetyl and -xtandem_quick_pyro are already in the xtandem advanced settings of the Galaxy tool wrapper for SearchGUI. The tool wrapper uses the same defaults as the SearchGUI application.

mvaudel commented 8 years ago

That is great, thanks JJ. I will correct the COA to include the advanced settings and avoid confusion.

hbarsnes commented 7 years ago

Hi all,

The certificate of analysis has now been corrected, ensuring that the X! Tandem refinement modifications are reported correctly. A new PeptideShaker version will be released shortly.

Best regards, Harald

hbarsnes commented 7 years ago

Hi all,

PeptideShaker v1.14.3 has just been released, which should fix the problem with the X! Tandem refinement modifications being incorrectly reported. If this is not the case, please let us know and we'll reopen the issue.

Best regards, Harald