PNNL-Comp-Mass-Spec / Informed-Proteomics

Top down / bottom up, MS/MS analysis tool for DDA and DIA mass spectrometry data

Why does the identified PrSM decrease a lot after using the modified file? #18

Open sunyusui opened 5 years ago

sunyusui commented 5 years ago

The histone H3.1 dataset I used contains a total of 3460 spectra, and the database human_proteome_database.fasta contains 20410 entries. When I did not use the modification file, the search output 1310 PrSMs. The parameters were as follows:

SpecFile 2DLC_H3_1.pbf
DatabaseFile human_proteome_database.fasta
FeatureFile 2DLC_H3_1.ms1ft
InternalCleavageMode SingleInternalCleavage
Tag-based search True
Tda Target+Decoy
PrecursorIonTolerancePpm 10
ProductIonTolerancePpm 10
MinSequenceLength 21
MaxSequenceLength 300
MinPrecursorIonCharge 2
MaxPrecursorIonCharge 30
MinProductIonCharge 1
MaxProductIonCharge 20
MinSequenceMass 3000
MaxSequenceMass 50000
ActivationMethod Unknown
MaxDynamicModificationsPerSequence 0

When I use the modification file, only 59 PrSMs are output. The parameters were as follows:

SpecFile 2DLC_H3_1.pbf
DatabaseFile human_proteome_database.fasta
FeatureFile 2DLC_H3_1.ms1ft
InternalCleavageMode SingleInternalCleavage
Tag-based search True
Tda Target+Decoy
PrecursorIonTolerancePpm 10
ProductIonTolerancePpm 10
MinSequenceLength 21
MaxSequenceLength 500
MinPrecursorIonCharge 2
MaxPrecursorIonCharge 50
MinProductIonCharge 1
MaxProductIonCharge 20
MinSequenceMass 3000
MaxSequenceMass 50000
ActivationMethod Unknown
MaxDynamicModificationsPerSequence 4
Modification C(2) H(2) N(0) O(1) S(0),R,opt,Everywhere,Acetyl
Modification C(2) H(2) N(0) O(1) S(0),K,opt,Everywhere,Acetyl
Modification C(1) H(2) N(0) O(0) S(0),R,opt,Everywhere,Methyl
Modification C(1) H(2) N(0) O(0) S(0),K,opt,Everywhere,Methyl
Modification C(2) H(4) N(0) O(0) S(0),R,opt,Everywhere,Dimethyl
Modification C(2) H(4) N(0) O(0) S(0),K,opt,Everywhere,Dimethyl
Modification C(3) H(6) N(0) O(0) S(0),R,opt,Everywhere,Trimethyl
Modification C(0) H(1) N(0) O(3) S(0) P(1),S,opt,Everywhere,Phospho
Modification C(0) H(1) N(0) O(3) S(0) P(1),T,opt,Everywhere,Phospho
Modification C(0) H(1) N(0) O(3) S(0) P(1),Y,opt,Everywhere,Phospho

The modification file I am using is as follows:

This file is used to specify modifications for MSPathFinder

Max Number of Modifications per peptide

NumMods=4

Static mods

None

Dynamic mods

C2H2O1,RK,opt,any,Acetyl # Acetylation RK
CH2,RK,opt,any,Methyl # Methylation RK
C2H4,RK,opt,any,Dimethyl
C3H6,R,opt,any,Trimethyl
HO3P,STY,opt,any,Phospho # Phosphorylation STY

Is there a problem with my parameter settings that leads to this result? Or is it normal that only 59 PrSMs are identified for this input?

alchemistmatt commented 5 years ago

I suspect the issue is too many dynamic modifications on the same residues, which leads to too many possible peptides to score. I suggest searching for those modifications separately.
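To make the combinatorial argument concrete, here is a minimal back-of-the-envelope sketch in Python. It counts how many ways up to `NumMods` dynamic modifications can be placed on one sequence, allowing at most one modification per residue. This is only an illustration of how the candidate space grows, not a description of how MSPathFinder enumerates or scores candidates internally. The function name `count_modified_forms` and the short example sequence are assumptions made for this sketch.

```python
from collections import Counter

# Dynamic modifications from the modification file in this thread:
# modification name -> residues it may occur on.
DYNAMIC_MODS = {
    "Acetyl": "RK",
    "Methyl": "RK",
    "Dimethyl": "RK",
    "Trimethyl": "R",
    "Phospho": "STY",
}


def count_modified_forms(sequence, mods, max_mods):
    """Count the ways to place at most max_mods modifications on sequence.

    Assumes at most one modification per residue (a simplification).
    The unmodified sequence is included in the count.
    """
    # options[r] = number of different modifications allowed on residue r
    options = Counter(res for targets in mods.values() for res in targets)

    # ways[k] = number of forms carrying exactly k modifications so far
    ways = [1] + [0] * max_mods
    for residue in sequence:
        n = options.get(residue, 0)
        if n == 0:
            continue
        # Walk k downwards so the current residue is modified at most once.
        for k in range(max_mods, 0, -1):
            ways[k] += ways[k - 1] * n
    return sum(ways)


if __name__ == "__main__":
    # Illustrative 21-residue sequence rich in K, R, S and T (an assumption
    # for this sketch, not taken from the search results in this thread).
    example = "ARTKQTARKSTGGKAPRKQLA"
    print("all five mods:", count_modified_forms(example, DYNAMIC_MODS, 4))
    print("acetyl only:  ", count_modified_forms(example, {"Acetyl": "RK"}, 4))
```

Even on a short sequence, allowing all five modifications together produces far more candidate forms than any single modification alone, which is the motivation for splitting the search.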

Search 1: C2H2O1,RK,opt,any,Acetyl # Acetylation RK
Search 2: CH2,RK,opt,any,Methyl # Methylation RK
Search 3: C2H4,RK,opt,any,Dimethyl
Search 4: C3H6,R,opt,any,Trimethyl
Search 5: HO3P,STY,opt,any,Phospho # Phosphorylation STY

If you get lots of results from Search 3 and Search 5 (for example), you could try combining them to give Search 6:

C2H4,RK,opt,any,Dimethyl
HO3P,STY,opt,any,Phospho # Phosphorylation STY
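The separate searches suggested above could be prepared with a small helper script. The following Python sketch generates the text of one modification file per search, reusing the modification lines and the `NumMods` setting from this thread. The output file names (`Mods_Search1.txt`, and so on) and the helper names are hypothetical, and the sketch assumes that lines starting with `#` are treated as comments, as the inline comments in the original file suggest.

```python
from pathlib import Path

# Modification lines copied from the modification file in this thread.
MOD_LINES = {
    "Acetyl": "C2H2O1,RK,opt,any,Acetyl",
    "Methyl": "CH2,RK,opt,any,Methyl",
    "Dimethyl": "C2H4,RK,opt,any,Dimethyl",
    "Trimethyl": "C3H6,R,opt,any,Trimethyl",
    "Phospho": "HO3P,STY,opt,any,Phospho",
}

# Search number -> modifications to include in that search.
# Search 6 is the optional combined search mentioned above.
SEARCHES = {
    1: ["Acetyl"],
    2: ["Methyl"],
    3: ["Dimethyl"],
    4: ["Trimethyl"],
    5: ["Phospho"],
    6: ["Dimethyl", "Phospho"],
}


def build_mods_file(mod_names, num_mods=4):
    """Return the text of a modification file containing only mod_names."""
    lines = [
        "# Max Number of Modifications per peptide",
        f"NumMods={num_mods}",
        "",
        "# Dynamic mods",
    ]
    lines += [MOD_LINES[name] for name in mod_names]
    return "\n".join(lines) + "\n"


def write_search_files(out_dir):
    """Write one modification file per search; return the paths written."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for number, mod_names in SEARCHES.items():
        # Hypothetical naming scheme; adjust to your own conventions.
        path = out_dir / f"Mods_Search{number}.txt"
        path.write_text(build_mods_file(mod_names))
        paths.append(path)
    return paths


if __name__ == "__main__":
    for written in write_search_files("mods_by_search"):
        print(written)
```

Each generated file can then be supplied as the modification file for its own search run, and the PrSM counts compared across runs before deciding which modifications to combine.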