MSGFPlus / msgfplus

MS-GF+ (aka MSGF+ or MSGFPlus) performs peptide identification by scoring MS/MS spectra against peptides derived from a protein sequence database.

How to specify two enzymes #94

Closed liminghao663 closed 4 years ago

liminghao663 commented 4 years ago

My samples were digested first with Trypsin and then with Lys-C. How should I set the -e parameter?

alchemistmatt commented 4 years ago

As described at https://msgfplus.github.io/msgfplus/MSGFPlus.html, if you need custom enzyme definitions, you must create a file named enzymes.txt in a subdirectory named params, located below the directory that contains MSGFPlus.jar.

An example enzymes.txt file can be found at https://msgfplus.github.io/msgfplus/examples/enzymes.txt

Looking at your enzymes: Trypsin cleaves after K and R, while Lys-C cleaves after K only. Both cleave on the C-terminal side of the residue. Every Lys-C cleavage site is therefore also a Trypsin cleavage site, so you don't need a custom enzymes file. Just use -e 1 (Trypsin).
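To illustrate why the second digestion adds nothing new, here is a minimal Python sketch of an in-silico digest. It is not MS-GF+ code: the helper names and the protein sequence are made up for illustration, and it assumes complete digestion with no missed cleavages and no proline rule.

```python
# Illustrative sketch only; helper names and the sequence are hypothetical.
# Assumes complete C-terminal cleavage, no missed cleavages, no proline rule.

TRYPSIN = "KR"  # Trypsin cleaves after K and R
LYS_C = "K"     # Lys-C cleaves after K


def cleavage_sites(sequence, residues):
    """Return the 1-based positions after which the enzyme cuts."""
    # The last residue is skipped: there is nothing to cut after it.
    return {i + 1 for i, aa in enumerate(sequence[:-1]) if aa in residues}


def digest(sequence, residues):
    """Split the sequence at every cleavage site."""
    cuts = sorted(cleavage_sites(sequence, residues))
    bounds = [0] + cuts + [len(sequence)]
    return [sequence[a:b] for a, b in zip(bounds, bounds[1:])]


protein = "MKWVTFISLLRLFSSAYSKGVFRR"  # made-up example sequence

tryptic = digest(protein, TRYPSIN)

# Digest each tryptic peptide again with Lys-C (Trypsin first, then Lys-C).
sequential = [p for peptide in tryptic for p in digest(peptide, LYS_C)]

# Lys-C sites are a subset of Trypsin sites, so the peptide list is unchanged.
print(cleavage_sites(protein, LYS_C) <= cleavage_sites(protein, TRYPSIN))
print(sequential == tryptic)
```

Under these assumptions both checks print True, which is why searching with the Trypsin setting alone covers the combined digest.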

I would also encourage you to consider using a configuration file (aka a parameter file) instead of a long list of command line arguments. This way, you'd start MS-GF+ like this:

java.exe -Xmx4000M -jar MSGFPlus.jar -s DatasetName.mzML -o DatasetName_msgfplus.mzid -d H_sapiens_Uniprot_SPROT_2019-02-22.fasta -conf MSGFPlus_PartTryp_MetOx_StatCysAlk_20ppmParTol.txt

You can find a collection of pre-configured parameter files at https://github.com/MSGFPlus/msgfplus/tree/master/docs/ParameterFiles
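If you run many datasets, the command above can be assembled from a script. The following is a rough Python sketch, not part of MS-GF+: the function name and its defaults are hypothetical, and it uses only the options shown in the command above (-s, -o, -d, -conf). It assumes java is on your PATH and MSGFPlus.jar is in the working directory.

```python
# Hypothetical helper; only the flags from the command above are used.
import subprocess


def build_msgfplus_command(spectrum_file, output_file, fasta_file, conf_file,
                           jar="MSGFPlus.jar", java="java", max_heap="4000M"):
    """Return the MS-GF+ command as an argument list for subprocess."""
    return [
        java, f"-Xmx{max_heap}", "-jar", jar,
        "-s", spectrum_file,   # input spectra
        "-o", output_file,     # output .mzid file
        "-d", fasta_file,      # protein sequence database
        "-conf", conf_file,    # parameter file
    ]


command = build_msgfplus_command(
    "DatasetName.mzML",
    "DatasetName_msgfplus.mzid",
    "H_sapiens_Uniprot_SPROT_2019-02-22.fasta",
    "MSGFPlus_PartTryp_MetOx_StatCysAlk_20ppmParTol.txt",
)
print(" ".join(command))

# To actually start the search, uncomment the next line:
# subprocess.run(command, check=True)
```

Passing the arguments as a list avoids shell quoting problems if a file path contains spaces.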