realizor closed this issue 3 years ago
Thanks for reporting this. I thought that MS-GF+ would report all of the proteins that have a peptide; I don't recall seeing it skip reporting proteins in the past. Until we can look into this for MS-GF+, I can suggest a workaround: use the ProteinCoverageSummarizer to find all of the proteins that contain a peptide. Steps:
The program will create a file listing the input peptides and every protein that has them.
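To illustrate what this kind of lookup does, here is a minimal Python sketch. It is not the ProteinCoverageSummarizer's actual code, and the function names (`read_fasta`, `proteins_per_peptide`) and the toy protein entries are made up for the example. It assumes clean peptide sequences (no modification symbols or flanking residues) and uses exact substring matching only.

```python
import io
from typing import Dict, Iterable, Iterator, List, TextIO, Tuple


def read_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (protein_name, sequence) pairs from a FASTA-formatted text stream."""
    name = None
    chunks: List[str] = []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Use the first whitespace-delimited token of the header as the name
            parts = line[1:].split()
            name = parts[0] if parts else ""
            chunks = []
        elif name is not None:
            # Sequence lines may wrap, so collect them and join later
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def proteins_per_peptide(
    fasta_handle: TextIO, peptides: Iterable[str]
) -> Dict[str, List[str]]:
    """Map each peptide to every protein whose sequence contains it."""
    peptide_list = list(peptides)
    hits: Dict[str, List[str]] = {pep: [] for pep in peptide_list}
    for protein_name, sequence in read_fasta(fasta_handle):
        for pep in peptide_list:
            if pep in sequence:  # exact match; no I/L equivalence
                hits[pep].append(protein_name)
    return hits


if __name__ == "__main__":
    # Toy FASTA with hypothetical protein names and sequences
    toy_fasta = io.StringIO(
        ">protA example protein\n"
        "MKFGGPGTASRPSSSRLLK\n"
        ">protB sequence wrapped across two lines\n"
        "MAAAFGGPGTAS\n"
        "RPSSSRQ\n"
        ">protC no match\n"
        "MKTTTT\n"
    )
    print(proteins_per_peptide(toy_fasta, ["FGGPGTASRPSSSR"]))
    # → {'FGGPGTASRPSSSR': ['protA', 'protB']}
```

For a real search, open the FASTA file with `open(path)` in place of the `io.StringIO` object. A nested loop like this is fine for a handful of peptides, but it would be slow for a full result set against a large database.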
I will work on updating the ProteinCoverageSummarizer so that it can read the .tsv file from MzidToTsvConverter directly. That will remove the need to open the .tsv file in Excel, and the program will be able to write a new .tsv file that keeps all of the columns.
Thank you so much @alchemistmatt ! Super helpful!
I have released a version of the Protein Coverage Summarizer that supports reading the .tsv file from MzidToTsvConverter and creating a new file that lists all of the proteins for each peptide. Give this a try: Release v1.3.7608
Relevant processing options to enable:
The new file name will end with _AllProteins.txt
Hi,
I noticed that in the \<PeptideEvidence> section of the MSGF+ output, a peptide is not linked back to all of the proteins that can generate it.
For example, in my database (human), a total of 7 proteins can produce the peptide "FGGPGTASRPSSSR", but only 2 proteins showed up in the \<PeptideEvidence> section for this peptide.
The 2 proteins reported above are TALONT000242860.p1 and TALONT000242936.p1
However, in my database fasta file, many more protein isoforms can also generate this peptide, for example:
I am new to MSGF+. Could you please help me with this?
Thank you so much, Pan