ProteoBench is an open and collaborative platform for community-curated benchmarks for proteomics data analysis pipelines. Our goal is to allow a continuous, easy, and controlled comparison of proteomics data analysis workflows.
I don't think that we parse the ions correctly in FragPipe. In the .toml, I see this:

So do we completely ignore the field Mapped Proteins?

In the combined.tsv table, there are two columns that we need to concatenate to get the protein groups: Proteins and Mapped Proteins. If we only consider Proteins, we only have one accession from the group. This is not what we do for the other pipelines. It clearly overestimates the quantification error, because peptide sequences that match several species will be included in its calculation.

Here is what I get from the FragPipe output file description (https://fragpipe.nesvilab.org/docs/tutorial_fragpipe_outputs.html#combined_iontsv):

"Protein: protein sequence header corresponding to the identified peptide sequence; this will be the selected razor protein if the peptide maps to multiple proteins (in this case, other mapped proteins are listed in the ‘Mapped Proteins’ column)"

So what we need to do is: get the value from the column Proteins, concatenate it with the value of Mapped Proteins, and use this as the protein group identifier. @brvpuyve correct me if I am wrong.

For information, when there are several accessions in Mapped Proteins, they are separated by ",".
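
To make the proposal concrete, here is a minimal sketch of the concatenation, not the current ProteoBench parsing code. The function name build_protein_group, the ";" output separator, and the accessions in the example are assumptions made up for illustration; only the column names and the "," separator inside Mapped Proteins come from the discussion above.

```python
def _clean(value):
    # Treat None, NaN (as produced when reading empty cells) and blanks as "no value".
    return value.strip() if isinstance(value, str) else ""


def build_protein_group(protein, mapped_proteins, in_sep=",", out_sep=";"):
    """Join the razor protein with the other mapped proteins into one group ID.

    in_sep is the separator used inside the Mapped Proteins field;
    out_sep is a hypothetical separator for the resulting group identifier.
    """
    accessions = []
    for field in (_clean(protein), _clean(mapped_proteins)):
        for acc in field.split(in_sep):
            acc = acc.strip()
            # Skip empty pieces and avoid listing the same accession twice.
            if acc and acc not in accessions:
                accessions.append(acc)
    return out_sep.join(accessions)


# Hypothetical rows mimicking the two columns of interest.
rows = [
    {"Proteins": "sp|P1|A_HUMAN", "Mapped Proteins": "sp|P2|B_YEAST, sp|P3|C_ECOLI"},
    {"Proteins": "sp|P4|D_HUMAN", "Mapped Proteins": ""},
]
groups = [build_protein_group(r["Proteins"], r["Mapped Proteins"]) for r in rows]
print(groups)
```

With this, the first row becomes a multi-species group and could be filtered out or handled like the groups from the other pipelines, while the second row keeps its single accession. If the table is loaded with pandas, the same function could be applied row by row.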