julianu closed this 2 months ago
I updated the comment for the initial PR, as there were some further additions to it.
Attention: Patch coverage is 15.38462% with 11 lines in your changes missing coverage. Please review.
Project coverage is 63.97%. Comparing base (6e51896) to head (5d01b6f). Report is 2 commits behind head on main.
Files | Patch % | Lines |
---|---|---|
psm_utils/io/xtandem.py | 15.38% | 11 Missing :warning: |
[edited after adding fix for PSM parsing]
XTandem tends to abbreviate protein names in the protein "label" tag, so the protein name is now read from the "note" tag instead.
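As a rough illustration of that change, here is a minimal sketch using only the standard library. The XML fragment is a simplified, made-up example, not real XTandem output, and it assumes that "label" is an attribute on the protein element while the full name sits in a child "note" element. `protein_name` is a hypothetical helper, not a function from psm_utils.

```python
import xml.etree.ElementTree as ET

# Simplified, made-up fragment (assumption): the "label" attribute is
# truncated, while the <note> child carries the full protein name.
PROTEIN_XML = """
<protein id="1.1" label="sp|P02768|ALBU_HUMAN Serum albu...">
  <note label="description">sp|P02768|ALBU_HUMAN Serum albumin OS=Homo sapiens</note>
</protein>
"""


def protein_name(protein: ET.Element) -> str:
    """Return the <note> text if present, else fall back to the label attribute."""
    note = protein.find("note")
    if note is not None and note.text:
        return note.text.strip()
    return protein.get("label", "")


protein = ET.fromstring(PROTEIN_XML.strip())
full_name = protein_name(protein)
```

Falling back to the label keeps the sketch usable for entries without a note; whether the actual parser needs such a fallback is not stated in this PR.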
While XTandem saves only the highest-scoring PSMs per spectrum, there can still be more than one PSM per spectrum, with different peptidoforms, if their scores are exactly the same. This is not an extremely rare case, especially with near-identical peptides (think of a single AA flip in the sequence). This fix merges the identifications that share a peptidoform into one new PSM, with only the relevant proteins assigned to each PSM. Previously, the parser produced spurious protein-to-peptide matches that did not occur in the databases used by XTandem.
Also, it seems that the remark that only one protein per peptide/PSM is parsed is therefore no longer true.
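The grouping idea can be sketched roughly as follows. This is not the code from the patch: the flat record layout, the field names, and `group_by_peptidoform` are assumptions made for illustration, and plain dictionaries stand in for the real PSM objects.

```python
# Hypothetical flat records: one entry per (protein, peptide) hit that
# was reported for a spectrum. All three hits share the same top score.
identifications = [
    {"spectrum_id": "scan=1", "peptidoform": "PEPTIDEK", "score": 42.0, "protein": "PROT_A"},
    {"spectrum_id": "scan=1", "peptidoform": "PEPTIDEK", "score": 42.0, "protein": "PROT_B"},
    {"spectrum_id": "scan=1", "peptidoform": "PEPITDEK", "score": 42.0, "protein": "PROT_C"},
]


def group_by_peptidoform(records):
    """Merge hits with the same spectrum and peptidoform into one PSM."""
    grouped = {}
    for rec in records:
        key = (rec["spectrum_id"], rec["peptidoform"])
        # Create the PSM on first sight of this peptidoform.
        psm = grouped.setdefault(
            key,
            {
                "spectrum_id": rec["spectrum_id"],
                "peptidoform": rec["peptidoform"],
                "score": rec["score"],
                "protein_list": [],
            },
        )
        # Attach only proteins reported for this exact peptidoform.
        if rec["protein"] not in psm["protein_list"]:
            psm["protein_list"].append(rec["protein"])
    return list(grouped.values())


psms = group_by_peptidoform(identifications)
```

With these inputs the spectrum yields two PSMs: one for `PEPTIDEK` with `PROT_A` and `PROT_B`, and one for `PEPITDEK` with only `PROT_C`, so no protein is attached to a peptide it was not reported for.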