We want some filtering capabilities for peptide characteristics.
From Rui:
Peptides identified in DIA data
Filter out those peptides that contain Cysteine
Filter out those peptides that contain Methionine
Filter out peptides with internal Lysine/Arginine (missed cleavages)
Filter redundancy in charge state
I would now also add that ragged ends (i.e. consecutive cleavage sites: KK, RR, RK and KR) should be avoided. For example, in the protein ADFGHKKEFG we would filter out the peptide ADFGHKK, but also ADFGHK, because trypsin sometimes cuts after the first K and sometimes after the second, which makes even the properly tryptic peptide quantitatively less reliable.
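A minimal sketch of how the ragged-end check could work once the protein sequence and the peptide start site are available. The function names and the 0-based `start` convention are assumptions for illustration, not existing code, and the sketch ignores the proline rule and any enzyme other than trypsin.

```python
CLEAVAGE_RESIDUES = frozenset("KR")  # trypsin cuts C-terminal of K and R


def _is_ragged_site(protein: str, pos: int) -> bool:
    """True if protein[pos] is a cleavage site with an adjacent cleavage site."""
    if not 0 <= pos < len(protein) or protein[pos] not in CLEAVAGE_RESIDUES:
        return False
    return any(
        0 <= n < len(protein) and protein[n] in CLEAVAGE_RESIDUES
        for n in (pos - 1, pos + 1)
    )


def has_ragged_end(protein: str, start: int, length: int) -> bool:
    """Check both peptide boundaries for consecutive cleavage sites.

    `start` is the 0-based index of the peptide's first residue in `protein`
    (hypothetical convention). The N-terminal boundary is the residue just
    before the peptide, the C-terminal boundary is the peptide's last residue.
    """
    n_term_site = start - 1  # -1 at the protein N-terminus, never ragged
    c_term_site = start + length - 1
    return _is_ragged_site(protein, n_term_site) or _is_ragged_site(
        protein, c_term_site
    )
```

With the example protein ADFGHKKEFG, both `has_ragged_end("ADFGHKKEFG", 0, 7)` (ADFGHKK) and `has_ragged_end("ADFGHKKEFG", 0, 6)` (ADFGHK) flag the peptide, which is why the check needs protein context and cannot be done on the peptide sequence alone.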
[x] Acquisition type must be filled in on the dataset, otherwise no data is stored (make a quick method in analysis?)
[ ] Two regex textareas (positive and negative filter), each accepting multiple regexes, e.g. .*[M].*, [A-Z]+[KR][A-Z]+
[x] The ADFGHK case is hard: we need protein context for it, so store the proteins each peptide maps to and the peptide's start site in them; visualize so users can decide?
[x] Store charge state of PSMs
[x] Make sure we can filter on acq.type, LF, isobaric
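The positive/negative regex filter from the checklist could be sketched roughly as below. This assumes one regex per textarea line and whole-sequence matching (`re.fullmatch`), which fits the example patterns above; `parse_textarea` and `filter_peptides` are hypothetical names, not existing code.

```python
import re
from typing import Iterable, List


def parse_textarea(text: str) -> List[re.Pattern]:
    """Compile one regex per non-empty line (assumed textarea format)."""
    return [re.compile(line.strip()) for line in text.splitlines() if line.strip()]


def filter_peptides(
    peptides: Iterable[str],
    include: List[re.Pattern],
    exclude: List[re.Pattern],
) -> List[str]:
    """Keep peptides matching any include pattern and no exclude pattern.

    An empty include list means "keep everything not excluded".
    """
    kept = []
    for seq in peptides:
        if any(p.fullmatch(seq) for p in exclude):
            continue  # negative filter hit, e.g. contains M or internal K/R
        if include and not any(p.fullmatch(seq) for p in include):
            continue  # positive filter given but not satisfied
        kept.append(seq)
    return kept


# Negative filter: methionine, cysteine, internal K/R
exclude = parse_textarea(".*[M].*\n.*[C].*\n[A-Z]+[KR][A-Z]+")
result = filter_peptides(["ADFGHK", "AMDFGHK", "ADKFGHR", "ACDEFK"], [], exclude)
```

Here only ADFGHK survives: the other three contain M, an internal K, and C respectively.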