martinry / proteasy

GNU General Public License v3.0

findProtease function for other organisms #2

Closed: taps99 closed this issue 7 months ago

taps99 commented 1 year ago

I'm trying to use the findProtease function to map a list of peptides to proteases in Klebsiella pneumoniae. However, I keep getting the "No accessions could be mapped" error message. Any idea what I could do? I've already confirmed that each peptide is present in its protein sequence, but the error still occurs.

Thank you.

martinry commented 1 year ago

Hi, would you be able to send an example of your input (peptide sequence + protein accession)?

taps99 commented 1 year ago

I've attempted to use the function with a single peptide sequence and protein accession, as well as the complete list of peptides and protein accession numbers.

Here is an example of one of the peptides in my list with the associated protein accession number:

Peptide: PALEACPQKR
Protein accession number: W1HZS5
Organism: Klebsiella pneumoniae
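For anyone reproducing the "peptide is present in the protein sequence" check, a minimal Python sketch is below. The function name `locate_peptide` is hypothetical, and the protein sequence is a made-up toy string that merely contains the peptide; it is not the real W1HZS5 sequence.

```python
def locate_peptide(protein_seq, peptide):
    """Return 1-based (start, end) of the first match, or None if absent."""
    idx = protein_seq.find(peptide)
    if idx == -1:
        return None
    return idx + 1, idx + len(peptide)


# Toy sequence for illustration only (NOT the real W1HZS5 sequence).
toy_protein = "MKTAPALEACPQKRLLS"
position = locate_peptide(toy_protein, "PALEACPQKR")
print(position)  # (5, 14)
```

The start and end positions mark the two peptide termini, which is where a cleavage would have had to occur.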

martinry commented 1 year ago

The accession is not present in MEROPS, which is the database proteasy uses to find matching cleavage sites. If you have a long list of accessions and you are unsure whether they are present in MEROPS, use the UniProt ID mapping tool (UniProtKB to MEROPS). If they map, they should be valid in proteasy. If you do find a UniProt accession that maps to a MEROPS ID but does not return results in proteasy, let me know the accession and peptide sequences.

Also, if you don't get any hits, there are similar web-based tools that use different or additional cleavage annotation sources: proteasix and topfinder.
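Once the ID mapping results are downloaded, the mapped and unmapped accessions can be separated with a few lines of Python. This is only a sketch: it assumes the export is tab-separated with `From` and `To` columns, the function name `split_by_mapping` is hypothetical, and the `To` value in the example is a placeholder rather than a real MEROPS ID.

```python
import csv
import io


def split_by_mapping(accessions, mapping_tsv):
    """Split accessions into those with a mapping and those without.

    mapping_tsv is assumed to be tab-separated text with 'From' and 'To'
    columns (an assumption about the export layout).
    """
    reader = csv.DictReader(io.StringIO(mapping_tsv), delimiter="\t")
    mapped = {}
    for row in reader:
        # One accession may map to several target IDs.
        mapped.setdefault(row["From"], []).append(row["To"])
    unmapped = [acc for acc in accessions if acc not in mapped]
    return mapped, unmapped


# Hypothetical export; the "To" value is a placeholder, not a real MEROPS ID.
example_tsv = "From\tTo\nW1I2E8\tMEROPS_ID_PLACEHOLDER\n"
mapped, unmapped = split_by_mapping(["W1HZS5", "W1I2E8"], example_tsv)
print(unmapped)  # ['W1HZS5']
```

Accessions that end up in `unmapped` would be expected to trigger the "No accessions could be mapped" message.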

taps99 commented 1 year ago

Of 672 unique accession IDs, only the following 9 mapped to a MEROPS ID. However, I noticed that these IDs all correspond to different proteases. My goal is to take a list of non-tryptic peptides and find which proteases might be responsible for their cleavage.

W1I2E8
W1HTW4
W1HQY8
W1HZN4
W1HW97
W1HYP2
W1HPD6
W1HVH5
W1HT22

martinry commented 1 year ago

You could perhaps try a similarity search for some conserved proteins using BLAST. Select UniProtKB/TrEMBL and, after expanding Other Protein Databases, MEROPS-MP. Enter your accessions and peptide sequences in FASTA format and use blastp. Then review the high-scoring matches in the Functional Predictions tab. I am unsure how reliable results inferred from homology are, but if you think it might be useful, similar functionality could be implemented in proteasy.
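To prepare the FASTA input, a small Python sketch such as the following could be used. The function name `to_fasta` and the 60-character line width are assumptions for illustration; each record simply uses the accession as the header line.

```python
def to_fasta(records, width=60):
    """Format (accession, sequence) pairs as FASTA text."""
    lines = []
    for accession, sequence in records:
        lines.append(f">{accession}")
        # Wrap long sequences to a fixed line width.
        for start in range(0, len(sequence), width):
            lines.append(sequence[start:start + width])
    return "\n".join(lines) + "\n"


fasta_text = to_fasta([("W1HZS5", "PALEACPQKR")])
print(fasta_text, end="")
# >W1HZS5
# PALEACPQKR
```

The resulting text can be pasted into the sequence box of the BLAST form.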