bbuchfink / diamond

Accelerated BLAST compatible local sequence aligner.
GNU General Public License v3.0

Applying Diamond to short peptide searching #737

Open ysbioinfo opened 9 months ago

ysbioinfo commented 9 months ago

Hi Diamond developers, thanks for developing such an awesome tool. I'm using Diamond to analyze mass spec data, all of which consists of very short peptides (usually 8-12 amino acids). I want to find out whether there is a near-perfect alignment (no more than 1 mismatched amino acid) between a short peptide and a protein. However, when I run Diamond in either the default mode or the more-sensitive mode, it reports 0 pairwise alignments, even though some of the short peptides have an exact match in the protein database. I guess either some of my parameters are inappropriate or Diamond is not suited to searching with short sequences. Could you give me some suggestions? Thanks in advance.

Below are two examples: FSVLLHRV exactly matches the protein "sp|P19823|ITIH2_HUMAN", but Diamond did not report it. SLLDKLLQ has only 1 mismatched amino acid relative to the protein "sp|A0PJX8|TMM82_HUMAN", but Diamond did not report it either.
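To confirm independently which hits a search should return, the "no more than 1 mismatch" criterion can be checked by brute force. The following is a minimal sketch, not Diamond's method: it slides the peptide along a protein without gaps and counts mismatches per window. The function name `hamming_hits` and the toy protein string are assumptions made for illustration; the toy string is not the real ITIH2_HUMAN or TMM82_HUMAN sequence.

```python
def hamming_hits(peptide, protein, max_mismatches=1):
    """Return (offset, mismatches) for every ungapped window of `protein`
    that differs from `peptide` in at most `max_mismatches` positions."""
    k = len(peptide)
    hits = []
    for start in range(len(protein) - k + 1):
        window = protein[start:start + k]
        # Count positions where the peptide and the window disagree.
        mismatches = sum(a != b for a, b in zip(peptide, window))
        if mismatches <= max_mismatches:
            hits.append((start, mismatches))
    return hits


# Hypothetical toy protein, NOT a real UniProt sequence.
toy_protein = "MKTAFSVLLHRVGGSLLDKLLE"

print(hamming_hits("FSVLLHRV", toy_protein))  # exact occurrence expected
print(hamming_hits("SLLDKLLQ", toy_protein))  # one-mismatch occurrence expected
```

A scan like this is slow on a full proteome, but for 8-12 residue peptides it gives a ground truth to compare against whatever an aligner reports.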

wook2014 commented 9 months ago

Are #156 and #158 helpful for your problem?

bbuchfink commented 9 months ago

You would have to use very short seeds (like --shape-mask 11111) and also change a number of internal cutoffs (see the advanced options in the command-line help) for this to work.
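One way to see why seed length matters for such short queries is a simplified model in which a hit is only found if the query and target share a run of exactly matching residues at least as long as the seed. This is an illustrative assumption, not a description of Diamond's actual seeding and filtering, which are more involved. The function names below are hypothetical.

```python
def shares_contiguous_seed(query, target, seed_len):
    """True if two equal-length, ungapped sequences share at least one
    run of `seed_len` consecutive identical positions."""
    run = 0
    for a, b in zip(query, target):
        run = run + 1 if a == b else 0
        if run >= seed_len:
            return True
    return False


def seedable_mismatch_positions(peptide_len, seed_len):
    """Positions (0-based) where a single mismatch still leaves an exact
    run of `seed_len` residues on one side of it."""
    return [
        p for p in range(peptide_len)
        if max(p, peptide_len - p - 1) >= seed_len
    ]


# 8-residue peptide, contiguous seed of weight 5 (as in shape mask 11111).
print(seedable_mismatch_positions(8, 5))
# Same peptide with a hypothetical longer contiguous seed of weight 7.
print(seedable_mismatch_positions(8, 7))
```

Under this model, an 8-residue peptide with a single mismatch near its center has no exact run of 5 residues left, so even a weight-5 seed can miss it; longer seeds leave fewer tolerated mismatch positions still.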