lh3 / miniprot

Align proteins to genomes with splicing and frameshift
https://lh3.github.io/miniprot/
MIT License

Question: using Miniprot for searching small peptides #26

Open andreas-wilm opened 1 year ago

andreas-wilm commented 1 year ago

Hi @lh3, many thanks for developing Miniprot!

I was wondering whether Miniprot could also be (ab)used to search for very short sequences/peptides (say, down to 8 aa) in large genomes. On the provided test data it worked: after trimming the protein down to just MADTQYILP, I got a hit with -n 1 -l 4 -k 4. The odd thing is that when I search against a full genome (RefSeq GRCh38.p14 in this case), I can't find any of my control peptides (fragments of myosin or helicases, up to 100 aa long). Is this expected? Would adjusting parameters help, and if so, which ones in addition to the above?
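For reference, the setup described above can be sketched roughly as follows. This is only an illustration, not a recommended workflow: the helper names and the file names genome.fa and peptides.faa are hypothetical, the only options used are the ones quoted above (-n 1 -l 4 -k 4), and the command is built but not executed.

```python
import shlex


def prefix_fragments(protein, lengths):
    """Return the N-terminal prefix of `protein` for each requested length.

    Lengths that are not positive or exceed the protein length are skipped.
    """
    return [protein[:n] for n in lengths if 0 < n <= len(protein)]


def to_fasta(peptides, name="pep"):
    """Format peptides as FASTA text, one record per peptide."""
    return "".join(f">{name}_{len(p)}aa\n{p}\n" for p in peptides)


def miniprot_command(genome, peptides_faa, n=1, l=4, k=4):
    """Build (but do not run) a miniprot call with the options from the post.

    `genome` and `peptides_faa` are hypothetical file names.
    """
    return ["miniprot", "-n", str(n), "-l", str(l), "-k", str(k),
            genome, peptides_faa]


if __name__ == "__main__":
    # MADTQYILP is the trimmed test peptide mentioned in the post.
    peptides = prefix_fragments("MADTQYILP", [8, 9])
    print(to_fasta(peptides), end="")
    print(shlex.join(miniprot_command("genome.fa", "peptides.faa")))
```

Control peptides of other lengths (for example, up to 100 aa from a longer protein) could be generated the same way by passing a longer sequence and a different list of lengths.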

Many thanks, Andreas

PS: I admit, this is a case of abusing the software.