lh3 / miniprot

Align proteins to genomes with splicing and frameshift
https://lh3.github.io/miniprot/
MIT License

How to achieve higher sensitivity? #30

Open zhaijj opened 1 year ago

zhaijj commented 1 year ago

Hi,

I was using miniprot to align one sorghum protein to the maize genome, but no matter how I changed the parameters, only one alignment was returned. With tblastn and default parameters, however, I get 12 alignments.

Could you point out which parameters control the sensitivity of miniprot?

lh3 commented 1 year ago

What have you tried? How many exons does the gene have? Tblastn doesn't do splice alignment.

zhaijj commented 1 year ago

I tried lowering all the gap penalties with the following command:

miniprot -O 1 -J 1 -F 1 -N 1000 --outn=1000

I also tried -S to align without splicing, but it returned the same alignment.

My query gene has only one CDS (two exons, the first of which is UTR).
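
When comparing runs like these, it can help to count how many alignments each query actually gets. The sketch below assumes plain PAF output, where column 1 is the query name; `count_hits` is a hypothetical helper and `demo.paf` is made-up data with only the first few columns filled in, not output from this thread.

```shell
# Sketch: count alignments per query in a PAF file (column 1 = query name).
count_hits() {
    # $1 = PAF file; prints "<query><TAB><count>", sorted by query name
    cut -f1 "$1" | sort | uniq -c | awk '{print $2 "\t" $1}'
}

# Made-up demo data: two hits for protA, one for protB.
printf 'protA\t300\t0\t300\t+\tchr1\nprotA\t300\t0\t280\t-\tchr5\nprotB\t150\t0\t150\t+\tchr2\n' > demo.paf

count_hits demo.paf
```

The same function can be pointed at a real miniprot PAF file to see whether a parameter change produced any additional hits for a given query.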

jfouret commented 4 months ago

Hi,

I managed to get very high sensitivity with the following parameters: -k 3 -L 5 -O 5 -n 1 -N 1000 -l 3 -E 0 -J 5 -F 8 -B 5 --outs 0.5 --outn 1

I think the key parameters that you did not use are -n, -k, -l and, above all, --outs. In your case I suggest also using --outc.

However, in my case, having one and only one mapping per query was a good assumption (I was mapping metapneumovirus proteins onto the Flu 1 genome). I was just doing that to evaluate the tool's capacity to annotate unknown viruses (which is good).
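
For reference, the parameter set quoted above could be assembled into a full command roughly as follows. This is only a sketch: `genome.fa` and `proteins.faa` are placeholder file names, `build_cmd` is a hypothetical helper, and the script prints the command instead of running it so it can be checked first.

```shell
# Option values copied from the comment above; nothing here is tuned further.
SENSITIVE_OPTS="-k 3 -L 5 -O 5 -n 1 -N 1000 -l 3 -E 0 -J 5 -F 8 -B 5 --outs 0.5 --outn 1"

# Print the full miniprot command line without executing it.
build_cmd() {
    # $1 = target genome (FASTA), $2 = query proteins (FASTA); both are placeholders
    printf 'miniprot %s %s %s\n' "$SENSITIVE_OPTS" "$1" "$2"
}

build_cmd genome.fa proteins.faa
```

To actually run it, the printed line can be passed to a shell, for example `build_cmd genome.fa proteins.faa | sh > out.paf`. Note that `--outn 1` keeps a single alignment per query, which matches the one-mapping assumption described above; for the original question, where several hits per query are expected, that value would presumably need to be larger.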