lh3 / miniprot

Align proteins to genomes with splicing and frameshift
https://lh3.github.io/miniprot/
MIT License

aligning to Oxford Nanopore simplex reads (Q16) #28

Closed avilella closed 1 year ago

avilella commented 1 year ago

Hi,

I have some low-coverage ONT simplex reads at about Q16 and a few duplex reads at Q30 from the same Flongle run.

I can align proteins (from the same or a closely related species) to the duplex reads, but some of the simplex reads are longer and would contain hits that are not present in the duplex reads.

Is there a combination of parameters that would allow me to align the proteins (400-500aa) to the simplex reads?

miniprot -G 500 fast5_all.fasta aa.queries.faa

Thanks in advance.

--

Mapping the equivalent DNA sequences (two exons with one small intron) to the simplex reads with minimap2 also yields zero alignments. The same happens when I map only the second exon: no hits.

minimap2 -x map-ont fast5_all.fasta dna.queries.fasta

minimap2 -x map-ont fast5_all.fasta dna.queries.second.exon.fasta

lh3 commented 1 year ago

I would reduce the frameshift penalty via option -F.
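
For illustration, one way to try this suggestion is to rerun the original command with a few progressively lower -F values and compare the hits. This is only a sketch: the values 15, 10 and 5 are arbitrary guesses rather than recommendations from the thread, the `sweep_frameshift` helper and the output file names are hypothetical, and the `.paf` suffix assumes miniprot's default PAF output. Check the usage message of your miniprot build for the actual default of -F before choosing values. The rationale is that indel errors in Q16 simplex reads shift the reading frame, so a lower penalty lets the alignment continue through them.

```shell
#!/bin/sh
# Sketch: sweep the frameshift penalty (-F) over a few lower values.
# DRY_RUN=1 (the default here) only prints the commands; set DRY_RUN=0 to run them.

sweep_frameshift() {
    ref=$1      # reads used as the target, e.g. fast5_all.fasta
    query=$2    # protein queries, e.g. aa.queries.faa
    shift 2
    for f in "$@"; do
        out="miniprot.F${f}.paf"   # hypothetical output name, one file per -F value
        if [ "${DRY_RUN:-1}" = 1 ]; then
            echo "miniprot -G 500 -F $f $ref $query > $out"
        else
            miniprot -G 500 -F "$f" "$ref" "$query" > "$out"
        fi
    done
}

# Illustrative values only; lower means more tolerant of frameshifts.
sweep_frameshift fast5_all.fasta aa.queries.faa 15 10 5
```

A very low penalty will probably also admit more spurious alignments, so it seems sensible to check that hits gained at lower -F values agree with what the Q30 duplex reads show for the same proteins.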