lh3 / minimap2

A versatile pairwise aligner for genomic and spliced nucleotide sequences
https://lh3.github.io/minimap2

hard clipping around large indels, parameter suggestions #960

Open rob234king opened 1 year ago

rob234king commented 1 year ago

I am mapping telomere-containing reads where the reference may have a large indel right at the junction of telomere and non-telomere sequence. The reads are getting hard clipped, but I want the alignments to extend across that indel into the telomere. I have tried asm10 and asm20, and then lowering the gap penalty (-O7,11), which helps for some smaller indels, but the reads still do not map along their full length. Are there any suggestions on which parameters would let the full length of the read map across large indels when one part of the read lies in a repeat region?

version: 2.24-r1122

lh3 commented 1 year ago

How long is the gap? Is it insertion or deletion?

rob234king commented 1 year ago

It can be either, depending on the subtelomere, but typically I see a deletion relative to the reference. The largest is 2 kbp, but most are below 500 bp.
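
One way to put numbers on gap size and clipping for a set of alignments is to summarize the CIGAR strings from the SAM output. The sketch below is a minimal illustration and not part of minimap2: the function name `summarize_cigar`, the 50 bp `min_indel` threshold, and the example CIGAR string are all assumptions made for demonstration.

```python
import re

# One CIGAR operation: a length followed by an op code from the SAM specification.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

def summarize_cigar(cigar, min_indel=50):
    """Report clipped bases at each end and indels of at least min_indel bp.

    min_indel is an arbitrary illustrative threshold, not a minimap2 setting.
    """
    ops = [(int(length), op) for length, op in CIGAR_OP.findall(cigar)]

    # Sum soft (S) and hard (H) clips at the start of the alignment.
    left_clip = 0
    i = 0
    while i < len(ops) and ops[i][1] in "SH":
        left_clip += ops[i][0]
        i += 1

    # Sum soft and hard clips at the end of the alignment.
    right_clip = 0
    j = len(ops) - 1
    while j >= i and ops[j][1] in "SH":
        right_clip += ops[j][0]
        j -= 1

    return {
        "left_clip": left_clip,
        "right_clip": right_clip,
        "large_insertions": [n for n, op in ops if op == "I" and n >= min_indel],
        "large_deletions": [n for n, op in ops if op == "D" and n >= min_indel],
    }

# Hypothetical CIGAR: the read spans a 480 bp deletion, then is hard clipped at its 3' end.
print(summarize_cigar("1200M480D3500M2100H"))
```

Running this over the CIGAR column of the affected reads would show whether the clipped portion starts right where a large deletion or insertion would otherwise be needed.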

lh3 commented 1 year ago

I assume you are using long reads. HiFi or Nanopore? What is the read length? It would also be good to show me a couple of examples.

rob234king commented 1 year ago

Could you please share an email and I'll forward on some more detailed information with screenshots?