marbl / Winnowmap

Long read / genome alignment software

Try to understand the super long soft/hard clipping in the raw reads mapping #33

Open WenyuLiang opened 1 year ago

WenyuLiang commented 1 year ago

Hi! Thanks for this great tool. I mapped raw Nanopore reads to CHM13 and found that many primary alignments have very long soft- or hard-clipped segments in the output. To my understanding, clipped bases are those that can't be mapped to the reference. Clipping of up to about 100 bases would be acceptable, but it is perplexing to see this much difference between the reads and the reference. I extracted the CIGAR strings of the affected alignments with:

grep 'tp:A:P' chm13raw.sam | awk '{print $6}' | grep --color '[HS]'

[Screenshot: Screen Shot 2022-08-14 at 2.06.08 PM]
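
To put numbers on the clipping rather than just highlighting the CIGAR strings, something like the following Python sketch could help. It is only an illustration, not part of Winnowmap: the function names `clip_lengths` and `long_clips` are made up here, and the 100-base cutoff is simply the threshold mentioned in the post. It assumes a standard tab-separated SAM file with the `tp:A:P` tag among the optional fields.

```python
import re

# One CIGAR operation: a length followed by an operator (per the SAM format).
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def clip_lengths(cigar):
    """Return (left_clip, right_clip): total S/H bases at each end of a CIGAR."""
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    left = 0
    for n, op in ops:
        if op not in "SH":
            break
        left += n
    right = 0
    for n, op in reversed(ops):
        if op not in "SH":
            break
        right += n
    return left, right


def long_clips(sam_lines, min_clip=100):
    """Yield (read_name, flag, left_clip, right_clip) for records tagged tp:A:P
    whose clipping at either end exceeds min_clip (hypothetical threshold)."""
    for line in sam_lines:
        if line.startswith("@"):  # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        # 11 mandatory SAM columns, then optional tags
        if len(fields) < 12 or "tp:A:P" not in fields[11:]:
            continue
        left, right = clip_lengths(fields[5])
        if max(left, right) > min_clip:
            yield fields[0], int(fields[1]), left, right
```

It could be run as `for rec in long_clips(open("chm13raw.sam")): print(*rec, sep="\t")`. The FLAG column is included because, per the SAM specification, bit 0x800 marks a supplementary alignment, so it may be worth checking whether the heavily clipped records carry that bit.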