Aligning Nanopore reads against a highly similar reference.

coro1c commented 2 weeks ago

Hello,

We are doing high-throughput cloning of different gene variants and want to move the QC of our cloning products to Nanopore sequencing. I did a trial run with a library of ~400 variants. The variants have large constant parts but differ in two 40 bp stretches.

When I try to align the Nanopore reads from a Flongle (N50: 2.14 kb, avg. Q-score: 18), I get large alignment errors, most likely because the references are so similar. I tried several flags (-U, -f to limit k-mers in common regions, -G to limit gap length to a few bp, ...). However, the alignment either crashes or gives low-quality alignments in the variable regions (in the picture: ~350 and ~1350 bp).

Could you give some advice or recommendations on how best to tackle this alignment problem?

Many thanks in advance.

Best, Marie

lh3 commented 1 week ago

Are you using 400 sequences as the reference? What command line are you using exactly? Why do you think the alignment is wrong, instead of the cloning being wrong?

coro1c commented 1 week ago

Yes, I use 400 sequences as the reference. I am using the default settings of EPI2ME wf-alignment, so I think these are the -x map-ont settings.

I checked some reads manually, and they actually map back to another reference with some indels. So I cannot fully rule out cloning errors, but for all the reads I checked (~20) I found a better-fitting reference. I think the largest problem is actually indels due to the low Q-score of 18.

Many thanks for your help.

lh3 commented 1 week ago

Please run minimap2 independently and provide the exact command line. Also what do you mean by "better"? Is the alignment score higher? Eyeballing doesn't really count.
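
Editor's note: one way to answer the "is the alignment score higher?" question without eyeballing is to run minimap2 on its own, keep secondary alignments, and compare the AS (alignment score) tag per read across candidate references. The sketch below is only an illustration under assumptions: the file names `refs.fa`, `reads.fastq`, and `aln.paf` in the comment are hypothetical, the function names are made up for this example, and the embedded PAF records are invented sample data, not results from this run.

```python
from collections import defaultdict

# Assumed (hypothetical) standalone invocation that would produce such a PAF:
#   minimap2 -cx map-ont --secondary=yes -N 10 refs.fa reads.fastq > aln.paf
# -c adds base-level alignment, so each record carries AS:i (score) and NM:i tags.


def parse_paf_line(line):
    """Return (read, reference, AS, NM) for one PAF record, or None if tags are missing."""
    fields = line.rstrip("\n").split("\t")
    read, ref = fields[0], fields[5]
    tags = {}
    for tag in fields[12:]:  # optional SAM-style tags start at column 13
        key, _type, value = tag.split(":", 2)
        tags[key] = value
    if "AS" not in tags or "NM" not in tags:
        return None
    return read, ref, int(tags["AS"]), int(tags["NM"])


def rank_references(paf_lines):
    """Map each read to (best reference, best AS, AS margin over the runner-up)."""
    hits = defaultdict(list)
    for line in paf_lines:
        if not line.strip():
            continue
        parsed = parse_paf_line(line)
        if parsed is None:
            continue
        read, ref, score, nm = parsed
        # Sort key: higher score first, then fewer mismatches/indels.
        hits[read].append((score, -nm, ref))
    ranked = {}
    for read, candidates in hits.items():
        candidates.sort(reverse=True)
        best_score, _, best_ref = candidates[0]
        margin = best_score - candidates[1][0] if len(candidates) > 1 else None
        ranked[read] = (best_ref, best_score, margin)
    return ranked


# Invented sample records (12 mandatory PAF columns followed by tags).
SAMPLE_PAF = [
    "\t".join(["read1", "2100", "10", "2090", "+", "variant_017", "2150", "5", "2100",
               "2040", "2110", "12", "NM:i:40", "AS:i:3900", "tp:A:P"]),
    "\t".join(["read1", "2100", "10", "2090", "+", "variant_203", "2150", "5", "2100",
               "1990", "2115", "0", "NM:i:95", "AS:i:3650", "tp:A:S"]),
    "\t".join(["read2", "2200", "0", "2200", "-", "variant_088", "2150", "0", "2150",
               "2138", "2150", "60", "NM:i:12", "AS:i:4100", "tp:A:P"]),
]

ranked = rank_references(SAMPLE_PAF)
for read, (ref, score, margin) in sorted(ranked.items()):
    print(read, ref, score, margin)
```

A small margin between the best and second-best reference suggests the read cannot be assigned confidently, whereas a large margin in favor of an unexpected reference points more toward the clone itself than toward the aligner.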

Aligning Nanopore reads against a highly similar reference. #1251