Documenting another fixed bug here post facto.
There was an issue with the pairwise sequence aligner Du Novo was using to align single strand consensus sequences together.
The core issue was that it was considering Ns the same as actual, non-N bases. So it would consider two aligned Ns to be a match just as valuable as two aligned Cs. The end effect is an increased number of indels in regions with high error rates.
Here are some examples: https://docs.google.com/presentation/d/1qeB27_3FfjSN31r9kGQhydx7TiVCNfh3s6-84Iu6O4A/edit?usp=sharing
This was fixed in 272bb4939c (between 2.16 and 3.0) by adding the option to use another pairwise aligner: BioPython with a custom substitution matrix. Later, before the 3.0 release, this was made the default aligner.
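To illustrate why the scoring of Ns matters, here is a minimal sketch of the idea in plain Python. It is not Du Novo's code and does not use BioPython: the function names, the toy sequences, and the scores (match 1, mismatch -1, gap -1, N against anything 0) are all assumptions chosen for the example. It runs a simple Needleman-Wunsch global alignment twice, once treating N like any other base and once treating N as neutral.

```python
def score_naive(a, b):
    # Hypothetical scoring: N is treated like any other base,
    # so N vs N counts as a full match.
    return 1 if a == b else -1


def score_n_aware(a, b):
    # Hypothetical scoring: N carries no information,
    # so any pair involving N is neutral.
    if a == "N" or b == "N":
        return 0
    return 1 if a == b else -1


def global_align(seq1, seq2, score, gap=-1):
    """Needleman-Wunsch with a linear gap penalty (assumed value).

    Returns (best score, aligned seq1, aligned seq2).
    """
    rows, cols = len(seq1) + 1, len(seq2) + 1
    table = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        table[i][0] = i * gap
    for j in range(1, cols):
        table[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            table[i][j] = max(
                table[i - 1][j - 1] + score(seq1[i - 1], seq2[j - 1]),
                table[i - 1][j] + gap,
                table[i][j - 1] + gap,
            )
    # Trace back from the bottom-right cell, preferring the diagonal move.
    out1, out2 = [], []
    i, j = len(seq1), len(seq2)
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and
                table[i][j] == table[i - 1][j - 1] + score(seq1[i - 1], seq2[j - 1])):
            out1.append(seq1[i - 1])
            out2.append(seq2[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and table[i][j] == table[i - 1][j] + gap:
            out1.append(seq1[i - 1])
            out2.append("-")
            i -= 1
        else:
            out1.append("-")
            out2.append(seq2[j - 1])
            j -= 1
    return table[-1][-1], "".join(reversed(out1)), "".join(reversed(out2))


# Toy sequences (made up for this example), each with one N at a different position.
naive = global_align("ANCG", "ACNG", score_naive)
n_aware = global_align("ANCG", "ACNG", score_n_aware)
print(naive)
print(n_aware)
```

With these toy inputs, the naive scoring opens gaps so that the two Ns line up, while the N-aware scoring keeps the sequences ungapped. The real fix relies on BioPython with a custom substitution matrix, whose actual values are not shown here.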