minwordmatches (k-mer pre-filtering) has no effect
A user submitted this case on the vsearch forum:
How can this kind of match be avoided or detected?
When aligning sequences, identical symbols will receive a positive match score (default +2). Aligning a pair of symbols where at least one of them is an ambiguous symbol (BDHKMNRSVWY) will always result in a score of zero.
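The scoring rule described above can be illustrated with a minimal Python sketch. It is not vsearch's actual implementation: it only handles ungapped alignments, the function name `raw_score` is hypothetical, and the mismatch penalty of -4 is an assumed value (only the +2 match score and the zero score for ambiguous symbols come from the text above).

```python
# Illustrative scoring of an ungapped alignment; not vsearch's actual code.
AMBIGUOUS = set("BDHKMNRSVWY")


def raw_score(query, target, match=2, mismatch=-4):
    """Score two equal-length, ungapped aligned sequences.

    match=+2 is the default stated in the text; mismatch=-4 is an assumption.
    """
    if len(query) != len(target):
        raise ValueError("aligned sequences must have the same length")
    score = 0
    for q, t in zip(query.upper(), target.upper()):
        if q in AMBIGUOUS or t in AMBIGUOUS:
            continue  # a pair with any ambiguous symbol contributes zero
        score += match if q == t else mismatch
    return score


print(raw_score("ACGT", "ACGT"))  # 4 identical pairs -> 8
print(raw_score("NNNT", "ACGT"))  # only the final T scores -> 2
```

In the second call, three of the four columns contain an N, so the score stays far below the maximum of 8 that the alignment length would allow.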
So the alignment score should be low when compared to the alignment length for N-rich queries. With the --userout output option, it is possible to access these alignment parameters:
Indeed, the alignment length is 23, the number of matches is 23, and yet the raw score is only 2, indicating an alignment with 21 ambiguous symbols.
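One possible way to detect such hits automatically is to compare the raw score with the maximum score the alignment length would allow. The sketch below assumes the `--userout` file was written with `--userfields query+target+alnlen+ids+raw`, so that each tab-separated row holds those five columns in that order. The function name `flag_low_scoring_hits`, the 0.5 threshold, and the two example rows are hypothetical; only the first row's numbers (23, 23, 2) mirror the case discussed above.

```python
import csv
import io


def flag_low_scoring_hits(handle, match=2, min_fraction=0.5):
    """Yield (query, target) for hits whose raw score is suspiciously low.

    Assumes tab-separated columns: query, target, alnlen, ids, raw.
    min_fraction is an arbitrary, illustrative threshold.
    """
    for query, target, alnlen, ids, raw in csv.reader(handle, delimiter="\t"):
        max_score = match * int(alnlen)  # score if every column were a clean match
        if max_score > 0 and float(raw) < min_fraction * max_score:
            yield query, target


# Made-up rows standing in for a --userout file.
example = io.StringIO("q1\tt1\t23\t23\t2\nq2\tt2\t23\t23\t46\n")
for hit in flag_low_scoring_hits(example):
    print(hit)  # ('q1', 't1')
```

A ratio-based threshold is used here rather than a fixed score cutoff so that short and long alignments are treated alike; a suitable value depends on the data.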