BenLangmead / bowtie

An ultrafast memory-efficient short read aligner

Maximum mismatches allowed is 2 #89

Closed johanzi closed 4 years ago

johanzi commented 5 years ago

The maximum value for -n/--seedmms is 3, but Bowtie only reports hits for sequences with at most 2 mismatches in the seed. I discovered this in v0.12.7, and the problem is still present in the latest release (v1.2.2). I used the command bowtie <genome_prefix> -n 3 -c <28-mers DNA sequence> and gradually increased the number of mismatches in my query sequence. I also tried sequences longer than the default seed length (28), and the problem remains. I think Bowtie is the best fit for finding off-targets in CRISPR/Cas9 sgRNA design (essentially 20-mer sequences), so it would be nice to be able to allow 3 mismatches and be more certain that nothing other than the sequence of interest is hit. Thanks!

ch4rr0 commented 5 years ago

We've confirmed that this is in fact an issue. Unfortunately, we were not able to find a solution in time for the new release. We're still working on it and will publish a new release as soon as the issue is fixed.

ch4rr0 commented 4 years ago

At first look we thought this was an issue, but we discovered that to get bowtie to report reads with 3 mismatches you need to raise -e/--maqerr from its default of 70 to, say, 90.
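
A rough sketch of why 70 admits two mismatches but not three, assuming the behaviour described in the Bowtie manual for -n mode: reads without qualities (such as those given with -c) default to Phred 40, qualities are rounded to the nearest 10 and capped at 30, and the summed quality over mismatched positions may not exceed the -e/--maqerr limit. The function names below are hypothetical, and this is a simplified model rather than Bowtie's actual code.

```python
def maq_rounded(phred: int) -> int:
    """Round a Phred quality to the nearest 10 and cap it at 30.

    Assumption: this mirrors the Maq-style rounding the Bowtie manual
    describes for -n mode (it can be disabled with --nomaqround).
    """
    return min(30, ((phred + 5) // 10) * 10)


def passes_maqerr(num_mismatches: int, maqerr: int = 70, phred: int = 40) -> bool:
    """Return True if the summed mismatch quality stays within the limit.

    Assumption: every mismatched base has the same quality, which is the
    case for reads supplied without qualities (default Phred 40).
    """
    total = num_mismatches * maq_rounded(phred)
    return total <= maqerr


if __name__ == "__main__":
    for n in (2, 3):
        print(n, "mismatches, limit 70:", passes_maqerr(n))
    print(3, "mismatches, limit 90:", passes_maqerr(3, maqerr=90))
```

Under these assumptions each mismatch costs 30, so two mismatches sum to 60 (within 70) while three sum to 90, which is only accepted once the limit is raised to at least 90.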

johanzi commented 4 years ago

Thanks for the update!