marbl / MashMap

A fast approximate aligner for long DNA sequences

fixed dummy window indicator #32

Closed AndreaGuarracino closed 3 years ago

AndreaGuarracino commented 3 years ago

Hi,

We forked MashMap to develop wfmash, an aligner for large sequences designed to accelerate the alignment step in variation graph induction.

While digging into how MashMap works, I ran into what looks like a small bug. If the first k-mer of a sequence is selected as a minimizer, it gets position 0. Because 0 is also the value used as the dummy window indicator, later occurrences of the same k-mer at other positions are not picked as minimizers.
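To make the collision concrete, here is a minimal Python sketch of how a dummy value of 0 can hide repeated minimizers. It is a simplified model, not MashMap's actual code: the function `pick_minimizers`, the `dummy` parameter, and the toy hash values are all hypothetical. It also assumes that a candidate is skipped when its (hash, window position) pair equals the last picked one, which is my reading of the behaviour rather than something confirmed in this thread.

```python
from collections import deque


def pick_minimizers(hashes, window, dummy):
    """Toy sliding-window minimizer selection over precomputed k-mer hashes.

    Each candidate is [hash, wpos, pos]. wpos starts as `dummy` and is
    overwritten with the window id once the candidate is picked.
    """
    queue = deque()  # candidates in the current window, smallest hash at the front
    picked = []      # picked minimizers as (hash, wpos) pairs

    for pos, h in enumerate(hashes):
        window_id = pos - window + 1

        # Drop candidates that slid out of the current window.
        while queue and queue[0][2] <= pos - window:
            queue.popleft()
        # Drop candidates that cannot beat the incoming hash.
        while queue and queue[-1][0] >= h:
            queue.pop()
        queue.append([h, dummy, pos])

        if window_id >= 0:
            front = queue[0]
            # Assumed dedup rule: skip if (hash, wpos) matches the last pick.
            if not picked or picked[-1] != (front[0], front[1]):
                front[1] = window_id
                picked.append((front[0], front[1]))
    return picked


# A CACACA... sequence yields only two alternating k-mer hashes (toy values).
hashes = [1, 2] * 10

print(len(pick_minimizers(hashes, window=2, dummy=0)))   # 1
print(len(pick_minimizers(hashes, window=2, dummy=-1)))  # 10
```

With `dummy=0`, the minimizer picked in window 0 carries the pair (1, 0), and every later occurrence of the same hash still has the untouched dummy position 0, so it looks like a duplicate and is skipped. With a dummy value that can never be a real window id, such as -1, the later occurrences are picked as expected.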

Using this sequence

>CIAO
CACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACACAC

and this command line

mashmap -r CIAO.fasta -q CIAO.fasta -s 500

I get

...
INFO, skch::Sketch::build, minimizers picked from reference = 1
...

on the master branch, and

...
INFO, skch::Sketch::build, minimizers picked from reference = 43
...

with the fix_dummy_window_indicator branch.

Could you please take a look?

cjain7 commented 3 years ago

Good catch!! Thanks for finding and reporting this bug.