ksahlin / strobealign

Aligns short reads using dynamic seed size with strobemers
MIT License

Do more hamming distance attempts? #419

Open ksahlin opened 2 months ago

ksahlin commented 2 months ago

I tested increasing -M (default 20) to 80. This improves accuracy a bit but reduces speed, as expected. We currently try 20 locations regardless of whether hamming or SSW was invoked.

The runtime/accuracy tradeoff may improve if we keep two separate counters, one for SSW and one for hamming. The idea is that we could try more sites when only hamming distance is needed. I propose to keep the threshold at around 10-20 SSW attempts per read, but implement a separate counter and threshold (set to e.g. 80 or 100) for sites where only hamming is invoked.

I think the shortest reads will benefit the most from this.
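
A rough sketch of the two-counter idea, to make the proposal concrete. This is not strobealign's actual code: `Method`, `Budget`, `Tries` and `count_tries` are hypothetical names, and the limits 20 and 100 are just the example values from above. It assumes one possible reading of the proposal, where each candidate site draws from the budget of the method it needs and a site is skipped once that budget is used up.

```cpp
#include <cassert>
#include <vector>

// Hypothetical: which method a candidate site would need.
enum class Method { Hamming, SSW };

// Hypothetical per-read limits; values are the examples from this issue.
struct Budget {
    int max_ssw = 20;       // sites that require SSW
    int max_hamming = 100;  // sites where hamming distance suffices
};

struct Tries {
    int hamming = 0;
    int ssw = 0;
};

// Walk the candidate sites in order and count how many would be tried
// under two independent budgets instead of a single shared -M counter.
Tries count_tries(const std::vector<Method>& sites, const Budget& budget) {
    assert(budget.max_ssw >= 0 && budget.max_hamming >= 0);
    Tries tries;
    for (Method m : sites) {
        // Stop early once neither method has budget left.
        if (tries.ssw >= budget.max_ssw && tries.hamming >= budget.max_hamming) {
            break;
        }
        if (m == Method::SSW) {
            if (tries.ssw < budget.max_ssw) {
                ++tries.ssw;  // an SSW alignment would run here
            }
        } else {
            if (tries.hamming < budget.max_hamming) {
                ++tries.hamming;  // a hamming distance check would run here
            }
        }
    }
    return tries;
}
```

With these example limits, a read whose candidate sites all need only hamming distance would get up to 100 attempts, while the number of SSW calls per read stays capped at 20. Whether to skip or stop when one budget runs out is an open design choice.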