ksahlin / strobealign

Aligns short reads using dynamic seed size with strobemers

Make index creation deterministic #398

Closed: marcelm closed this 4 months ago

marcelm commented 4 months ago

This makes index creation deterministic by ensuring that randstrobes are sorted by all of their fields.

I thought that the comparison function I added in #386 (commit 2e4ff9500e68d6e465735dd276d362cf71851dcd) was good enough. It is not: when I ran on some other datasets that I hadn’t used for testing previously, I got non-reproducible results.
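To illustrate the idea, here is a minimal sketch of a comparison that covers every field. It is not the code from this PR: the `Randstrobe` struct, its field names, and the helper functions are hypothetical stand-ins. The point is that when the comparison looks at only some of the fields, entries that tie on those fields can end up in any relative order, whereas a comparison over all fields leaves exactly one valid sorted order, regardless of the input order or of how the sort is scheduled.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <tuple>
#include <vector>

// Hypothetical stand-in for an index entry; the field names are assumptions.
struct Randstrobe {
    std::uint64_t hash;
    std::uint32_t position;
    std::uint32_t packed;

    bool operator==(const Randstrobe& o) const {
        return std::tie(hash, position, packed) ==
               std::tie(o.hash, o.position, o.packed);
    }
};

// Partial comparison: entries with equal hashes are treated as equivalent,
// so an unstable sort may leave them in any relative order.
inline bool by_hash_only(const Randstrobe& a, const Randstrobe& b) {
    return a.hash < b.hash;
}

// Comparison over all fields: only fully identical entries are equivalent,
// so the sorted sequence is unique.
inline bool by_all_fields(const Randstrobe& a, const Randstrobe& b) {
    return std::tie(a.hash, a.position, a.packed) <
           std::tie(b.hash, b.position, b.packed);
}

// Sorts the same entries from two different starting orders and reports
// whether both runs produce the same sequence.
inline bool sorted_order_is_input_independent() {
    std::vector<Randstrobe> v1{{7, 30, 1}, {7, 10, 2}, {3, 5, 0}, {7, 10, 1}};
    std::vector<Randstrobe> v2(v1.rbegin(), v1.rend());
    std::sort(v1.begin(), v1.end(), by_all_fields);
    std::sort(v2.begin(), v2.end(), by_all_fields);
    return v1 == v2;
}
```

With `by_all_fields`, `sorted_order_is_input_independent()` returns `true`. With `by_hash_only`, the three entries sharing hash 7 are equivalent to the comparator, so the standard gives no guarantee about their final order.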

This slows down index creation again a little bit (it now takes ~61s instead of ~60s for CHM13), but it’s still a lot faster than it was before parallel sorting was introduced.

marcelm commented 4 months ago

Merging, as I need this for further experiments.