dnbaker / dashing2

Dashing 2 is a fast toolkit for k-mer and minimizer encoding, sketching, comparison, and indexing.
MIT License
62 stars, 7 forks

Fix setsketch compression for 32-bit integers. #18

Closed (dnbaker closed 3 years ago)

dnbaker commented 3 years ago

Previously, the b and a parameters for SetSketch compression were coerced to double, which lost precision. By keeping them in long double, we're able to restore accurate distance computation between BagMinHash and ProbMinHash sketches using logarithmic compression, even for 32-bit integers.
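
To illustrate why the parameter type matters, here is a minimal sketch of a logarithmic register mapping of the form floor(1 - log_b(x / a)), clamped to [0, q + 1]. It is not dashing2's actual code: `register_index`, `index_gap`, and the sample values of `a`, `b`, and `q` are hypothetical and chosen only for illustration. With a 32-bit register range, b sits very close to 1, so a small rounding error in b is multiplied by register indices in the billions.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>

// On some platforms (e.g. MSVC) long double has the same precision as double,
// in which case keeping parameters in long double changes nothing.
constexpr bool kLongDoubleIsWider =
    std::numeric_limits<long double>::digits > std::numeric_limits<double>::digits;

// Hypothetical helper (an assumption, not dashing2's API).
// Maps a value x > 0 to the register index floor(1 - log_b(x / a)),
// clamped to [0, q + 1]. T is the floating-point type in which a, b,
// and the logarithms are carried.
template <typename T>
std::uint64_t register_index(long double x, long double a, long double b, std::uint64_t q) {
    assert(x > 0 && a > 0 && b > 1);
    const T ta = static_cast<T>(a);
    const T tb = static_cast<T>(b);  // this cast is where double loses bits of (b - 1)
    const T v = T(1) - std::log(static_cast<T>(x) / ta) / std::log(tb);
    if (!(v > T(0))) return 0;                         // also catches NaN
    if (v >= static_cast<T>(q + 1)) return q + 1;      // saturate at the top register value
    return static_cast<std::uint64_t>(std::floor(v));
}

// Absolute difference between the index computed with double parameters
// and the index computed with long double parameters.
inline std::uint64_t index_gap(long double x, long double a, long double b, std::uint64_t q) {
    const std::uint64_t d = register_index<double>(x, a, b, q);
    const std::uint64_t l = register_index<long double>(x, a, b, q);
    return d > l ? d - l : l - d;
}
```

For example, with the illustrative values `a = 1`, `b = 1 + 1e-9`, and `q = 4294967294` (a 32-bit register range), calling `index_gap(std::exp(-4.0L), a, b, q)` returns a nonzero gap on platforms where `kLongDoubleIsWider` is true: the value of b - 1 rounded to double is off by a relative error of roughly 1e-7, and that error is scaled by an index near 4e9.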