dkoslicki / CMash

Fast and accurate set similarity estimation via containment min hash
BSD 3-Clause "New" or "Revised" License

Figure out what to do with the reverse complements #2

Closed dkoslicki closed 4 years ago

dkoslicki commented 6 years ago

Should the sketch store canonical k-mers, or just standard k-mers? Should canonical or standard k-mers go into the tree? Into the prefilter? And what should be used for prefix searches?
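For reference, a minimal sketch of the two options being weighed. This is illustrative only, not CMash code: the helper names `reverse_complement` and `canonical` are hypothetical, and "canonical" is assumed to mean the lexicographically smaller of a k-mer and its reverse complement, which is the usual convention.

```python
# Illustrative helpers only; these names are not taken from CMash.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(kmer: str) -> str:
    """Complement each base, then reverse the string."""
    return kmer.translate(_COMPLEMENT)[::-1]


def canonical(kmer: str) -> str:
    """Assumed convention: the lexicographically smaller of the
    k-mer and its reverse complement."""
    return min(kmer, reverse_complement(kmer))
```

With canonical k-mers, `AAC` and `GTT` collapse to the same key (`AAC`), so a structure storing them is strand-independent. With standard k-mers, they stay distinct and the reverse strand has to be handled somewhere else.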

dkoslicki commented 4 years ago

This is addressed in c8c91f4d3c7437ac0a9bbda7faf8c7fcb528e488, but I'm leaving the issue open until we have a more complete testing environment (see #14), since so far it has only been checked with local testing.

dkoslicki commented 4 years ago

Closing, since the solution is: sketches contain k-mers, while the query, Bloom filter, and TST all contain rev-comps.
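A rough sketch of one way to read that arrangement, for anyone landing here later. It is an assumption-laden illustration, not the code from the commit above: a plain Python `set` stands in for the Bloom filter and TST, and the function names (`build_lookup`, `query_kmers_with_revcomps`, `matched_sketch_kmers`) are hypothetical.

```python
# Illustration only: a set stands in for the Bloom filter / TST,
# and none of these function names come from CMash.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def revcomp(kmer: str) -> str:
    return kmer.translate(_COMPLEMENT)[::-1]


def build_lookup(sketch_kmers):
    """Lookup structure holding each sketch k-mer and its rev-comp."""
    lookup = set()
    for kmer in sketch_kmers:
        lookup.add(kmer)
        lookup.add(revcomp(kmer))
    return lookup


def query_kmers_with_revcomps(seq: str, k: int):
    """All k-mers of the query sequence, plus their rev-comps."""
    found = set()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        found.add(kmer)
        found.add(revcomp(kmer))
    return found


def matched_sketch_kmers(sketch_kmers, seq: str, k: int):
    """Sketch k-mers (stored as-is) that occur in the query on either strand."""
    query = query_kmers_with_revcomps(seq, k)
    return [kmer for kmer in sketch_kmers if kmer in query]
```

Under these assumptions the sketch itself stays untouched, and a sketch k-mer such as `AAC` is still found when the query only contains `GTT`, because the query side carries both orientations.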