dnbaker / dashing

Fast and accurate genomic distances using HyperLogLog
GNU General Public License v3.0

Weighted Jaccard #25

Closed dnbaker closed 5 years ago

dnbaker commented 5 years ago

This pull request adds streaming weighted set comparisons facilitated by a count-min sketch. They work for all supported data structures, and the CLI exposes them for both sketch and distance.
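For readers unfamiliar with the idea, here is a minimal Python sketch of one way a count-min sketch can turn a stream with repeats into a plain set, so that ordinary set comparison yields a weighted Jaccard estimate. It is an illustration under stated assumptions, not the code in this pull request: the names `CountMinSketch`, `expand_stream`, and `weighted_jaccard` are hypothetical, and an exact Python `set` stands in for the HyperLogLog or other sketch structure.

```python
import hashlib


class CountMinSketch:
    """Small count-min sketch (hypothetical helper, not dashing's class)."""

    def __init__(self, width=1024, depth=4):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, item, row):
        # One salted hash per row; blake2b keeps results stable across runs.
        digest = hashlib.blake2b(
            item.encode(), digest_size=8, salt=row.to_bytes(8, "little")
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def add(self, item):
        """Increment the item's counters and return its estimated count."""
        estimate = None
        for row in range(self.depth):
            i = self._index(item, row)
            self.table[row][i] += 1
            count = self.table[row][i]
            estimate = count if estimate is None else min(estimate, count)
        return estimate


def expand_stream(items):
    """Map each occurrence to an (item, running count) pair.

    An item seen c times contributes (item, 1) ... (item, c), so the
    multiset becomes a set. Counter collisions can overestimate a count,
    which makes the result approximate.
    """
    cms = CountMinSketch()
    return {(item, cms.add(item)) for item in items}


def weighted_jaccard(stream_a, stream_b):
    """Jaccard index of the expanded sets, i.e. sum(min) / sum(max) of counts."""
    set_a, set_b = expand_stream(stream_a), expand_stream(stream_b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


if __name__ == "__main__":
    print(weighted_jaccard(["a", "a", "b", "c"], ["a", "b", "b", "d"]))
```

In a real pipeline the expanded pairs would be hashed into the set sketch rather than stored exactly, so memory stays bounded regardless of stream length.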

This should also be incorporated into the functions that determine the filenames under which sketches are cached, but it is okay for now.