wwood / galah

More scalable dereplication for metagenome assembled genomes
GNU General Public License v3.0

dashing2 instead of dashing1 #16

Open jianshu93 opened 2 years ago

jianshu93 commented 2 years ago

Hello Ben,

I noticed that dashing1 is based on HyperLogLog, so its sketch size can only be a power of 2. Dashing2 (https://github.com/dnbaker/dashing2) implements the Mash and dashing1 algorithms, as well as newer sketching algorithms such as SetSketch and ProbMinHash. However, its output format is still imperfect because it is under rapid development. Personally, I prefer Mash because its estimates correlate very well with BLAST-based ANI and fastANI at sketch sizes of 10^5 or 10^6. What do you think?
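
For illustration, here is a minimal Python sketch of the bottom-s MinHash idea behind Mash, with the Jaccard-to-distance conversion described in the Mash paper. It is a toy under stated assumptions, not Mash's or Dashing2's actual code: the function names are hypothetical, `blake2b` stands in for the hash function Mash really uses, and reverse-complement (canonical) k-mers are ignored. Note that the sketch size `s` can be any positive integer here, unlike a HyperLogLog register count.

```python
import hashlib
import math


def kmer_hashes(seq, k):
    """Return the set of 64-bit hashes of all k-mers in seq.

    Assumption: blake2b is a stand-in hash, and k-mers are not canonicalized.
    """
    hashes = set()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        digest = hashlib.blake2b(kmer.encode(), digest_size=8).digest()
        hashes.add(int.from_bytes(digest, "big"))
    return hashes


def bottom_sketch(seq, k, s):
    """Keep the s smallest distinct k-mer hashes (a bottom-s sketch)."""
    return sorted(kmer_hashes(seq, k))[:s]


def jaccard_estimate(sketch_a, sketch_b, s):
    """Estimate Jaccard similarity from two bottom-s sketches.

    Take the s smallest hashes of the merged sketches and count the
    fraction that occur in both input sketches.
    """
    merged = sorted(set(sketch_a) | set(sketch_b))[:s]
    if not merged:
        return 0.0
    shared = set(sketch_a) & set(sketch_b)
    return sum(1 for h in merged if h in shared) / len(merged)


def mash_distance(j, k):
    """Convert a Jaccard estimate j into a Mash-style distance.

    D = -(1/k) * ln(2j / (1 + j)); D is set to 1.0 when j is 0.
    """
    if j <= 0:
        return 1.0
    return max(0.0, -math.log(2 * j / (1 + j)) / k)
```

With these helpers, `1 - mash_distance(j, k)` gives a rough ANI-like value; a larger `s` lowers the variance of the Jaccard estimate, which is one reason large sketches track ANI more closely.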

Thanks,

Jianshu