dib-lab / 2020-paper-sourmash-gather

Here we describe an extension of MinHash that permits accurate compositional analysis of metagenomes with low memory and disk requirements.
https://dib-lab.github.io/2020-paper-sourmash-gather

What is our estimate of the total number of k-mers in GenBank? #30

Open ctb opened 3 years ago

luizirber commented 3 years ago

I can work on this. I need it to validate the cardinality estimation of the original dataset using scaled MinHash. The cardinality estimate is also necessary for the new containment equation discussed in the theory section (#13).
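For reference, a minimal sketch of how a cardinality estimate can be read off a scaled MinHash sketch: keep every hash below `2**64 / scaled`, then multiply the number of retained hashes by `scaled`. This is an illustration only, not sourmash's implementation. The function names are hypothetical, and `hashlib.blake2b` is an assumed stand-in for the hash function sourmash actually uses.

```python
import hashlib

# Size of the 64-bit hash space assumed in this sketch.
MAX_HASH = 2**64


def hash_kmer(kmer: str) -> int:
    """Map a k-mer to a 64-bit integer (blake2b is a stand-in hash)."""
    digest = hashlib.blake2b(kmer.encode("ascii"), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def scaled_sketch(kmers, scaled: int) -> set:
    """Keep only hashes below MAX_HASH / scaled, i.e. about 1/scaled of them."""
    threshold = MAX_HASH // scaled
    return {h for h in map(hash_kmer, kmers) if h < threshold}


def estimate_cardinality(sketch: set, scaled: int) -> int:
    """Estimate the number of distinct k-mers as |sketch| * scaled."""
    return len(sketch) * scaled
```

With this approach the estimate is always a multiple of `scaled`, and its relative error shrinks as the sketch grows, so a collection as large as GenBank should tolerate a fairly large `scaled` value.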