Open ctb opened 3 years ago
Contrived example where this would be the case: a "metagenome" containing two genomes that have high ANI. The hashing gets "unlucky" and the sketches for the two genomes are identical (or nearly so). As a result, min-set-cov predicts only a single genome.
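A toy sketch of that failure mode, assuming a plain greedy min-set-cov over sets of hashes. The function name `greedy_min_set_cover`, the genome names, and the hash values are all made up for illustration; this is not sourmash's actual gather code.

```python
def greedy_min_set_cover(query, references):
    """Greedy min-set-cov: repeatedly pick the reference that covers
    the most query hashes not yet explained (hypothetical helper)."""
    remaining = set(query)
    chosen = []
    while remaining:
        best_name, best_overlap = None, 0
        # Sorted iteration makes tie-breaking deterministic (first name wins).
        for name in sorted(references):
            overlap = len(remaining & references[name])
            if overlap > best_overlap:
                best_name, best_overlap = name, overlap
        if best_name is None:
            break  # nothing covers the leftover hashes
        chosen.append(best_name)
        remaining -= references[best_name]
    return chosen


# Made-up hash values: the "unlucky" case where both sketches are identical.
sketch_a = {11, 27, 42, 58, 93}
sketch_b = {11, 27, 42, 58, 93}
metagenome = sketch_a | sketch_b

result = greedy_min_set_cover(
    metagenome, {"genome_A": sketch_a, "genome_B": sketch_b}
)
print(result)
```

Once `genome_A` is picked, no hashes are left for `genome_B` to explain, so only one genome is reported even though both are present.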
E.g., two genomes with 99% ANI, each 4.5 Mbp long, are expected (95% confidence interval) to share between 80.8% and 81.1% of their 21-mers.
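A quick sanity check on those numbers, assuming a simple model where each base mutates independently, so a k-mer is shared only if all k of its positions are unchanged. This gives just the point estimate (ANI to the power k); the confidence interval itself needs a variance model that is not reproduced here. The function name is hypothetical.

```python
def expected_shared_kmer_fraction(ani, k):
    """Point estimate of the fraction of k-mers left unmutated,
    assuming independent per-base mutations (simplifying assumption)."""
    return ani ** k


frac = expected_shared_kmer_fraction(0.99, 21)
print(f"{frac:.2%}")
```

The point estimate lands at roughly 81%, inside the quoted interval.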
@dkoslicki's comment on gather from Luiz's thesis: