Hi @luizirber and @ctb,
For estimating the cardinality of the k-mer set of a genome, the maximum likelihood method is preferred because it is more accurate (it actually approaches the theoretical lower bound on estimation error), but it is slower. Would it be possible to also offer the improved estimator from the Ertl 2017 paper (equation 10), which is also implemented in Dashing 1? It is as accurate as the traditional HLL estimator (the traditional HLL estimator needs bias correction for small cardinalities).
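To make the request concrete, here is a minimal Python sketch of the improved estimator as I understand it from the Ertl 2017 paper. It is not the sourmash or Dashing code. The function names (`sigma`, `tau`, `improved_estimate`, `build_registers`) and the register layout (top `p` bits of a `p + q`-bit hash select the register, the remaining `q` bits give the rank) are my own assumptions for illustration.

```python
import math
import random


def sigma(x):
    """Small-range term: x + sum_{k>=1} x^(2^k) * 2^(k-1), for x in [0, 1]."""
    if x == 1.0:
        return math.inf
    y = 1.0
    z = x
    while True:
        x *= x
        z_prev = z
        z += x * y
        y += y
        if z == z_prev:  # series has converged in floating point
            return z


def tau(x):
    """Large-range term: (1 - x - sum_{k>=1} (1 - x^(2^-k))^2 * 2^-k) / 3."""
    if x == 0.0 or x == 1.0:
        return 0.0
    y = 1.0
    z = 1.0 - x
    while True:
        x = math.sqrt(x)
        z_prev = z
        y *= 0.5
        z -= (1.0 - x) ** 2 * y
        if z == z_prev:  # series has converged in floating point
            return z / 3.0


def improved_estimate(registers, p, q):
    """Estimate cardinality from HLL registers holding values in 0..q+1."""
    m = 1 << p
    # Histogram of register values: counts[k] = number of registers equal to k.
    counts = [0] * (q + 2)
    for r in registers:
        counts[r] += 1
    z = m * tau(1.0 - counts[q + 1] / m)
    for k in range(q, 0, -1):
        z = 0.5 * (z + counts[k])
    z += m * sigma(counts[0] / m)
    if z == 0.0:
        return math.inf  # every register is saturated
    alpha_inf = 1.0 / (2.0 * math.log(2.0))
    return alpha_inf * m * m / z


def build_registers(hashes, p, q):
    """Hypothetical helper: fill registers from (p + q)-bit integer hashes."""
    registers = [0] * (1 << p)
    mask = (1 << q) - 1
    for h in hashes:
        idx = h >> q                         # top p bits pick the register
        rest = h & mask                      # low q bits determine the rank
        rank = q - rest.bit_length() + 1     # leading zeros + 1, at most q + 1
        if rank > registers[idx]:
            registers[idx] = rank
    return registers


if __name__ == "__main__":
    random.seed(0)
    # Random 64-bit integers stand in for hashed k-mers.
    hashes = [random.getrandbits(64) for _ in range(100_000)]
    print(improved_estimate(build_registers(hashes, 12, 52), 12, 52))
```

The estimator only needs the histogram of register values, so it is a single pass over the registers plus two short converging series, with no iterative optimization as in the maximum likelihood method and no separate small-range correction step.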
Thanks,
Jianshu