Closed by cjchen5 1 year ago
Hello,
"best_k.sh" helps you find the minimum k-mer size to use when generating read-db.meryl, given the genome size.
The "k-mer threshold" is a cutoff used to select a reliable k-mer subset from read-db.meryl. Merqury determines this threshold automatically from the k-mer histogram of read-db.meryl.
So yes, you need sequencing reads to generate read-db.meryl. Once you know the "best k", prepare the meryl dbs with meryl count using that k, as described in the documentation.
Merqury will do the rest, in most cases, unless the histogram is somewhat unexpected.
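For anyone wondering why the "best k" step needs only the genome size, here is a rough sketch of the calculation as I understand it from the Merqury paper: k = log4(G(1 - p) / p), where G is the genome size and p is the tolerable k-mer collision rate. The function name `best_k`, the default p = 0.001, and the choice to round up are assumptions made for illustration; treat the output of best_k.sh itself as authoritative.

```python
import math


def best_k(genome_size: float, collision_rate: float = 0.001) -> float:
    """Estimate the minimum k-mer size for a genome of the given size.

    Assumed formula: k = log4(G * (1 - p) / p), with G the genome size
    in bases and p the tolerable collision rate. The default p = 0.001
    is an assumption, not a value confirmed in this thread.
    """
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    if not 0 < collision_rate < 1:
        raise ValueError("collision_rate must be between 0 and 1")
    return math.log(genome_size * (1 - collision_rate) / collision_rate, 4)


if __name__ == "__main__":
    # Illustrative genome size of about 3.1 Gbp (roughly human-sized).
    k = best_k(3.1e9)
    # Rounding up to the next integer is a choice made for this sketch.
    print(f"estimated k = {k:.2f}, rounded up to {math.ceil(k)}")
```

Note that no reads appear anywhere in this calculation. The reads are only needed afterwards, when building read-db.meryl with the chosen k.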
Thanks, Arang
Hi, To generate read-db.meryl, I found that the wiki page (https://github.com/marbl/merqury/wiki/1.-Prepare-meryl-dbs) mentions best_k.sh, but this seems to consider only the genome size. However, the paper "Merqury: reference-free quality, completeness, and phasing assessment for genome assemblies" shows:
Does this need sequencing reads? If so, how should I supply the sequencing reads to 'merqury' when I calculate the best k?
Thanks!