schatzlab / genomescope

Fast genome analysis from unassembled short reads
Apache License 2.0

Parameter: Average k-mer coverage for polyploid genome #102

Open manoharbisht1998 opened 1 year ago

manoharbisht1998 commented 1 year ago

Thanks for this awesome tool.

I am working on a tetraploid plant with an estimated genome size of around 1.2 Gbp. When I run GenomeScope on my data with k = 21, ploidy 4, and "Average k-mer coverage for polyploid genome" left at -1 (the default), I get a genome size of 400 Mbp (http://qb.cshl.edu/genomescope/genomescope2.0/analysis.php?code=BnJHp51q3XrTIV0M4SQs). When I run GenomeScope with the same parameters except "Average k-mer coverage for polyploid genome" set to 20, it reports a genome size of 1.13 Gbp, which is close to the estimate (http://qb.cshl.edu/genomescope/genomescope2.0/analysis.php?code=plKuQWfgjEvYuACsvkFX). So my question is: what value should we set for the "Average k-mer coverage for polyploid genome" parameter to get the correct result?

Thank you

tbenavi1 commented 1 year ago

The second plot you linked to above is definitely not correct: the model fits the data poorly. As for the first plot, the estimated size of 300 Mbp from GenomeScope would correspond exactly to a 1.2 Gbp polyploid genome. For example, for a human, GenomeScope will show an estimated size of 3 Gbp, which corresponds to the 6 Gbp diploid genome.
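The arithmetic behind this reply can be written out as a minimal sketch. It assumes the relation implied above, namely that the size GenomeScope reports is for a single copy of the genome and that the full polyploid genome is that size multiplied by the ploidy. The function name `total_genome_size` is hypothetical and is not part of GenomeScope.

```python
def total_genome_size(reported_size_bp: int, ploidy: int) -> int:
    """Scale a per-copy genome size estimate up to the full polyploid genome.

    Assumption: the reported size covers one copy of the genome, so the
    total is simply reported size x ploidy.
    """
    if ploidy < 1:
        raise ValueError("ploidy must be a positive integer")
    return reported_size_bp * ploidy


# Tetraploid plant from this thread: 300 Mbp reported, ploidy 4.
plant_total = total_genome_size(300_000_000, 4)

# Human example from the reply: 3 Gbp reported, ploidy 2.
human_total = total_genome_size(3_000_000_000, 2)

print(f"plant: {plant_total / 1e9:.1f} Gbp")
print(f"human: {human_total / 1e9:.1f} Gbp")
```

Under that assumption, the 300 Mbp estimate and the 1.2 Gbp figure describe the same genome, counted per copy and in total respectively.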

manoharbisht1998 commented 1 year ago

Thank you for the quick and constructive comment. However, it is not clear to me how a haploid genome size of 300 Mbp corresponds to a 1.2 Gbp polyploid genome. According to the C-value database, the 1C value for my species is 1.2 Gbp, which is itself a haploid estimate, so how can a 300 Mbp haploid genome size correspond to it?

Thanks