schatzlab / genomescope

Fast genome analysis from unassembled short reads
Apache License 2.0

How to set ploidy with Genomescope v1 ? #62

Open ptranvan opened 3 years ago

ptranvan commented 3 years ago

Hi, Thanks for your software.

I get a more accurate genome size estimate with GenomeScope v1.

V1:

http://qb.cshl.edu/genomescope/analysis.php?code=S3AEp7xDGl31cq4agjOa

V2

http://qb.cshl.edu/genomescope/genomescope2.0/analysis.php?code=nnC4CPmgLE3605rbyM7y

But V1 automatically assumes the sample is diploid. How can I set it to triploid?

Thanks

mschatz commented 3 years ago

Thanks for your interest. The original GenomeScope (GenomeScope 1) only had support for diploid species. Fortunately, the new GenomeScope 2 has support for higher ploidies:

https://www.nature.com/articles/s41467-020-14998-3

http://qb.cshl.edu/genomescope/genomescope2.0/

Good luck!

Mike

On Fri, Sep 3, 2021 at 7:38 AM Patrick Tran Van @.***> wrote:

Hi, Thanks for your software.

I have more accurate genome size estimation with Genomescope v1. I am using the R package:

Rscript genomescope.R kmcdb_k21.hist

V1: https://ibb.co/YT1CF11

V2 (using the website): https://ibb.co/L9XLs1V

But with V1 it's automatically diploid. How can I set it to be triploid?

Thanks

ptranvan commented 3 years ago

Thanks @mschatz.

Even if I set ploidy=2 in V2, I get a different result from V1. How can we explain the difference? Are there any parameters I can play with?

mschatz commented 3 years ago

GS1 uses a more basic model-fitting procedure that can sometimes get confused, but if you post the links I can help interpret and adjust as needed.

Cheers Mike


ptranvan commented 3 years ago

Thanks for your help.

Here are the links:

V1:

http://qb.cshl.edu/genomescope/analysis.php?code=S3AEp7xDGl31cq4agjOa

V2

http://qb.cshl.edu/genomescope/genomescope2.0/analysis.php?code=nnC4CPmgLE3605rbyM7y

mschatz commented 3 years ago

Thanks for the links. I think the core issue is that GS1 is only designed for diploid samples, but this is clearly a triploid sample.

For both GS1 and GS2, the "genome length" that is reported is the haploid genome length. For example, for human it reports the length as about 3.0 Gbp, even though a diploid cell really contains about 6 Gbp of DNA.

So when GS1 analyzes the peaks, it thinks the first peak (at ~50x) represents heterozygous kmers, the second peak (at ~100x) represents homozygous kmers, and everything beyond that is repeats of various forms. GS2, by contrast, thinks the first (~50x) and second (~100x) peaks both represent heterozygous kmers of different kinds, and that the third peak (~150x) is the homozygous one. So GS2 reports a smaller genome size because it thinks there are fewer repeats and because more of the kmers are heterozygous and hence contribute only fractionally to the haploid genome size.

For this reason, I would have more confidence in the GS2 results. Does that make sense?

Cheers Mike
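
To make the explanation above concrete, here is a back-of-the-envelope sketch in Python. It is not GenomeScope's actual model fit, which also accounts for sequencing errors and repeats. It only assumes that haploid genome length is roughly the total number of k-mers divided by (ploidy × per-copy coverage). The function name and the total k-mer count are hypothetical; the ~50x per-copy coverage is taken from the peak positions discussed in this thread.

```python
def haploid_length(total_kmers: float, per_copy_cov: float, ploidy: int) -> float:
    """Rough haploid genome length under a simplified model (an assumption,
    not GenomeScope's fitting code): total k-mers / (ploidy * per-copy coverage)."""
    if ploidy < 1 or per_copy_cov <= 0:
        raise ValueError("ploidy must be >= 1 and coverage must be positive")
    return total_kmers / (ploidy * per_copy_cov)


# Hypothetical total k-mer count (sum of coverage * frequency over the histogram).
total_kmers = 3.0e10
# The first peak sits at ~50x, so one genome copy contributes ~50x coverage.
per_copy_cov = 50.0

# Diploid reading (GS1): homozygous peak assumed at 2 * 50x = 100x.
diploid_estimate = haploid_length(total_kmers, per_copy_cov, ploidy=2)
# Triploid reading (GS2 with ploidy 3): homozygous peak assumed at 3 * 50x = 150x.
triploid_estimate = haploid_length(total_kmers, per_copy_cov, ploidy=3)

print(f"diploid model:  {diploid_estimate / 1e6:.0f} Mbp")
print(f"triploid model: {triploid_estimate / 1e6:.0f} Mbp")
print(f"ratio: {diploid_estimate / triploid_estimate:.2f}")
```

Under these assumptions the same histogram gives a haploid length 1.5 times larger when read as diploid than when read as triploid, which is the direction of the V1 versus V2 difference described above.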
