Open lychen83 opened 3 years ago
Thanks for your interest. This is a bad fit, which usually happens because the data contain extensive sequencing errors or perhaps contamination. Unfortunately, there is not much that can be done to overcome situations like this other than to collect additional data.
Good luck
Mike
On Sat, Feb 6, 2021 at 12:53 AM lychen83 notifications@github.com wrote:
Dear all,
I have Illumina data (2 × 150 bp), 110 Gb in total. I cleaned it with Trimmomatic. I used GenomeScope to estimate the genome size with kmer = 21. The estimated genome size is about 1.1 Gb. However, the proportion of errors is as high as 2.2 percent. The heterozygosity is as high as 4.65 percent. When I use just 60 Gb of data, GenomeScope reports 'Failed to converge'. I have used GenomeScope for many species and never encountered this problem before.
Why does it have a high proportion of errors?
I appreciate your help.
Best,
Chen
[image: enh_plot] https://user-images.githubusercontent.com/28940942/107110407-8026dd80-6882-11eb-82ee-b0dbd2d9ccf5.png
— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/schatzlab/genomescope/issues/52, or unsubscribe https://github.com/notifications/unsubscribe-auth/AABP34YWSAELZ7XYRQBOJ7DS5TKN7ANCNFSM4XF5WBTA .
Thanks Mike,
Is it possible that the high heterozygosity of my genome caused the high rate of errors?
Best,
Lingyun Chen
I'm sure that is contributing to the problem, but it seems to be more than just high heterozygosity.
Good luck
Mike
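For later readers, here is a rough sketch of why high heterozygosity alone can already disturb a 21-mer profile. It assumes that differences are independent and uniformly spread along the sequence, and it treats the reported 4.65 percent and 2.2 percent as per-base rates. Both assumptions are simplifications and are not necessarily how GenomeScope defines its fitted parameters. The function name `fraction_kmers_unaffected` is made up for this illustration.

```python
def fraction_kmers_unaffected(per_base_rate: float, k: int) -> float:
    """Fraction of k-mers that overlap no variable (or erroneous) position.

    Assumes each base is affected independently with probability
    per_base_rate, which is a simplification of real data.
    """
    return (1.0 - per_base_rate) ** k


K = 21  # k-mer length used in the original post

# Values reported in the thread, treated here as per-base rates (assumption).
HET_RATE = 0.0465
ERROR_RATE = 0.022

het_free = fraction_kmers_unaffected(HET_RATE, K)
err_free = fraction_kmers_unaffected(ERROR_RATE, K)

print(f"21-mers overlapping no heterozygous site: {het_free:.1%}")
print(f"21-mers containing no sequencing error:   {err_free:.1%}")
```

Under these assumptions, well over half of all 21-mers would overlap at least one heterozygous site, so most of the signal moves from the homozygous peak to the heterozygous one. That is consistent with heterozygosity contributing to the poor fit without explaining all of it.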
On Mon, Feb 8, 2021 at 9:49 PM lychen83 notifications@github.com wrote:
Thanks Mike,
Is it possible that the high heterozygosity of my genome caused the high rate of errors?
Best,
Lingyun Chen
Hi Lingyun @lychen83,
I am getting a plot very similar to yours. Did you collect additional data, and did it give you a better plot?
Thanks! Ying
I have a species for which I sequenced 200 Gb for GenomeScope. However, it still failed. I guess the problem might go beyond the data size.
Best, Lingyun
Lingyun Chen
------------------ Original ------------------
From: "schatzlab/genomescope"
Date: Fri, Sep 15, 2023 03:02 AM
Subject: Re: [schatzlab/genomescope] High rate of errors, real Illumina sequencing error? (#52)
Hi Lingyun @lychen83,
I am getting a plot very similar to yours. Did you collect additional data, and did it give you a better plot?
Thanks! Ying
Thank you Lingyun for your reply! That's scary to hear. Did your assembly work?
Thanks again! Ying
Dear all,
I have Illumina data (2 × 150 bp), 110 Gb in total. I cleaned it with Trimmomatic. I used GenomeScope to estimate the genome size with kmer = 21. The estimated genome size is about 1.1 Gb. However, the proportion of errors is as high as 2.2 percent. The heterozygosity is as high as 4.65 percent. When I use just 60 Gb of data, GenomeScope reports 'Failed to converge'. I have used GenomeScope for many species and never encountered this problem before.
Why does it have a high proportion of errors?
I appreciate your help.
Best,
Chen
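As a sanity check on the numbers in this post, the expected coverage can be worked out with a few lines of Python. This is only a back-of-the-envelope sketch. It takes the 1.1 Gb estimate at face value even though it comes from a poor fit, and it assumes untrimmed 150 bp reads. The helper names `base_coverage` and `kmer_coverage` are invented for this example.

```python
def base_coverage(total_bases: float, genome_size: float) -> float:
    """Average sequencing depth: total sequenced bases / haploid genome size."""
    return total_bases / genome_size


def kmer_coverage(base_cov: float, read_length: int, k: int) -> float:
    """Expected k-mer depth: each read of length L yields L - k + 1 k-mers."""
    return base_cov * (read_length - k + 1) / read_length


GENOME_SIZE = 1.1e9  # bp, the estimate quoted in the post (assumed correct here)
READ_LENGTH = 150    # bp, before trimming (trimmed reads will be shorter)
K = 21

for total_gb in (110, 60):
    cov = base_coverage(total_gb * 1e9, GENOME_SIZE)
    kcov = kmer_coverage(cov, READ_LENGTH, K)
    print(f"{total_gb} Gb: ~{cov:.0f}x base coverage, ~{kcov:.0f}x 21-mer coverage")
```

If the genome size estimate is anywhere near correct, both the 110 Gb and the 60 Gb data sets are deep by this measure, which fits the later comment in this thread that the problem might go beyond the amount of data.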