Open sjfleck opened 11 months ago
I agree the plots are confusing. I'm guessing everything between 20x and 100x represents some level of heterozygosity, while the major peak at 150x represents homozygous k-mers. I agree it would help to run Purge Haplotigs, and I would start with values around -l 100 -m 200 -h 500, but you should try several values to see how they affect the BUSCO score. Fortunately, Purge Haplotigs should only take a few minutes to run.
Good luck! Mike
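One way to organize the "try several values" advice is to generate a small grid of cutoffs around the suggested starting point and print one command per combination. This is only a sketch: the grid values, the input file name `aligned.bam.gencov`, and the output naming scheme are placeholders chosen for illustration, and the `cov` subcommand with `-i`, `-l`, `-m`, `-h`, `-o` is written from memory of the Purge Haplotigs documentation, so check it against `purge_haplotigs cov` help before running anything.

```python
from itertools import product


def cutoff_grid(low=(80, 100, 120), mid=(180, 200, 220), high=(450, 500, 550)):
    """Yield (l, m, h) cutoff triples, keeping only those with l < m < h.

    The default values are illustrative guesses centred on the
    -l 100 -m 200 -h 500 starting point suggested in the reply.
    """
    for l, m, h in product(low, mid, high):
        if l < m < h:
            yield l, m, h


def cov_command(l, m, h, gencov="aligned.bam.gencov"):
    """Build (but do not run) a coverage-cutoff command string.

    The file names are placeholders; the flags are assumed from the
    Purge Haplotigs docs and should be verified locally.
    """
    out = f"coverage_stats_l{l}_m{m}_h{h}.csv"
    return f"purge_haplotigs cov -i {gencov} -l {l} -m {m} -h {h} -o {out}"


if __name__ == "__main__":
    # Print one command per cutoff combination for review before running.
    for l, m, h in cutoff_grid():
        print(cov_command(l, m, h))
```

Each resulting coverage file would then go through the purge step and a BUSCO run, so the duplicated-BUSCO percentage can be compared across cutoff choices.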
On Mon, Oct 30, 2023 at 2:04 PM sjfleck wrote:
[image: Ua_GS2_SP] https://user-images.githubusercontent.com/53409202/279149128-e3b40098-4a4f-4fb1-8554-4118668d3574.png
Hello, thank you for GenomeScope2. It's been a useful tool along with SmudgePlot. I have one GenomeScope profile where I'm not sure the model is accurate. The 1n hump isn't there in the observed data, but it is in the full model and unique sequences. There are clear 2n and 4n humps, and SmudgePlot proposed this as a tetraploid. I will share the plots below:
I'm checking in on this because this species has 53% duplicated BUSCOs and I was planning on running Purge Haplotigs on it to reduce it to a haploid assembly (as long as it's not already one). If there is a heterozygous peak, it should be ~72, but I'm not seeing one to designate for Purge Haplotigs to work on. Any insights into this or recommendations would be greatly appreciated. Thank you.
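For reference, the ~72 figure above implies where the other peaks of a tetraploid profile would be expected, since each sits near an integer multiple of the 1n coverage. A minimal sketch of that arithmetic, assuming the ~72x estimate is right (the function name is made up for illustration, and 72 is an eyeballed value, not a fitted one):

```python
def expected_peaks(monoploid_cov, ploidy=4):
    """Return expected k-mer coverage peaks as multiples of the 1n coverage."""
    return {f"{n}n": n * monoploid_cov for n in range(1, ploidy + 1)}


peaks = expected_peaks(72)
print(peaks)  # {'1n': 72, '2n': 144, '3n': 216, '4n': 288}
```

Under that assumption the 2n peak would land near 144x, which is roughly where the major peak appears in the plot.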