schatzlab / genomescope

Fast genome analysis from unassembled short reads
Apache License 2.0

Should a larger K value be chosen? #133

Open GLking123 opened 4 months ago

GLking123 commented 4 months ago

Hello,

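When I used a k-mer size of 21, the GenomeScope plot was as follows: image (https://github.com/schatzlab/genomescope/assets/71629239/e82442a7-bf98-42dc-a045-a7be405e5de9)

When I used a k-mer size of 27, the GenomeScope plot was as follows: image (https://github.com/schatzlab/genomescope/assets/71629239/055b224c-c15f-4bb3-9701-0506ae48f405)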

Flow cytometry estimated the genome size to be around 50 Gb, and the assembled genome is also around 50 Gb. However, this is significantly different from the GenomeScope estimates in the plots above. Should I increase the k-mer size, for example to 31?

For the above question, could you provide some debugging suggestions? Thank you for your valuable time and assistance. I sincerely look forward to your response!

mschatz commented 3 months ago

GenomeScope reports the haploid genome size, but it looks like you have a tetraploid, so the estimated genome size here needs to be multiplied by 4 to get the total DNA content. For example, for a human sample it will report about 3 Gbp for the genome size, and this needs to be multiplied by 2 to reach the total content. Apart from that, you may need to adjust the k-mer counting to account for the very high frequency k-mers. The histogram often gets truncated at 1,000x or 10,000x, but you will need to push this out to 100,000x or higher to capture the most abundant repeats.

Good luck!

Mike
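
The two points in the reply above can be illustrated with a small Python sketch. It is only a back-of-the-envelope illustration, not GenomeScope's actual model fit: the function names, the toy histogram, the 20x haploid peak, and the 12.5 Gbp haploid figure are all made-up assumptions chosen for the example. It treats genome size as total k-mer occurrences divided by the haploid coverage peak, and it models truncation by simply dropping k-mers above the cutoff.

```python
def total_dna_content(haploid_size_bp: float, ploidy: int) -> float:
    """Scale a haploid genome size estimate to total DNA content."""
    return haploid_size_bp * ploidy


def naive_genome_size(histogram: dict, haploid_peak: int, max_count: int) -> float:
    """Naive haploid size: total k-mer occurrences / haploid peak coverage.

    `histogram` maps a k-mer count to the number of distinct k-mers seen
    that many times. K-mers with a count above `max_count` are ignored,
    which mimics a histogram truncated at that value (a simplification).
    """
    total_occurrences = sum(
        count * n_kmers
        for count, n_kmers in histogram.items()
        if count <= max_count
    )
    return total_occurrences / haploid_peak


# Hypothetical toy histogram (assumption, not real data).
HAPLOID_PEAK = 20
toy_histogram = {
    20: 1_000_000,   # single-copy sequence at the haploid peak
    2_000: 5_000,    # moderate repeats (~100 copies each)
    50_000: 400,     # very abundant repeats (~2,500 copies each)
}

# Ploidy scaling: the human example from the reply, then a hypothetical
# tetraploid whose haploid estimate would be 12.5 Gbp.
print(f"human total: {total_dna_content(3e9, 2):,.0f} bp")
print(f"tetraploid total: {total_dna_content(12.5e9, 4):,.0f} bp")

# Effect of the truncation cutoff on the naive size estimate.
for cutoff in (1_000, 10_000, 100_000):
    size = naive_genome_size(toy_histogram, HAPLOID_PEAK, cutoff)
    print(f"cutoff {cutoff:>7,}x -> {size:,.0f} bp")
```

With this made-up histogram the naive estimate grows from 1.0 Mbp at a 1,000x cutoff to 1.5 Mbp at 10,000x and 2.5 Mbp at 100,000x, which is why a highly repetitive genome can look much smaller than expected when the high-frequency k-mers are cut off.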
