schatzlab / genomescope

Fast genome analysis from unassembled short reads
Apache License 2.0

Genomescope does not converge even with a clear peak #73

Open bjarnebartlett opened 2 years ago

bjarnebartlett commented 2 years ago

Hello,

I ran GenomeScope on a plant genome, but the model did not converge even though a distinct peak is present. Is there a solution for this?

The full results are here: http://genomescope.org/analysis.php?code=ARAt98SyZaNnWCd8WubG

[image: plot log]

Attached histogram: jellyfish.kmers.25.asm.fa.histo.txt

mschatz commented 2 years ago

Hi Bjarne,

GenomeScope doesn't handle these very high coverage datasets well: it gets confused about which k-mers are errors and which come from the genome. Can you try downsampling so the main peak is at about 50x coverage? I also noticed your k-mer histogram is truncated at 10,000x, which will skip high-frequency k-mers and probably underestimate the genome size. Can you try boosting the max k-mer coverage? (You will need to rerun the histo step of jellyfish.)

Good luck!

Mike
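
The downsampling advice above can be turned into a rough calculation. The following is a minimal sketch, not part of GenomeScope or jellyfish: it assumes the two-column histogram that `jellyfish histo` writes (coverage, then number of distinct k-mers), and the helper names `peak_coverage` and `downsample_fraction` and the minimum-coverage cutoff are hypothetical choices made for illustration.

```shell
# Sketch: estimate what fraction of the reads to keep so that the main
# k-mer peak lands near a target coverage (about 50x in the advice above).
# Assumption: the histogram has two whitespace-separated columns,
# "coverage count", one row per coverage value.

# Print the coverage with the highest count, ignoring coverages below a
# cutoff. The cutoff is a hypothetical way to skip the low-coverage error
# k-mers; it has to be chosen by looking at the plot.
#   $1 = histogram file, $2 = minimum coverage to consider
peak_coverage() {
  awk -v min="$2" '$1 >= min && $2 > best { best = $2; peak = $1 }
                   END { print peak }' "$1"
}

# Print the fraction of reads to keep, capped at 1 (no downsampling needed).
#   $1 = observed peak coverage, $2 = target coverage
downsample_fraction() {
  awk -v peak="$1" -v target="$2" \
    'BEGIN { f = target / peak; if (f > 1) f = 1; printf "%.2f\n", f }'
}

# Example usage (file name is a placeholder):
#   peak=$(peak_coverage reads.histo 10)
#   downsample_fraction "$peak" 50
```

The resulting fraction is only a starting point; the histogram still has to be regenerated from the downsampled reads and checked by eye. The truncation problem is separate and is addressed by rerunning the histo step with a larger maximum coverage.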

kanishka12191 commented 1 year ago

Hi,

I ran GenomeScope on an insect genome, but the model did not converge even though a distinct peak is present. Could you please suggest how to resolve this?

I used the following commands to generate the .histo file:

jellyfish count -m 21 -s 100M -t 10 -C <(zcat 1P_CFM_S70_L004_R1_001.fastq.gz) <(zcat 1P_CFM_S70_L004_R2_001.fastq.gz)

jellyfish histo -t 10 --high=1000000 mer_counts.jf > reads.histo

The full results are here: http://genomescope.org/analysis.php?code=VRogN5hqRd0metbGFxKc

K-mer length 21, read length 100, and max k-mer coverage 10000.

mschatz commented 1 year ago

The main peaks are still over 100x. You will need to downsample the reads and rerun jellyfish. The easiest way to do this is to skip the R2 reads, but you might need to go to even lower coverage. For this you can use 'wc -l .fq' to count the number of lines in the file (each FASTQ read takes 4 lines), divide this number in half, and then run 'head -HALF .fq', keeping HALF a multiple of 4 so the last read is not cut off. For example, if the R1 file has 1,000,000 lines you could run 'head -500000 r1.fq > r1.sample.fq'.

Hope this helps

Mike
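
The `wc -l` / `head` recipe above can be wrapped in a small helper so the line count always ends on a read boundary. This is a minimal sketch under stated assumptions: the input is an uncompressed FASTQ file with exactly 4 lines per read (no wrapped sequence lines), and the function name `head_half_fastq` is hypothetical.

```shell
# Sketch: keep roughly the first half of the reads in a FASTQ file.
# Assumption: uncompressed FASTQ, exactly 4 lines per read.
#   $1 = input FASTQ, $2 = output FASTQ
head_half_fastq() {
  fq_in="$1"
  fq_out="$2"
  total_lines=$(wc -l < "$fq_in")      # wc -l counts lines, not reads
  total_reads=$((total_lines / 4))     # 4 lines per FASTQ read
  keep_lines=$(( (total_reads / 2) * 4 ))  # half the reads, whole records only
  head -n "$keep_lines" "$fq_in" > "$fq_out"
}

# Example usage (file names are placeholders):
#   head_half_fastq r1.fq r1.sample.fq
```

For gzipped input, the same idea works by piping `zcat` into `head -n` with a line count that is a multiple of 4. Taking the first reads in a file is not a random sample; if that matters, a subsampling tool such as `seqtk sample` is one alternative. Either way, jellyfish count and histo need to be rerun on the downsampled reads.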
