Closed malvaradol closed 6 months ago
I also forgot to mention that the size of the genome estimation for both graphs is very different to what's been reported. The genome size reported spans between 925Mbp and 1.05Gb as this organism has multiple cytotypes.
There is something unusual about your histogram file - all of the odd kmer counts have a value of zero:
$ head reads.histo.txt
1 0 2 273466582 3 0 4 101884369 5 0 6 119051792 7 0 8 135336019 9 0 10 142023371 11 0 12 138532293 13 0 14 125153138 15 0 16 106433147 17 0 18 85885224 19 0 20 68523317 21 0 22 56314167 23 0 24 50562698 25 0 26 50043002 27 0 28 52767577 29 0 30 56574373
Im guessing there was a technical error such as your reads file was accidentally duplicated so that every read occurs twice - this will take kmers that should occur once and make them occur twice, kmers that should occur twice and make them occur 4 times, etc. Can you double check your input files?
Good luck!
Mike
On Thu, Apr 18, 2024 at 11:45 AM Mateo Alvarado @.***> wrote:
I also forgot to mention that the size of the genome estimation for both graphs is very different to what's been reported. The genome size reported spans between 925Mbp and 1.05Gb as this organism has multiple cytotypes.
— Reply to this email directly, view it on GitHub https://github.com/schatzlab/genomescope/issues/127#issuecomment-2064301336, or unsubscribe https://github.com/notifications/unsubscribe-auth/AABP345G5ONB54G6BRRDRVLY57TABAVCNFSM6AAAAABGNRC4E6VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDANRUGMYDCMZTGY . You are receiving this because you are subscribed to this thread.Message ID: @.***>
Hi Mike,
You are definitely right! I checked other issues to see what could have been happening, and I had the same problem where transforming a .bam file to .fastq duplicated my sequences for no reason. As suggested by the person in the issue, transforming the files with pbtk did the job.
Thanks for your response.
Hi there,
I was wondering if I could get some help with figuring out what is going on here. I'm running genomescope on a dataset of HiFi reads of ~60X coverage, here's the script I'm using:
I already tried to change the -cx value from 10.000 to
-cx100000
, but it did not provide any changes. Here's the plot I get:Also, I tried with 21 kmers, here's the plot:
Here are the .histo files if they can be of any help:
reads.histo.txt reads_21.histo.txt
Appreciate any help!!