Open xiekunwhy opened 8 months ago
Hi,
For a diploid genome, I ran findGSE(histo="Cfl.histo", sizek=21, outdir="hom_test_21mer62", exp_hom = 62). Which of the following numbers is the haploid genome size?
size_all 2831278311
size_exl 2762932750
size_cat 3063218222
size_fit 2276505453
size_cor2 4239285369
Het_rate 0.00913753 0.00913753
Est. ratio of repeats 0.88225222
Final k-mer cov 36.5624931
Best, Kun
Hi, can you share the pdf?
here is the pdf file v1.94.est.Cfl.histo.sizek21.curvefitted.pdf
The current result does not seem correct. Can you set exp_hom = 70 and rerun?
You can also share the histo file with me, if that is okay.
Thank you for your reply.
Here are the exp_hom = 70 results, v1.94.est.Cfl.histo.sizek21.curvefitted.pdf
and the histo file is here, Cfl.zip
Best, Kun
The histogram looks a bit "weird".
Do you know if the species is diploid or polyploid? I am asking because the histogram has one peak at 15x and another at 56x, and its tail also has high y-values, which could come from repeats or from higher ploidy.
To me, it does not look like a diploid, but more likely a tetraploid.
I can only tell the full genome size is around 10 Gb. The haploid genome size would be 10 Gb / n, where n is the ploidy which you need to figure out.
Another explanation could be that this is a mixture of different DNA material: maybe there was contamination in the sequenced DNA.
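As a rough illustration of the reasoning above (this is not part of findGSE), the following Python sketch locates local maxima in a k-mer histogram and divides a full genome size by a candidate ploidy. The function names, the `min_cov` cutoff, and the toy histogram are assumptions made up for the example; a real histogram is noisy and would usually need smoothing before peak detection.

```python
def find_peaks(counts, min_cov=5):
    """Return coverages that are strict local maxima of a k-mer histogram.

    counts[i] is the number of distinct k-mers seen i times (index 0 unused).
    min_cov skips the low-coverage error spike (an assumed cutoff).
    """
    peaks = []
    for cov in range(max(min_cov, 1), len(counts) - 1):
        if counts[cov] > counts[cov - 1] and counts[cov] > counts[cov + 1]:
            peaks.append(cov)
    return peaks


def haploid_size(full_size_bp, ploidy):
    """Haploid genome size as the full genome size divided by the ploidy."""
    return full_size_bp / ploidy


# Toy histogram (made-up numbers) with triangular bumps at 15x and 56x.
toy = [0] * 80
for center, height in [(15, 900), (56, 400)]:
    for d in range(-3, 4):
        toy[center + d] = height - 100 * abs(d)

print(find_peaks(toy))  # [15, 56]

# A full genome size of about 10 Gb under two candidate ploidies.
for n in (2, 4):
    print(n, haploid_size(10e9, n))
```

With a full size of about 10 Gb, a diploid reading gives roughly 5 Gb per haploid set and a tetraploid reading roughly 2.5 Gb, so the ploidy has to be settled independently before a haploid size can be reported.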
Thank you for your help.
There are two states of this species, diploid and tetraploid.
Smudgeplot and karyotype analysis told me that the sample we are analyzing is diploid. Maybe there is contamination in the sequenced DNA.
Here are the Smudgeplot results: smudgeplot_verbose_summary.txt
Best, Kun
Hi,
I also wanted to know how to interpret the results and which number is the "real" genome size. Here is the pdf file. findGSE-PSR.pdf
Thank you for your help.
I would not trust the k-mer-based ploidy estimate in this particular case, because the peak at 15x has been treated as errors; I do not know what method underlies this determination.
You can try
This is a homozygous species, so you do not need to set exp_hom. The last row gives the haploid genome size.
Hi,
I ran this command on a genome whose size and ploidy level I don't know: findGSE(histo = "/Users/icesim/Downloads/21mer_no_cut-2.histo", sizek=21, outdir="/Users/icesim/Desktop/findGSE-teleau", exp_hom = 100). The expected result was around 8 Mb, so does findGSE give an estimate of the whole genome size or of the haploid genome size?
Thanks for your help, findGSE.pdf
Hi,
It gives a haploid genome size estimate.
According to the k-mer coverage pattern, you may want to run it under homozygous mode.
Best, Hequan
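To make the haploid versus whole-genome distinction concrete, here is a minimal Python sketch of the textbook approximation: total k-mer count divided by the coverage of a chosen peak. This is not findGSE's method (findGSE fits a curve to the histogram); the function name, the `min_cov` error cutoff, and the toy numbers are assumptions for illustration only.

```python
def naive_genome_size(counts, peak_cov, min_cov=1):
    """Approximate genome size as total k-mers divided by a peak coverage.

    counts[i] is the number of distinct k-mers seen i times.
    k-mers below min_cov are ignored as likely sequencing errors.
    """
    total_kmers = sum(cov * n for cov, n in enumerate(counts) if cov >= min_cov)
    return total_kmers / peak_cov


# Toy case (made-up): 1000 distinct k-mers, all observed 40 times.
toy = [0] * 50
toy[40] = 1000

# Dividing by the homozygous peak coverage (40x here) yields a haploid size.
print(naive_genome_size(toy, peak_cov=40))  # 1000.0

# Dividing by half that coverage, as if 20x were the per-haplotype peak,
# doubles the figure, i.e. it counts both haplotypes of a diploid.
print(naive_genome_size(toy, peak_cov=20))  # 2000.0
```

Which peak is treated as the homozygous one therefore decides whether the reported number reads as a haploid or a whole-genome size, which is why the expected homozygous coverage matters for heterozygous samples.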
Thank you for your quick answer,
I tried to run it under homozygous mode, but I get the following error in R: "Error in singlestart:singleend : NA/NaN argument"
Does it mean I have no choice but to run it under heterozygous mode ?
Can you show me the cmd?
I ran this command : findGSE(histo = "/Users/icesim/Downloads/21mer_no_cut-2.histo", sizek=21, outdir="/Users/icesim/Desktop/findGSE-teleau")