gmarcais / Jellyfish

A fast multi-threaded k-mer counter

estimate heterozygosity from hist #127

Open dcopetti opened 6 years ago

dcopetti commented 6 years ago

Hello, I wonder whether it is possible to estimate the heterozygosity present in my sequencing reads from a k-mer distribution like this: [image: k-mer count histogram]

It is from raw Illumina reads of a heterozygous genome, and it looks like many regions are allele-specific (peak at 37) while fewer are shared between alleles. Knowing the genome size would be useful too (I have an estimate from the SOAPdenovo tool), but mainly I would like to know whether there is a way to quantify how much heterozygosity my sample has. This data has produced an assembly that is mostly diploid, so mapping the reads back will not show SNPs. Thanks,

Dario
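
To make the question concrete, here is a rough sketch of the kind of calculation I have in mind. It is not a Jellyfish feature and not a validated method. It assumes the histogram is the two-column text written by `jellyfish histo` (multiplicity, number of distinct k-mers), that the allele-specific peak sits at 37 and the shared peak at about twice that, and that heterozygosity comes from isolated, independent SNPs with no repeats or sequencing errors inside the peak windows. The function names, the window boundaries, and the k-mer length `k` are my own assumptions.

```python
def parse_histo(text):
    """Parse two-column 'multiplicity count' lines into a dict."""
    hist = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 2:
            hist[int(fields[0])] = int(fields[1])
    return hist


def rough_estimates(hist, het_peak, k):
    """Return (per-base heterozygosity, haploid genome size), both very rough.

    Assumed windows (a simplification, not a fitted model):
      allele-specific k-mers: multiplicity in (0.5, 1.5] x het_peak
      shared k-mers:          multiplicity in (1.5, 2.5] x het_peak
    """
    lo = het_peak // 2 + 1          # below this, treat k-mers as errors
    mid = (3 * het_peak) // 2       # boundary between the two peaks
    hi = (5 * het_peak) // 2        # above this, treat k-mers as repeats

    n_het = sum(n for m, n in hist.items() if lo <= m <= mid)
    n_hom = sum(n for m, n in hist.items() if mid < m <= hi)

    # Each heterozygous position contributes one k-mer per allele, so half
    # of the allele-specific k-mers correspond to haploid positions.
    haploid_positions = n_hom + n_het / 2
    frac_het_kmers = (n_het / 2) / haploid_positions

    # Under the independent-SNP assumption a k-mer is shared with
    # probability (1 - r)^k, so r = 1 - (1 - frac_het_kmers)^(1/k).
    het_rate = 1 - (1 - frac_het_kmers) ** (1 / k)

    # Total k-mer occurrences divided by the shared (2x) peak coverage.
    total = sum(m * n for m, n in hist.items() if m >= lo)
    genome_size = total / (2 * het_peak)
    return het_rate, genome_size
```

With a hypothetical file name and k-mer length, this would be called as `rough_estimates(parse_histo(open("reads.histo").read()), het_peak=37, k=21)`. The hard cut-offs between peaks are the weakest part, since the two peaks overlap in real data, so I would treat the result only as an order-of-magnitude figure.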