evotools / hapbin

Efficient program for calculating Extended Haplotype Homozygosity (EHH) and Integrated Haplotype Score (iHS)
GNU General Public License v3.0

Provide unnormalized XPEHH #27

Open hardingnj opened 8 years ago

hardingnj commented 8 years ago

From Sabeti 2007:

"We normalize the XP-EHH logratio such that the set of all such logratios has zero mean and unit variance. We denote these normalized XP-EHH logratios by 'XP-EHH scores'."

Would it be possible to output both? If you split your analysis by contig/chromosome, it's helpful to perform the normalization over all contigs, not by contig.
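To illustrate the point about normalizing over all contigs, here is a minimal sketch in plain Python. It assumes the unnormalized XP-EHH log-ratios have already been read into a dict keyed by contig; the function name and data layout are hypothetical and not part of hapbin.

```python
from statistics import fmean, pstdev


def normalize_across_contigs(per_contig):
    """Z-score XP-EHH log-ratios using one genome-wide mean and standard deviation.

    per_contig: dict mapping contig name -> list of unnormalized log-ratios
                (hypothetical layout, not hapbin's output format).
    Returns a dict with the same keys and the normalized scores.
    """
    # Pool every value from every contig before estimating the moments.
    pooled = [value for values in per_contig.values() for value in values]
    mean = fmean(pooled)
    # Population standard deviation, so the pooled set has exactly unit variance.
    sd = pstdev(pooled, mu=mean)
    if sd == 0:
        raise ValueError("all log-ratios are identical; cannot normalize")
    return {
        contig: [(value - mean) / sd for value in values]
        for contig, values in per_contig.items()
    }


raw = {"chr1": [0.0, 2.0], "chr2": [4.0, 6.0]}
scores = normalize_across_contigs(raw)
```

Normalizing each contig separately would force every contig to have mean zero, which hides contigs whose log-ratios are shifted as a whole. Pooling first avoids that.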

The new iHS format #25 makes this easy for iHS, so thanks.

camaclean commented 8 years ago

Right now, they're actually un-normalized. I should also output the normalized scores, but it looks like I forgot to get around to it.

hardingnj commented 8 years ago

Thanks. Maybe what precisely is being calculated could be made slightly clearer in the docs? Thanks for the tool + maintenance btw, great work.

camaclean commented 8 years ago

Yeah, I'll get that updated. Thanks.

hardingnj commented 8 years ago

Thinking about it, re: iHS:

For my use case it would be handy to add an allele frequency column to the output, so users could bin and calculate standardized iHS themselves after combining several tables. Otherwise I have to go back and look up the AF of the SNPs I have iHS values for. Just a suggestion though!
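A rough sketch of what that user-side step could look like once an allele frequency column is available. The row layout `(snp_id, allele_freq, unstandardized_ihs)` and the default of 20 equal-width frequency bins are assumptions for illustration, not hapbin's actual output format or defaults.

```python
from collections import defaultdict
from statistics import fmean, pstdev


def frequency_bin(freq, n_bins):
    """Index of the equal-width frequency bin; a frequency of 1.0 goes in the last bin."""
    return min(int(freq * n_bins), n_bins - 1)


def standardize_ihs_by_frequency(rows, n_bins=20):
    """Standardize iHS within allele-frequency bins.

    rows: list of (snp_id, allele_freq, unstandardized_ihs) tuples, possibly
          combined from several per-contig tables (hypothetical layout).
    Returns a list of (snp_id, allele_freq, standardized_ihs) in the same order.
    """
    # Group the unstandardized values by frequency bin.
    binned = defaultdict(list)
    for _, freq, ihs in rows:
        binned[frequency_bin(freq, n_bins)].append(ihs)

    # Mean and population standard deviation of each bin.
    stats = {idx: (fmean(vals), pstdev(vals)) for idx, vals in binned.items()}

    result = []
    for snp_id, freq, ihs in rows:
        mean, sd = stats[frequency_bin(freq, n_bins)]
        # A bin with a single SNP or identical values has no usable spread.
        score = (ihs - mean) / sd if sd > 0 else float("nan")
        result.append((snp_id, freq, score))
    return result


combined = [
    ("snp1", 0.12, 1.0),
    ("snp2", 0.13, 3.0),
    ("snp3", 0.52, -2.0),
    ("snp4", 0.53, 0.0),
]
standardized = standardize_ihs_by_frequency(combined)
```

Because the bin statistics are computed after the tables are combined, SNPs from different contigs with similar frequencies share one mean and standard deviation.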



camaclean commented 5 years ago

Sorry! I had done this but for some reason I lost track of things and forgot to push the XPEHH standardization. I'll add in frequency columns, too.