evotools / hapbin

Efficient program for calculating Extended Haplotype Homozygosity (EHH) and Integrated Haplotype Score (iHS)
GNU General Public License v3.0

XPEHH output #46

Closed paolo002 closed 6 years ago

paolo002 commented 6 years ago

Hi, I have obtained XPEHH output, but it has only two columns: one for the SNP IDs and one for the XPEHH values. First of all, is this normal? The manual says the output should have 5 columns. If it is normal, should the values be read as hap popA compared to hap popB, or vice versa? Thanks

camaclean commented 6 years ago

I double checked the source files. It should be:

Location    iHH_A1    iHH_B1    iHH_P1    XPEHH

XPEHH = log(iHH_A1/iHH_B1)

This reminds me that I need to finish implementing standardizing the xpehh values. Right now, you have to do that manually.
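The formula above can be sketched in a few lines of Python. This is only an illustration, not hapbin's code: the function name, the locus names, and the iHH values are hypothetical, and the natural logarithm is an assumption, since the thread only says "log".

```python
import math


def xpehh_unstandardised(ihh_a: float, ihh_b: float) -> float:
    """Return log(iHH_A / iHH_B) for one locus (natural log assumed)."""
    if ihh_a <= 0 or ihh_b <= 0:
        raise ValueError("iHH values must be positive to take the log ratio")
    return math.log(ihh_a / ihh_b)


# Hypothetical (locus, iHH_A1, iHH_B1) rows, made up for illustration.
rows = [
    ("snp1", 0.12, 0.06),  # iHH larger in population A -> positive score
    ("snp2", 0.05, 0.05),  # equal iHH -> score of zero
    ("snp3", 0.02, 0.08),  # iHH larger in population B -> negative score
]

scores = {locus: xpehh_unstandardised(a, b) for locus, a, b in rows}

for locus, score in scores.items():
    print(f"{locus}\t{score:.4f}")
```

These are unstandardised scores; they still need to be standardised before being compared across loci.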

paolo002 commented 6 years ago

Thank you very much, I understand. I think the iHS run gives me two files, one with values before normalisation and one with values after it (*.std), while the XPEHH run gives only one file. In both cases the files have only two columns. So how do I normalise the XPEHH values? And if XPEHH = log(A1/B1), does a positive value indicate selection in population A and a negative value selection in population B? Am I correct? By the way, congratulations on developing hapbin. I compared it with selscan on the same input dataset: the iHS results are as good as selscan's, but hapbin saved me a lot of time, because selscan is extremely slow while hapbin gives results thousands of times faster for both iHS and XPEHH.

paolo002 commented 6 years ago

Hi, please can you let me know how to normalise the XPEHH scores? Does it need iHH_A1 and iHH_B1? They are not in the output.

prenderj commented 6 years ago

Hi

Yes, a high (standardised) XP-EHH value suggests that EHH decays more slowly in population A, which in turn suggests its haplotypes have potentially been under stronger positive selection.

The normalisation procedure is the same as for iHS, i.e. for each XP-EHH value you determine how many standard deviations it lies from the mean of the values (https://en.wikipedia.org/wiki/Standard_score). Hopefully Colin can implement this.
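For anyone doing this by hand in the meantime, the standard-score step can be sketched as below. This is an illustration under assumptions, not hapbin's implementation: the function name and the raw scores are hypothetical, all loci are standardised together with no frequency binning, and the population standard deviation is used (the sample standard deviation would be an equally plausible choice).

```python
import statistics


def standardise(scores: dict[str, float]) -> dict[str, float]:
    """Map each locus to (score - mean) / sd, computed over all loci."""
    values = list(scores.values())
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)  # assumption: population SD
    if sd == 0:
        raise ValueError("all scores are identical; cannot standardise")
    return {locus: (x - mean) / sd for locus, x in scores.items()}


# Hypothetical unstandardised XP-EHH scores keyed by SNP ID.
raw = {"snp1": 0.9, "snp2": 0.1, "snp3": -0.4, "snp4": -0.6}

z = standardise(raw)

for locus, value in z.items():
    # Positive: EHH decays more slowly in population A; negative: in B.
    slower_decay = "A" if value > 0 else "B"
    print(f"{locus}\t{value:+.3f}\tslower EHH decay in population {slower_decay}")
```

After this step the scores have mean 0 and standard deviation 1, so a value can be read directly as a number of standard deviations from the mean.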

Best Wishes James

On Mon, 17 Sep 2018 at 09:41, paolo002 notifications@github.com wrote:

hi, please, can you let me know how to normalise the XPEHH scores?


paolo002 commented 6 years ago

Thanks a lot for the explanation.

camaclean commented 6 years ago

I just realized I'd already implemented it, but forgot to push the addition. Pull #47 does it.

paolo002 commented 6 years ago

Thank you very much! Really appreciated.

camaclean commented 6 years ago

Wait for pull #48 before using the non-MPI version. I have a bug in #47 when calculating frequency.

paolo002 commented 6 years ago

Sure, thanks a lot Colin

paolo002 commented 6 years ago

Hi, sorry, I also forgot to ask about iHS. In the hapbin iHS output, does a positive standardised iHS value indicate selection on the reference allele and a negative value selection on the alternate allele? Is that correct?