cggh / scikit-allel

A Python package for exploring and analysing genetic variation data
MIT License

p-value calculation feature request #318

Open raqueldias opened 4 years ago

raqueldias commented 4 years ago

I was wondering whether it would be possible to add p-values for LD r to the output of functions like allel.rogers_huff_r and allel.rogers_huff_r_between. Instead of returning a single array of r values, they could return two arrays: one with the r values and one with the corresponding p-values. This would help assess how significant an observed r value is relative to its distribution under the null hypothesis, given the size of the input dataset.

Thanks!

alimanfoo commented 4 years ago

Hi @raqueldias, thanks for raising this, and apologies for the slow reply. Do you have a sense of how such a p-value should be calculated? I'm not a very good statistician, I'm afraid, and I'm only vaguely aware that there are several different approaches to testing the significance of a correlation.
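
For illustration only, here is a minimal sketch of one of those approaches. It is not part of scikit-allel and nothing in this thread settles on it. It assumes that, under the null hypothesis of no LD, n * r^2 approximately follows a chi-square distribution with 1 degree of freedom, where n is the number of samples. The function name `ld_r_pvalues` and the parameter `n_samples` are hypothetical, and the r values are made up for the example. In practice they might come from allel.rogers_huff_r.

```python
import math


def ld_r_pvalues(r_values, n_samples):
    """Approximate p-values for LD r under the null hypothesis of no LD.

    Assumption (not taken from scikit-allel): n_samples * r**2 is treated
    as chi-square distributed with 1 degree of freedom, which is a
    large-sample approximation.
    """
    pvalues = []
    for r in r_values:
        if math.isnan(r):
            # Propagate missing r values (e.g. pairs with no usable calls).
            pvalues.append(math.nan)
            continue
        chi2 = n_samples * r * r
        # The survival function of chi-square with 1 df is erfc(sqrt(x / 2)).
        pvalues.append(math.erfc(math.sqrt(chi2 / 2.0)))
    return pvalues


# Made-up r values for four variant pairs, with 100 samples assumed.
r = [0.0, 0.1, 0.5, -0.5]
p = ld_r_pvalues(r, n_samples=100)
for r_i, p_i in zip(r, p):
    print(f"r={r_i:+.2f}  p={p_i:.3g}")
```

Because the test statistic depends on r squared, the sign of r does not affect the p-value. Other tests (for example a t-test on r, or a permutation test) would give somewhat different values, especially for small sample sizes or when genotype calls are missing.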