This PR greatly extends tailed_parameters.py to include the one-sample Z-test of proportions to identify parameters which may be overrepresented in high RMSD/TFD regions. Error bars are computed as 95% confidence intervals from the Z-tests. The null hypothesis is that the high_tfd representation ratio is the same as the whole_set representation ratio, and a two-tailed test is applied to determine whether to reject the null hypothesis.
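For reviewers who want to sanity-check the statistics, here is a minimal, standard-library sketch of a one-sample, two-tailed Z-test of proportions. It is not the code in tailed_parameters.py: the function name, argument names, and result fields are hypothetical. It also assumes that the whole_set ratio is treated as a fixed null proportion, that the standard error in the test statistic is computed under the null, and that the 95% error bars are a Wald interval around the observed high-TFD ratio. The actual implementation may make different choices.

```python
import math
from statistics import NormalDist
from typing import NamedTuple


class ZTestResult(NamedTuple):
    p_hat: float      # observed proportion in the high-TFD subset
    z: float          # test statistic
    p_value: float    # two-tailed p-value
    ci_low: float     # lower bound of the confidence interval
    ci_high: float    # upper bound of the confidence interval
    reject: bool      # True if the null hypothesis is rejected at level alpha


def proportion_ztest(count: int, nobs: int, p0: float, alpha: float = 0.05) -> ZTestResult:
    """One-sample, two-tailed Z-test of a proportion (illustrative sketch).

    count: records in the high-TFD subset that use the parameter
    nobs:  total records in the high-TFD subset
    p0:    representation ratio of the parameter in the whole set (null value)
    """
    if nobs <= 0:
        raise ValueError("nobs must be positive")
    if not 0.0 < p0 < 1.0:
        raise ValueError("p0 must lie strictly between 0 and 1")

    normal = NormalDist()  # standard normal distribution
    p_hat = count / nobs

    # Assumption: standard error under the null hypothesis for the test statistic.
    se_null = math.sqrt(p0 * (1.0 - p0) / nobs)
    z = (p_hat - p0) / se_null
    p_value = 2.0 * (1.0 - normal.cdf(abs(z)))

    # Assumption: Wald interval around the observed proportion for the error bars.
    z_crit = normal.inv_cdf(1.0 - alpha / 2.0)
    se_obs = math.sqrt(p_hat * (1.0 - p_hat) / nobs)
    margin = z_crit * se_obs

    return ZTestResult(p_hat, z, p_value, p_hat - margin, p_hat + margin, p_value < alpha)


# Made-up numbers: a parameter appears in 30 of 100 high-TFD records
# and in 20% of the whole set.
result = proportion_ztest(count=30, nobs=100, p0=0.2)
print(result)
```

With these made-up numbers the observed ratio is 0.3 and the test statistic is z = 2.5, so the null hypothesis is rejected at the 5% level and the parameter would be flagged as potentially overrepresented. Because the test is two-tailed, a significantly underrepresented parameter would also lead to rejection; the sign of z tells the two cases apart.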