ggloor / ALDEx2_dev

ALDEx tool to examine compositional high-throughput sequence data with Welch's t-test
GNU Affero General Public License v3.0

Abnormal raw p-value distribution from aldex.ttest #40

Closed by ijhoskins 1 year ago

ijhoskins commented 1 year ago

Hello,

I have been looking at some diagnostic plots of test results and noticed a bimodal raw p-value histogram (attachment: aldex_example_pval_distr_gg.pdf).

For comparison, here is a p-value histogram from a limma analysis, which produces an anti-conservative histogram (attachment: limma_example_pval_distr_gg.pdf).

http://varianceexplained.org/statistics/interpreting-pvalue-histogram/

I am wondering whether such a distribution is expected given the testing strategy. I also tried an IQLR denominator and iterate=TRUE, and I see similar results across parameter settings.

Thanks for any help you might be able to provide!

ggloor commented 1 year ago

Hi Ian, the p-value distribution you see for ALDEx2 arises because the reported value is not a conventional p-value but a posterior p-value (http://www.stat.columbia.edu/~gelman/research/unpublished/ppc_understand2.pdf), which tends towards 0.5 as the number of Dirichlet Monte Carlo (MC) instances goes to infinity. I am updating the documentation to make this clear, as it is a common misconception.
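As a rough illustration of why averaging over instances pulls values towards 0.5, here is a toy Python sketch. It is not ALDEx2 code. It assumes, purely for illustration, that a feature with no real difference gets an independent Uniform(0, 1) p-value from each instance (real instances share the same underlying counts, so they are correlated), and the function name `mean_null_pvalues` and the choice of 128 instances are hypothetical.

```python
import random
import statistics


def mean_null_pvalues(n_features, n_instances, seed=0):
    """Toy model (assumption): every instance gives an independent
    Uniform(0, 1) p-value for a null feature; return the per-feature mean."""
    rng = random.Random(seed)
    return [
        sum(rng.random() for _ in range(n_instances)) / n_instances
        for _ in range(n_features)
    ]


# One instance per feature: the null p-values stay roughly uniform.
single = mean_null_pvalues(2000, 1)

# Many instances per feature: the averaged values concentrate around 0.5.
averaged = mean_null_pvalues(2000, 128)

print("spread with 1 instance:   ", statistics.pstdev(single))
print("spread with 128 instances:", statistics.pstdev(averaged))
print("mean with 128 instances:  ", statistics.fmean(averaged))
```

Under these assumptions the null features pile up near 0.5 instead of spreading evenly over [0, 1], so a histogram of the averaged values will not look like the flat null distribution described in the varianceexplained.org post.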

ijhoskins commented 1 year ago

Hi @ggloor, thank you for your input. That makes sense!