tallulandrews / M3Drop


Concerning p-value histogram returned by NBumiFeatureSelectionCombinedDrop #13

Open sinanassiri opened 3 years ago

sinanassiri commented 3 years ago

Dear M3Drop team,

Thanks for this great package!

I have noticed weird-looking histograms for the unadjusted p-values returned by NBumiFeatureSelectionCombinedDrop. Shouldn't these p-values be approximately uniformly distributed under the null? Please note that I set ntop to retain all genes tested (i.e. all genes that passed NBumiConvertData's filtering), and the QC plots returned by NBumiCheckFit looked fine to me. I'm attaching example plots for one sample, but I could reproduce this in multiple samples and datasets.

Looking forward to hearing your thoughts on this.

Best, Sina

[Attached screenshots: Screenshot 2021-04-14 at 23 33 04, Screenshot 2021-04-15 at 00 28 42, Screenshot 2021-04-15 at 00 28 50]

tallulandrews commented 1 year ago

The package uses a one-sided test for high-dropout genes. This means that p-values close to 1 correspond to extreme low-dropout outliers, and p-values close to 0.5 correspond to "normal"-looking genes. There is a spike around 0.5 because power differs between genes: lowly expressed genes generally have less power to detect a deviation from the expected dropout rate, so their p-values gather around p = 0.5.
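
To make the one-sided behaviour concrete, here is a minimal sketch in Python. It is not M3Drop's implementation: it assumes a plain upper-tail z-test on a gene's dropout rate, and the function name `one_sided_p` and all of the numbers are hypothetical, chosen only for illustration.

```python
import math


def one_sided_p(observed, expected, se):
    """Upper-tail p-value of a z-test (an assumed stand-in, not M3Drop's code).

    Small when the observed dropout rate is well above the expected one.
    """
    z = (observed - expected) / se
    # Survival function of the standard normal, written with erfc.
    return 0.5 * math.erfc(z / math.sqrt(2))


# Hypothetical genes that share an expected dropout rate of 0.50.
p_high = one_sided_p(0.80, 0.50, 0.05)     # far more dropouts than expected
p_typical = one_sided_p(0.50, 0.50, 0.05)  # exactly as expected
p_low = one_sided_p(0.20, 0.50, 0.05)      # far fewer dropouts than expected

# Hypothetical low-power gene: a modest excess, but a large standard error.
p_noisy = one_sided_p(0.55, 0.50, 0.25)

for name, p in [("high dropout", p_high), ("typical", p_typical),
                ("low dropout", p_low), ("noisy / low power", p_noisy)]:
    print(f"{name:>18}: p = {p:.4g}")
```

Under these assumptions, the high-dropout gene lands near 0, the low-dropout gene lands near 1, the typical gene sits at exactly 0.5, and the gene with the large standard error stays close to 0.5 even though its dropout rate is above expectation. That is the pattern described above: genes with little power pile up around p = 0.5 instead of spreading out.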