SchulzLab / STITCHIT

Learning gene-specific regulatory elements from epigenetic data

P-value computation #1

Open quirinmanz opened 1 year ago

quirinmanz commented 1 year ago

Dear STITCHIT-Team,

In the p-value test that uses the vectors https://github.com/SchulzLab/STITCHIT/blob/641dc07c92f58d80ecc7c8f8b481dbc1b510cc58/test/corCompTest.cpp#L15 and https://github.com/SchulzLab/STITCHIT/blob/641dc07c92f58d80ecc7c8f8b481dbc1b510cc58/test/corCompTest.cpp#L20, the expected p-value is 0.01866973: https://github.com/SchulzLab/STITCHIT/blob/641dc07c92f58d80ecc7c8f8b481dbc1b510cc58/test/corCompTest.cpp#L310-L314 . The corresponding R code gives:

> cor.test(c(1,2,3,4,5), c(1,2,4,3,5), alternative = "greater", method = "pearson")

    Pearson's product-moment correlation

data:  c(1, 2, 3, 4, 5) and c(1, 2, 4, 3, 5)
t = 3.5762, df = 3, p-value = 0.01869
alternative hypothesis: true correlation is greater than 0
95 percent confidence interval:
 0.2996475 1.0000000
sample estimates:
cor 
0.9 

> cor.test(c(1,2,3,4,5), c(1,2,4,3,5), alternative = "two.sided", method = "pearson")

    Pearson's product-moment correlation

data:  c(1, 2, 3, 4, 5) and c(1, 2, 4, 3, 5)
t = 3.5762, df = 3, p-value = 0.03739
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 0.08610194 0.99343752
sample estimates:
cor 
0.9

As one can see, the expected value matches the one-sided ("greater") result, so the alternative hypothesis used in STITCHIT's computation is one-sided: https://github.com/SchulzLab/STITCHIT/blob/641dc07c92f58d80ecc7c8f8b481dbc1b510cc58/core/CorComp.cpp#L170 To also account for significant negative correlations, STITCHIT applies the Fisher transformation to the absolute correlation: https://github.com/SchulzLab/STITCHIT/blob/641dc07c92f58d80ecc7c8f8b481dbc1b510cc58/core/CorComp.cpp#L141-L149

Doesn't this conflict with the intended alternative hypothesis, which should be two-sided (corr != 0)?
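To make the concern concrete, here is a minimal standalone sketch in Python (standard library only) that reproduces the two R results above. It is not STITCHIT's code: the helper names (`pearson_r`, `t_upper_tail`, `cor_p_values`) are made up for illustration, and it uses the exact t-test as R's `cor.test` does rather than the Fisher transformation, so it is not meant to reproduce the test value 0.01866973. It shows that a one-sided test applied to the absolute correlation returns exactly half of the two-sided p-value, for positive and negative correlations alike.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + t * t / df) ** (-(df + 1) / 2)


def t_upper_tail(t, df, steps=20000):
    """P(T >= t) for t >= 0, via Simpson's rule on [0, t] (steps must be even)."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


def cor_p_values(x, y):
    """Return (r, one-sided p on |r|, two-sided p). Assumes |r| < 1."""
    r = pearson_r(x, y)
    df = len(x) - 2
    t = abs(r) * math.sqrt(df / (1 - r * r))
    one_sided = t_upper_tail(t, df)
    return r, one_sided, 2 * one_sided


x = [1, 2, 3, 4, 5]
r, p_one, p_two = cor_p_values(x, [1, 2, 4, 3, 5])
print(round(r, 6), round(p_one, 5), round(p_two, 5))  # 0.9 0.01869 0.03739

# A negative correlation of the same magnitude gets the same one-sided
# p-value once the absolute value is taken.
r_neg, p_one_neg, p_two_neg = cor_p_values(x, [5, 3, 4, 2, 1])
print(round(r_neg, 6), round(p_one_neg, 5), round(p_two_neg, 5))  # -0.9 0.01869 0.03739
```

So if the one-sided value is compared against the usual threshold, correlations of either sign are accepted at twice the nominal two-sided error rate.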

Best, Quirin

quirinmanz commented 1 year ago

Dividing the p-value threshold by 2 in core/main.cpp could solve this issue, as far as I can judge.
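
The reasoning behind this suggestion, sketched under the assumption that the null distribution of the test statistic is symmetric around zero (this holds for the t statistic used by R's `cor.test` and for a normal approximation of the Fisher-transformed correlation). Here $T$ is the test statistic under the null, $t$ its observed value, and $\alpha$ the p-value threshold; these symbols are my own notation, not taken from the code:

$$
p_{\text{one}}(|r|) = P(T \ge |t|) = \tfrac{1}{2}\, P(|T| \ge |t|) = \tfrac{1}{2}\, p_{\text{two}}(r)
\quad\Longrightarrow\quad
p_{\text{one}}(|r|) < \tfrac{\alpha}{2} \iff p_{\text{two}}(r) < \alpha
$$

With the R output above and $\alpha = 0.05$ as an example: $0.01869 < 0.025$ and $0.03739 < 0.05$ lead to the same decision.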