shenlab-sinai / GeneOverlap

R package for testing and visualizing gene list overlaps
https://bioconductor.org/packages/release/bioc/html/GeneOverlap.html

The contingency table you are creating is wrong! #2

Closed: ShaowenJ closed this issue 4 years ago

llrs commented 4 years ago

@Lesdormis Could you show how the contingency table is wrong?

ShaowenJ commented 4 years ago

For example, the table this package creates is:

Contingency Table:

     notA inA
notB  240 215
inB     3  42

But the correct table to pass to fisher.test in R is:

hitInSample            sampleSize-hitInSample
hitInPop-hitInSample   failInPop-sampleSize+hitInSample

http://mengnote.blogspot.com/2012/12/calculate-correct-hypergeometric-p.html

Please correct me if I am wrong
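
For reference, the layout from the linked post can be written out in R next to the table printed above. This is only a sketch under an assumed mapping: list B is treated as the "sample" and list A as the "hits in the population". That mapping and the four values below are read off the printed table, not taken from the package.

```r
# Assumed mapping (not from the package): B is the "sample", A the "hits".
hitInSample <- 42    # genes in both A and B
sampleSize  <- 45    # genes in B (3 + 42)
hitInPop    <- 257   # genes in A (215 + 42)
failInPop   <- 243   # genes not in A (240 + 3)

# Layout described in the linked post, filled row by row.
blog <- matrix(c(hitInSample,            sampleSize - hitInSample,
                 hitInPop - hitInSample, failInPop - sampleSize + hitInSample),
               nrow = 2, byrow = TRUE)

# Table as printed above, also filled row by row.
pkg <- matrix(c(240, 215, 3, 42), nrow = 2, byrow = TRUE,
              dimnames = list(c("notB", "inB"), c("notA", "inA")))

blog
pkg

# TRUE if blog is pkg with its rows and columns in reverse order.
all(blog == pkg[2:1, 2:1])
```

Under this assumed mapping both matrices hold the same four counts, only in a different order.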

llrs commented 4 years ago

I don't know where these numbers come from, but let's check. A contains 215 + 42 = 257 elements in total, B contains 3 + 42 = 45, and the whole table sums to 240 + 215 + 3 + 42 = 500 elements.
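
These totals can be checked in R. A minimal sketch, assuming the table is exactly the one printed in the issue (the object name `tab` is just for illustration):

```r
# Rebuild the printed table; matrix() fills column by column by default.
tab <- matrix(c(240, 3, 215, 42), nrow = 2,
              dimnames = list(c("notB", "inB"), c("notA", "inA")))

# Show the table together with its row and column totals.
addmargins(tab)

size_A <- sum(tab[, "inA"])   # 215 + 42
size_B <- sum(tab["inB", ])   # 3 + 42
total  <- sum(tab)            # all four cells
c(A = size_A, B = size_B, total = total)
```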

Could you please post your input data and check whether this is correct? Compared with the link provided, this matrix has the row names and the column names swapped. Maybe this is what you noticed as "wrong", or do you get different values?

Lastly, if you use fisher.test, note that the arrangement of the values in the table does not alter the result here; transposing the matrix gives the same p-value:

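# Same four counts, filled column by column (the default) ...
m <- matrix(c(240, 215, 3, 42), nrow = 2, ncol = 2)
# ... and row by row, so m2 is the transpose of m
m2 <- matrix(c(240, 215, 3, 42), nrow = 2, ncol = 2, byrow = TRUE)
m
m2
# Both calls report the same p-value
fisher.test(m)
fisher.test(m2)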
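
The comparison can also be made explicit by extracting the p-values. The sketch below adds one caveat that is not part of the comment above: for a one-sided test, swapping the two rows reverses the direction of the association, so `alternative = "greater"` on the original table corresponds to `alternative = "less"` on the row-swapped one.

```r
m  <- matrix(c(240, 215, 3, 42), nrow = 2)
m2 <- t(m)  # same as filling with byrow = TRUE

# Two-sided test: transposing leaves the p-value unchanged.
p_m  <- fisher.test(m)$p.value
p_m2 <- fisher.test(m2)$p.value
isTRUE(all.equal(p_m, p_m2))

# One-sided test: swapping the rows inverts the odds ratio,
# so "greater" on m matches "less" on the row-swapped table.
m_swapped <- m[2:1, ]
p_greater <- fisher.test(m, alternative = "greater")$p.value
p_less    <- fisher.test(m_swapped, alternative = "less")$p.value
isTRUE(all.equal(p_greater, p_less))
```
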
ShaowenJ commented 4 years ago

Hi @llrs, thanks very much for helping me here. I just did a more careful check, and you are right: their calculation is correct. I used the hypergeometric test as a comparison.

Sorry for jumping to this conclusion so carelessly. I will close this issue.
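
A comparison like the one mentioned above can be sketched as follows. It assumes the counts from the table in this thread and a one-sided test for over-representation; whether the package uses exactly this alternative is not stated in the thread, so treat that as an assumption.

```r
# Counts read off the table in this thread.
overlap <- 42    # genes in both A and B
size_A  <- 257   # genes in A
size_B  <- 45    # genes in B
total   <- 500   # all genes considered

# Hypergeometric upper tail: P(overlap >= 42) when drawing size_B genes
# from a population containing size_A "hits".
p_hyper <- phyper(overlap - 1, size_A, total - size_A, size_B,
                  lower.tail = FALSE)

# One-sided Fisher's exact test on the table as printed (rows notB/inB).
tab <- matrix(c(240, 215, 3, 42), nrow = 2, byrow = TRUE)
p_fisher <- fisher.test(tab, alternative = "greater")$p.value

# The two p-values should agree up to numerical tolerance.
isTRUE(all.equal(p_hyper, p_fisher))
```

The `overlap - 1` is needed because `phyper(q, ..., lower.tail = FALSE)` returns P(X > q), while the test asks for P(X >= overlap).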