dozmorovlab / HiCcompare

Joint normalization of two Hi-C matrices, visualization and detection of differential chromatin interactions. See multiHiCcompare for the analysis of multiple Hi-C matrices
https://dozmorovlab.github.io/HiCcompare/

Different results depending on matrix input order #15

Closed StephenRicher closed 4 years ago

StephenRicher commented 4 years ago

Hi,

It looks as though the order in which the two Hi-C matrices are supplied influences the results.

For example, with the example data, the number of significant (p < 0.05) interactions differs between HMEC.chr22 vs. NHEK.chr22 and NHEK.chr22 vs. HMEC.chr22.

I would have expected the results to be symmetrical, so I was wondering whether this is expected and, if so, why it occurs?

Thanks, Stephen


mdozmorov commented 4 years ago

Thanks, @StephenRicher, for noticing. If you change the order, the loess fit will be slightly different, so some differential points may fluctuate around the significance threshold. The code below looks into it further. You will see from the last output that the points missing from the second comparison are only marginally significant.

library(HiCcompare)
data('HMEC.chr22')
data('NHEK.chr22')
hic.table <- create.hic.table(HMEC.chr22, NHEK.chr22, chr = 'chr22')
# Plug hic.table into hic_loess()
result <- hic_loess(hic.table, Plot = TRUE)
# perform difference detection
diff.result <- hic_compare(result, Plot = TRUE)
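# number of significant interactions, HMEC vs. NHEK
sum(diff.result$p.adj < 0.05)
res1 <- diff.result
# build one ID per interaction from the first six columns
rownames(res1) <- apply( res1[, 1:6] , 1 , paste , collapse = "-" )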

# repeat the analysis with the input order swapped
hic.table <- create.hic.table(NHEK.chr22, HMEC.chr22, chr = 'chr22')
# Plug hic.table into hic_loess()
result <- hic_loess(hic.table, Plot = TRUE)
# perform difference detection
diff.result <- hic_compare(result, Plot = TRUE)
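# number of significant interactions, NHEK vs. HMEC
sum(diff.result$p.adj < 0.05)
res2 <- diff.result
# same ID scheme as for res1, so the two results can be matched
rownames(res2) <- apply( res2[, 1:6] , 1 , paste , collapse = "-" )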

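# interactions significant in the first comparison but not in the second
diffIDs <- setdiff(rownames(res1)[res1$p.adj < 0.05], rownames(res2)[res2$p.adj < 0.05])

# show those interactions in both results to compare their adjusted p-values
res1[rownames(res1) %in% diffIDs, ]
res2[rownames(res2) %in% diffIDs, ]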
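
The same check can be made in both directions at once by putting the two sets of adjusted p-values side by side. The sketch below is only an illustration: `compare_calls` is a hypothetical helper, not part of HiCcompare, and the toy data frames are made-up values that stand in for `res1` and `res2`. It assumes only that each result has a `p.adj` column plus a set of columns that identify the interaction.

```r
# Hypothetical helper (not part of HiCcompare): cross-tabulate the significance
# calls of two runs and list the interactions whose call differs.
compare_calls <- function(res1, res2, key_cols, alpha = 0.05) {
  # one ID per interaction, built column-wise from the key columns
  make_id <- function(x) {
    do.call(paste, c(as.list(as.data.frame(x)[key_cols]), sep = "-"))
  }
  a <- data.frame(id = make_id(res1), p.adj.1 = res1$p.adj)
  b <- data.frame(id = make_id(res2), p.adj.2 = res2$p.adj)
  both <- merge(a, b, by = "id")           # keep interactions present in both runs
  both$sig.1 <- both$p.adj.1 < alpha
  both$sig.2 <- both$p.adj.2 < alpha
  list(
    table   = table(run1 = both$sig.1, run2 = both$sig.2),
    flipped = both[both$sig.1 != both$sig.2, ]
  )
}

# Made-up toy values, for illustration only
toy1 <- data.frame(chr1 = "chr22", start1 = c(1, 2, 3), end1 = c(2, 3, 4),
                   p.adj = c(0.010, 0.049, 0.30))
toy2 <- data.frame(chr1 = "chr22", start1 = c(1, 2, 3), end1 = c(2, 3, 4),
                   p.adj = c(0.012, 0.051, 0.28))

out <- compare_calls(toy1, toy2, key_cols = c("chr1", "start1", "end1"))
out$table    # agreement between the two runs
out$flipped  # interactions that cross the threshold between runs
```

With the objects from the code above, the call would presumably be `compare_calls(res1, res2, key_cols = 1:6)`, using the same six columns that the thread's code pastes together as an ID. If the explanation above holds, the rows in `flipped` should have adjusted p-values close to 0.05 in both runs.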