aertslab / GENIE3

GENIE3 (GEne Network Inference with Ensemble of trees) R-package

Whether to normalize the input matrix to GENIE3? #7

Closed mt1022 closed 5 years ago

mt1022 commented 5 years ago

Dear GENIE3 developers,

I am confused about whether the input matrix to GENIE3 needs to be normalized. The vignette says:

"Note that the expression data do not need to be normalised in any way."

However, I got quite different results with normalized and unnormalized data. Here is an example:

library(GENIE3)
set.seed(123)
# suppose the values below are normalized
exprMatr <- matrix(sample(1:10, 100, replace=TRUE), nrow=20)
rownames(exprMatr) <- paste("Gene", 1:20, sep="")
colnames(exprMatr) <- paste("Sample", 1:5, sep="")
head(exprMatr)

set.seed(123)
weightMat <- GENIE3(exprMatr, nCores=1, verbose=TRUE, nTrees = 1000)
weightMat[1:3, 1:3]
#             Gene1     Gene10     Gene11
# Gene1  0.00000000 0.06764472 0.14411471
# Gene10 0.02555183 0.00000000 0.02976389
# Gene11 0.02578523 0.03702720 0.00000000

set.seed(123)
# simulate an unnormalized matrix
exprMatr[, 3] <- exprMatr[, 3] * 2
exprMatr[, 5] <- exprMatr[, 5] * 3
exprMatr[, 1] <- exprMatr[, 1] * 4
weightMat <- GENIE3(exprMatr, nCores=1, verbose=TRUE, nTrees = 1000)
weightMat[1:3, 1:3]

#             Gene1     Gene10     Gene11
# Gene1  0.00000000 0.15503584 0.05509333
# Gene10 0.02418188 0.00000000 0.04430111
# Gene11 0.07509068 0.02974492 0.00000000

Does this mean that normalization of the input expression matrix does affect the final results?

s-aibar commented 5 years ago

Hello, indeed there is no specific recommendation regarding the normalization of the expression matrix, but whether the matrix is normalized, log-transformed, etc. will affect the final results. I have clarified this in the vignette. Thank you!
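
One possible intuition for the difference, offered as a sketch and not as a statement from the GENIE3 authors: tree-based regression picks split thresholds from how samples are ordered on each input gene, and multiplying individual samples by different factors can reorder them. The base R snippet below does not call GENIE3. It only checks whether the per-sample factors from the question (4, 1, 2, 1, 3) change the within-gene ranking of samples. `toyGene` is a hypothetical expression profile made up for illustration.

```r
# Per-sample factors from the question: Sample1 x4, Sample3 x2, Sample5 x3,
# Sample2 and Sample4 left unchanged.
scaleFactors <- c(4, 1, 2, 1, 3)

# Hypothetical expression of one gene across the five samples (illustration only).
toyGene <- c(5, 6, 3, 7, 2)
toyScaled <- toyGene * scaleFactors  # 20 6 6 7 6

rank(toyGene)    # 3 4 2 5 1
rank(toyScaled)  # 5 2 2 4 2  (sample ordering has changed, and ties appear)

# By contrast, one common factor for all samples keeps the ordering intact.
uniformKeepsRanks <- identical(rank(toyGene), rank(toyGene * 4))

# Same check on the matrix from the question: count the genes (rows)
# whose sample ranking differs after per-sample scaling.
set.seed(123)
exprMatr <- matrix(sample(1:10, 100, replace = TRUE), nrow = 20)
scaled <- sweep(exprMatr, 2, scaleFactors, `*`)
changed <- sapply(seq_len(nrow(exprMatr)), function(i)
  !identical(rank(exprMatr[i, ]), rank(scaled[i, ])))
sum(changed)  # number of genes whose sample ordering was altered
```

If `sum(changed)` is greater than zero, the two GENIE3 runs above were effectively given different sample orderings for those genes, which would be consistent with the differing weights.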