aertslab / GENIE3

GENIE3 (GEne Network Inference with Ensemble of trees) R-package

Estimated runtime? #1

Closed davetang closed 7 years ago

davetang commented 7 years ago

I'm currently running the R version of GENIE3 and was wondering how long it takes to complete for a dataset with around 8,000 single cells and 12,000 genes. It's been running for around 8 hours on 32 cores.

s-aibar commented 7 years ago

Hello,

I cannot give you a specific running time, as it depends on the computer and setup (e.g. CPU speed and memory also matter). As a reference, when we have run it on datasets of 1-3k cells, it typically takes a few hours. For bigger datasets (e.g. >5k cells), the running time often increases to a few days (2-5 days; with 24 cores, most of the analyses finished within a week).

In case it helps: on really big datasets, I normally split the gene list into several subsets (this helps estimate how long it will take to finish, and also saves intermediate results in case something goes wrong and it crashes). Example:

# Run on subsets of genes
# (dividing the original gene list into 10 pieces)
library(GENIE3)
genesSplit <- split(sort(rownames(exprMatrix_filtered)), 1:10)
lengths(genesSplit)  # number of target genes in each subset

for(i in seq_along(genesSplit))
{
  print(i)
  set.seed(93827)
  # Only the genes in subset i are used as targets in this run
  weightMatrix <- GENIE3(exprMatrix_filtered, regulators=inputTFs, nCores=24, targets=genesSplit[[i]])
  # Save each subset as soon as it finishes, so a crash does not lose it
  save(weightMatrix, file=paste0("GENIE3_weightMatrix_",i,".RData"))
}
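As a rough sketch of how the split can be used to estimate the remaining time and to resume after a crash: the helpers below are hypothetical (they are not part of GENIE3), and the estimate assumes that runtime grows roughly linearly with the number of target genes, which may not hold exactly on every setup. `runChunk` is a placeholder for whatever function runs one subset.

```r
# Hypothetical helper: extrapolate the remaining time from the genes finished
# so far, assuming runtime is roughly linear in the number of target genes.
estimateRemaining <- function(elapsedSecs, genesDone, genesTotal) {
  stopifnot(genesDone > 0, genesDone <= genesTotal)
  secsPerGene <- elapsedSecs / genesDone
  secsPerGene * (genesTotal - genesDone)
}

# Hypothetical wrapper: run every subset that has no saved file yet and
# report an estimate of the remaining time after each one.
runChunks <- function(genesSplit, runChunk, outPrefix = "GENIE3_weightMatrix_") {
  outFiles <- paste0(outPrefix, seq_along(genesSplit), ".RData")
  pending <- which(!file.exists(outFiles))  # skip subsets saved by an earlier run
  genesTotal <- sum(lengths(genesSplit[pending]))
  genesDone <- 0
  started <- Sys.time()
  for (i in pending) {
    weightMatrix <- runChunk(genesSplit[[i]])
    save(weightMatrix, file = outFiles[i])
    genesDone <- genesDone + length(genesSplit[[i]])
    elapsed <- as.numeric(difftime(Sys.time(), started, units = "secs"))
    message(sprintf("subset %d done; roughly %.1f h remaining", i,
                    estimateRemaining(elapsed, genesDone, genesTotal) / 3600))
  }
  invisible(pending)  # indices of the subsets that were run in this call
}

# Worked example: 1,200 of 12,000 target genes took 3 hours
print(estimateRemaining(3 * 3600, 1200, 12000) / 3600)  # 27 (hours)

# With the objects from the example above, a call might look like:
# runChunks(genesSplit, function(targets) {
#   set.seed(93827)
#   GENIE3(exprMatrix_filtered, regulators=inputTFs, nCores=24, targets=targets)
# })
```

Re-running the same call after a crash would only process the subsets whose files are missing, since the finished ones are skipped.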

# Merge results:
library(GENIE3)
linkList_list <- list()
for(i in 1:10)
{
  # File name must match the one used in save() above
  load(paste0("GENIE3_weightMatrix_",i,".RData"))
  linkList_list[[i]] <- getLinkList(weightMatrix)
}
length(linkList_list)
sapply(linkList_list, nrow)

linkList <- do.call(rbind, linkList_list)
colnames(linkList) <- c("TF", "Target", "weight")
linkList <- linkList[order(linkList[,"weight"], decreasing=TRUE),]
linkList <- linkList[which(linkList[,"weight"]>0),]
nrow(linkList)
head(linkList)
save(linkList, file="GENIE3_linkList.RData")
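Since a crashed run can leave some subset files missing, it may be worth checking for them before merging. The following is only a sketch: `missingChunks` and `loadChunk` are hypothetical helpers (not GENIE3 functions), and the file prefix is assumed to be the one used when saving the subsets.

```r
# Hypothetical helper: list the expected subset files that are not on disk.
missingChunks <- function(prefix, n) {
  files <- paste0(prefix, seq_len(n), ".RData")
  files[!file.exists(files)]
}

# Hypothetical helper: load one subset into its own environment and return
# the object named `weightMatrix`, without touching the global workspace.
loadChunk <- function(file) {
  env <- new.env()
  load(file, envir = env)
  stopifnot(exists("weightMatrix", envir = env, inherits = FALSE))
  get("weightMatrix", envir = env)
}

# Possible use before the merge loop:
# missing <- missingChunks("GENIE3_weightMatrix_", 10)
# if (length(missing) > 0) stop("Missing subsets: ", paste(missing, collapse = ", "))
# linkList_list <- lapply(paste0("GENIE3_weightMatrix_", 1:10, ".RData"),
#                         function(f) getLinkList(loadChunk(f)))
```

Loading into a separate environment avoids silently reusing a `weightMatrix` left over from a previous iteration if a file turns out not to contain one.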

davetang commented 7 years ago

Brilliant, thanks!