mllg / batchtools

Tools for computation on batch systems
https://mllg.github.io/batchtools/
GNU Lesser General Public License v3.0

Speedup reduceResultsDataTable with many (millions) of Jobs #231

Open karchjd opened 5 years ago

karchjd commented 5 years ago

I am using batchtools to run a simulation study with a very large number (18 million) of very short jobs. I chunk those jobs into 100 chunks, which makes them run quite fast on our cluster (around 24 hours for all jobs). The bottleneck now seems to be gathering all results, which I do using the following code:

reg <- loadRegistry("registry")

# extract the p-value from either of the two result types
extractF <- function(x) {
  if (inherits(x, "htest")) {
    pval <- x$p.value
  } else {
    pval <- x$WTS[4]
  }
  list(pval = pval)
}

resultsRaw <- unwrap(reduceResultsDataTable(fun = extractF))

I have also disabled the progress bar.
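
For reference, the chunking and the progress-bar option look roughly like this on my end (a sketch; findNotSubmitted() and n.chunks = 100 stand in for my actual submission script):

library(batchtools)
library(data.table)

# disable the progress bar for all batchtools operations
options(batchtools.progress = FALSE)

# group all jobs into 100 chunks, so the scheduler only sees 100 jobs
ids <- findNotSubmitted()
ids[, chunk := chunk(job.id, n.chunks = 100)]
submitJobs(ids)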

I ran a smaller version of the same experiment with 1 million jobs. Gathering the results took about one hour on one of the nodes of our cluster, so my estimate is that it will take around 18 hours to gather the results of the 18 million jobs. I tried moving the registry to my local SSD, but that seemed even slower.

I have two result types, depending on which algorithm is used. They are around 370 and 470 bytes in size.

My guess is that it is so slow because there are so many small result files. So I would predict that saving all results of one chunk in a single file would make gathering the results substantially faster. Is this possible?
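
In the meantime I am considering reducing the results in blocks of job ids and binding the pieces, mainly so the collection step can be parallelised (a sketch; the block size of 100000 and mc.cores = 8 are arbitrary choices, not something batchtools prescribes):

library(batchtools)
library(data.table)

reg <- loadRegistry("registry")

# split the finished job ids into blocks
done <- findDone()
blocks <- split(done$job.id, ceiling(seq_along(done$job.id) / 100000))

# reading result files is read-only, so the blocks can be reduced in parallel forks
parts <- parallel::mclapply(blocks, function(ids) {
  reduceResultsDataTable(ids = ids, fun = extractF)
}, mc.cores = 8)

resultsRaw <- unwrap(rbindlist(parts))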

jakob-r commented 5 years ago

Related discussion: https://github.com/mllg/batchtools/issues/222

karchjd commented 5 years ago

@jakob-r thanks for pointing me to this indeed very relevant discussion. It seems the best approach for now is to either use another package (for example, clustermq) or to manually chunk many jobs. Adding a chunking option which also chunks the result files seems like a worthwhile feature.
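
For the record, "manually chunk many jobs" could look roughly like the sketch below: one batchtools job per block of parameter settings, so only one result file is written per block. The grid, doOne() and the block size of 4 are made-up placeholders, not part of my actual study.

library(batchtools)

reg <- makeRegistry(file.dir = NA)  # throwaway registry for the sketch

# hypothetical stand-ins: a small parameter grid and a function that runs one setting
grid <- expand.grid(n = c(10, 20, 50), effect = c(0, 0.2, 0.5, 0.8))
doOne <- function(pars) t.test(rnorm(pars$n, mean = pars$effect))$p.value

# one job per block of rows instead of one job per row
blocks <- split(seq_len(nrow(grid)), ceiling(seq_len(nrow(grid)) / 4))
batchMap(function(rows, grid, doOne) vapply(rows, function(i) doOne(grid[i, ]), numeric(1)),
         rows = blocks, more.args = list(grid = grid, doOne = doOne), reg = reg)

submitJobs(reg = reg)
waitForJobs(reg = reg)
pvals <- unlist(reduceResultsList(reg = reg))  # one result file per block instead of one per setting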

mllg commented 5 years ago

> Adding a chunking option which also chunks the result files seems like a worthwhile feature.

Yes, this is on my todo, but I'm terribly busy with other projects at the moment.