how to make RedDog fast?

MostafaYA commented 5 years ago

Hi, are there any tips for running the RedDog pipeline faster? It takes a long time even on only 2 samples. Do I understand correctly that each sample is processed on only one CPU, even if more CPUs are available? Do you recommend parallelization using the "parallel" command?

Thanks

quocviet0908 commented 5 years ago

Hi there, from what I've read in the manual, you should only run 1 sample/1 pipeline at a given time. You can increase the number of CPUs to boost performance (I think) by editing the config file; the manual already shows how to do that.

d-j-e commented 5 years ago

Hi MostafaYA,

You don't say what system you are running on. RedDog is designed to run on a distributed system, where lots of jobs are sent out. And yes, we make do with one core per read set (sample), as many of our data sets contain hundreds or thousands of samples. You could tinker with the commands in the config file to use more than one core for certain steps, though you would also have to add the extra CPUs to the command itself... A lot of the shorter steps don't really need to be parallelised. Happy to help if you want to give it a try, though I am no 'parallel' expert.
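
To illustrate the "one core per sample, many jobs at once" idea, here is a minimal Python sketch. It is not RedDog code and does not touch the RedDog config: `run_sample`, `run_all` and the placeholder command are all hypothetical names made up for this example, and the placeholder only prints the sample name where a real single-core step would go. On a single multi-core machine, the speed-up in this model comes from running several samples side by side rather than giving one sample more cores.

```python
from concurrent.futures import ThreadPoolExecutor
import subprocess
import sys


def run_sample(sample):
    """Run one single-core job for one sample and return (sample, exit code)."""
    # Placeholder command (an assumption for this sketch): it stands in for
    # whatever single-core per-sample step you would actually run.
    cmd = [
        sys.executable,
        "-c",
        "import sys; print('processing', sys.argv[1])",
        sample,
    ]
    result = subprocess.run(cmd, capture_output=True, text=True)
    return sample, result.returncode


def run_all(samples, max_jobs):
    """Run at most max_jobs single-core jobs at the same time."""
    # Threads are enough here: each one just waits on an external process.
    with ThreadPoolExecutor(max_workers=max_jobs) as pool:
        return dict(pool.map(run_sample, samples))


if __name__ == "__main__":
    exit_codes = run_all(["sampleA", "sampleB"], max_jobs=2)
    print(exit_codes)
```

Setting `max_jobs` to the number of free cores keeps each job on roughly one core, which mirrors the one-core-per-read-set model described above. Steps that combine results across all samples would still have to wait for every per-sample job to finish.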

katholt / RedDog

how to make RedDog fast? #64