percyfal / ratatosk

Apache License 2.0

Control number of threads / worker #39

Open percyfal opened 11 years ago

percyfal commented 11 years ago

How can we control the number of workers/threads in use? An example best explains the issue. Alignment with bwa aln can be run with multiple threads, whereas bwa sampe is single-threaded and uses ~5.4GB RAM for the human genome. Our current compute cluster has 8-core nodes with 24GB RAM. One solution would be to run 8 samples per node: run bwa aln -t 8 on each sample sequentially, wrap these tasks with a WrappedTask, and then proceed with bwa sampe, which should use only 4 workers simultaneously, since 24GB divided by ~5.4GB per process leaves room for 4. For small samples this is OK. For large samples, one might instead partition the pipeline into an alignment step, in which one sample is run per node, followed by a step that groups the remaining tasks and samples into reasonably sized groups. This latter approach would probably benefit from SLURM/drmaa integration (see following item).
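To make the worker arithmetic concrete, here is a minimal sketch of how a per-node worker limit could be derived from cores and memory. The function name `max_concurrent_tasks` and its parameters are hypothetical and not part of ratatosk; only the node size (8 cores, 24GB) and the ~5.4GB figure for bwa sampe come from the description above.

```python
import math


def max_concurrent_tasks(node_cores, node_ram_gb, threads_per_task, ram_per_task_gb):
    """Return how many copies of a task fit on one node at the same time.

    Hypothetical helper: the limit is whichever runs out first,
    CPU cores or RAM.
    """
    if threads_per_task < 1 or ram_per_task_gb <= 0:
        raise ValueError("threads_per_task must be >= 1 and ram_per_task_gb > 0")
    by_cores = node_cores // threads_per_task
    by_ram = math.floor(node_ram_gb / ram_per_task_gb)
    return min(by_cores, by_ram)


# bwa sampe: single-threaded, ~5.4GB RAM per process on an 8-core, 24GB node.
sampe_workers = max_concurrent_tasks(
    node_cores=8, node_ram_gb=24.0, threads_per_task=1, ram_per_task_gb=5.4
)
print(sampe_workers)  # 4: memory, not cores, is the limiting resource
```

For bwa sampe the cores would allow 8 workers, but memory caps the count at 4, which matches the figure given above. A task run with 8 threads on the same node would be capped at 1 by the core count, whatever its memory use.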