chanzuckerberg / miniwdl

Workflow Description Language developer tools & local runner

Scheduler's task_concurrency is not working (because it isn't in CPUs?) #705

adamnovak commented 3 months ago

I'm running a workflow like this:

    MINIWDL__CALL_CACHE__GET=true \
    MINIWDL__CALL_CACHE__PUT=true \
    MINIWDL__CALL_CACHE__DIR=`pwd`/output/miniwdl-cache \
    MINIWDL__SCHEDULER__TASK_CONCURRENCY=80 \
    time miniwdl run --input ./inputs3.json --dir output/miniwdl-runs \
        -o miniwdl-result.json \
        https://raw.githubusercontent.com/vgteam/vg_wdl/lr-giraffe/workflows/giraffe_and_deepvariant.wdl
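(For reference, the MINIWDL__SECTION__KEY environment variables correspond to [section] keys in miniwdl's INI-style configuration file, so the equivalent settings in a config file would look something like this; the cache directory path is illustrative:)

    [call_cache]
    get = true
    put = true
    dir = /path/to/output/miniwdl-cache

    [scheduler]
    task_concurrency = 80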

My machine has 160 cores, but I want to limit the workflow to 80 so that some cores are available for other users.

But my terminal shows:

    0:04:31 elapsed    ▬▬         tasks finished: 43, ready: 5, running: 10    reserved CPUs: 160, RAM: 1118GiB

Even though I set MINIWDL__SCHEDULER__TASK_CONCURRENCY to 80, the workflow is still using 160 cores across 10 running tasks.

Is MINIWDL__SCHEDULER__TASK_CONCURRENCY actually measured in tasks, independent of how many CPUs each task requests? If so, why does it default to the number of cores on the machine? Is there no way to limit a miniwdl execution to, say, either 8 tasks of 10 cores each or 80 tasks of 1 core each, by capping the total CPUs in use?

adamnovak commented 3 months ago

See also: https://github.com/chanzuckerberg/miniwdl/issues/244 where the feature was originally specced out.

mlin commented 3 months ago

@adamnovak By default, miniwdl delegates container scheduling to Docker Swarm, so it doesn't directly control how many CPUs will be scheduled concurrently. Maybe there's a way to reconfigure Docker Swarm's scheduler to limit how many CPUs it will occupy?
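That would explain the numbers above: with 160 reserved CPUs across 10 running tasks, each task appears to reserve 16 CPUs, so Swarm packs the node by CPU (160 / 16 = 10 tasks) long before an 80-task cap could bind. Conceptually, each task submission carries a per-task CPU reservation, roughly analogous to (illustrative command, not what miniwdl literally runs):

    docker service create --reserve-cpu 16 <image> <task command>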

The [scheduler] task_concurrency setting indeed bounds the number of tasks submitted to Docker Swarm -- more specifically, the size of the thread pool miniwdl uses to drive tasks concurrently (one orchestration thread per task). I don't see a more reasonable default for that than the number of CPUs, though I agree it can invite misinterpretation. That setting matters more in the cloud backends, where it limits, for example, how many concurrent tasks will be hitting the shared filesystem.
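To illustrate the distinction (a minimal sketch, not miniwdl's actual code; the names drive_task and TASK_CONCURRENCY are made up): the pool caps tasks in flight, while CPUs are reserved per task by the backend, so a few wide tasks can occupy every core without approaching the task cap.

    # Sketch: a task-count cap does not bound total reserved CPUs.
    from concurrent.futures import ThreadPoolExecutor

    TASK_CONCURRENCY = 80  # hypothetical cap on concurrent *tasks*, not CPUs

    def drive_task(name: str, cpu_request: int) -> int:
        # In miniwdl, one orchestration thread per task submits a container
        # to the backend (e.g. Docker Swarm), which handles CPU placement.
        return cpu_request

    tasks = [(f"task{i}", 16) for i in range(10)]  # 10 tasks x 16 CPUs, as above

    with ThreadPoolExecutor(max_workers=TASK_CONCURRENCY) as pool:
        reserved = sum(pool.map(lambda t: drive_task(*t), tasks))

    print(f"running tasks: {len(tasks)}, reserved CPUs: {reserved}")
    # -> running tasks: 10, reserved CPUs: 160 -- the 80-task cap never binds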