widdowquinn / pyani

Application and Python module for average nucleotide identity analyses of microbes.
http://widdowquinn.github.io/pyani/
MIT License

Although I specify `--workers`, mummer/nucmer is still running on the maximum number of cores. How can I limit the number of cores for nucmer? #122

Open · widdowquinn opened this issue 5 years ago

widdowquinn commented 5 years ago

Originally posted by @sarah872 in https://github.com/widdowquinn/pyani/issues/20#issuecomment-440969428

widdowquinn commented 5 years ago

Thanks for the report, Sarah. I can't seem to reproduce the behaviour you see. For me, using:

and executing the command described in #20:

average_nucleotide_identity.py -v -f \
  -i tests/test_ani_data/ \
  -o tests/test_ANIm_output/ \
  -g --gformat png,pdf,eps \
  --classes tests/test_ani_data/classes.tab --labels tests/test_ani_data/labels.tab \
  --workers 1

I see only one mummer process.

With the alternative command:

average_nucleotide_identity.py -v -f \
  -i tests/test_ani_data/ \
  -o tests/test_ANIm_output/ \
  -g --gformat png,pdf,eps \
  --classes tests/test_ani_data/classes.tab --labels tests/test_ani_data/labels.tab \
  --workers 4

I see four mummer processes.

For me, --workers seems to work as expected.

Can you please describe your system and how you're running the command?

sarah872 commented 5 years ago

I am using pyani 0.2.7 on CentOS Linux 7 (Core) with mummer 4.0.0beta2.

This is the command I am running:

average_nucleotide_identity.py -i test/ -o out --workers 3

There are 16 nucleotide sequences in test/. Although I expect mummer to run on 3 cores, it uses all available cores.

widdowquinn commented 5 years ago

Hi Sarah,

The --workers option governs the number of concurrent mummer jobs, which only indirectly determines the number of cores that mummer uses. mummer 3 is single-threaded, but mummer 4 is multithreaded.
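To illustrate the distinction, here is a minimal sketch (not pyani's actual implementation; the genome names and filenames are made up) in which a pool of three worker processes plays the role of --workers 3: the pool caps how many nucmer jobs run at once, but says nothing about how many cores each nucmer process uses.

import subprocess
from itertools import combinations
from multiprocessing import Pool

def run_nucmer(pair):
    # One pairwise comparison; with mummer 4 this single process may
    # itself spread across several cores.
    ref, qry = pair
    return subprocess.run(
        ["nucmer", "-p", f"{ref}_vs_{qry}", f"{ref}.fna", f"{qry}.fna"],
        capture_output=True,
    ).returncode

if __name__ == "__main__":
    genomes = ["genomeA", "genomeB", "genomeC", "genomeD"]  # hypothetical inputs
    pairs = list(combinations(genomes, 2))  # all pairwise comparisons
    with Pool(processes=3) as pool:  # analogue of --workers 3: 3 jobs at a time
        results = pool.map(run_nucmer, pairs)

With mummer 3 each of those processes stays on a single core, so three workers means three cores; with mummer 4 each process may spawn extra threads, which is why total core usage can exceed the worker count.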

As a user, on your multicore machine would you prefer to run a single mummer job using all available cores, or multiple single-threaded mummer jobs?

L.

sarah872 commented 5 years ago

I see! I will try it with mummer 3 then. For me, being able to specify the number of cores is what makes a program flexible enough to run on, e.g., a cluster, and it is therefore essential for my working environment. My preferred option would be one job on multiple cores. Thanks!

widdowquinn commented 5 years ago

As a sensible change to pyani behaviour, I think I'll have to test for which version of mummer is present, then do one of the following:

- generate the appropriate mummer command line so that each comparison behaves consistently (e.g. single-threaded) whichever version is installed, or
- expose extra command-line options so that the user controls how cores are shared between comparisons.

For a cluster like ours (a common setup), I'm not sure it's straightforward to tailor the total number of cores requested to match those available on a node, but we can specify a minimum number of free cores for each single job (though arraying those jobs may add a layer of complexity). It's something I'll have to look into for the next (impending) version.

My feeling is that the ultimate control will be in the hands of the user, who will have to use pyani parameters to balance the number of simultaneous comparison jobs against the cores available to each job, to their best advantage. Right now I'm targeting either an SGE-like scheduler or local multicore execution, as these are what I have available to me. Any advice on how better to manage this is welcome.
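For example, on an SGE-like scheduler a fixed number of slots can be reserved per comparison job through a parallel environment. This is only a rough sketch: the smp parallel environment name and the job script name are assumptions and will vary between sites.

import subprocess

def submit_comparison(job_script, cores=4, dry_run=True):
    # Reserve a fixed number of SGE slots (cores) for one comparison job.
    # The "smp" parallel environment name is site-specific; adjust to
    # match your cluster's configuration.
    cmd = ["qsub", "-cwd", "-pe", "smp", str(cores), job_script]
    if dry_run:
        print(" ".join(cmd))
        return None
    return subprocess.run(cmd, capture_output=True, text=True)

submit_comparison("run_one_comparison.sh", cores=4)

Arraying many such submissions, one per comparison, is where the extra layer of complexity mentioned above comes in.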

baileythegreen commented 2 years ago

@widdowquinn Please advise as to the relevance of this issue in the current state of the repo. This may play into some of the batching changes that need to be made, for instance.

widdowquinn commented 2 years ago

This is relevant to the current state of the repo and does, as you note, relate to how we move to a new SLURM-friendly batching approach.

The key issue here, I think, is how much control we provide the user over how jobs are distributed. This requires us to take into account how the underlying tool that is called distributes its jobs.

As described above, we could rely on mummer3 running a single job on a single thread/core per comparison. This appears not to be the case with mummer4, which runs the required number of simultaneous comparisons but "spreads out" threadwise to use as much processing capability as possible.

So, when we detect which version of mummer is in place on the user's machine, we need to be able to construct the appropriate command line to produce the consistent behaviour we want.
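A rough sketch of that detection step might look like the following. It assumes the installed nucmer responds to --version, which should be checked per release; older mummer3 builds may print version information to stderr, so both output streams are inspected.

import re
import subprocess

def nucmer_major_version(nucmer_exe="nucmer"):
    # Ask nucmer for its version and pull out the major number.
    # Assumption: the installed nucmer understands --version.
    proc = subprocess.run([nucmer_exe, "--version"], capture_output=True, text=True)
    text = proc.stdout + proc.stderr  # mummer3 may write to stderr
    match = re.search(r"(\d+)\.\d+", text)  # e.g. "3.1" or "4.0.0beta2"
    return int(match.group(1)) if match else None

print(nucmer_major_version())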

If we want to enforce behaviour such that --workers controls the number of comparison jobs and makes this equal to the number of cores used (the current implementation), we will have to use distinct command-lines for mummer3 and mummer4. If we want to put that control in the hands of the user instead, we'll have to modify the CLI to give them that extra control.
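As a sketch of what those distinct command-lines might look like, assuming a mummer4 nucmer that accepts -t/--threads to cap its own thread count (the filenames below are placeholders):

def build_nucmer_cmd(ref, qry, prefix, major_version):
    # Build a nucmer command so that each comparison uses one core,
    # making --workers the only knob controlling total core usage.
    cmd = ["nucmer", "-p", prefix]
    if major_version >= 4:
        # Assumption: this nucmer build accepts -t/--threads; cap one
        # comparison at a single thread so one job == one core.
        cmd += ["-t", "1"]
    # mummer3's nucmer is single-threaded, so no extra flag is needed.
    return cmd + [ref, qry]

print(build_nucmer_cmd("ref.fna", "qry.fna", "ref_vs_qry", major_version=4))

The alternative, as noted, is to expose the thread count through pyani's own CLI and let the user balance the number of workers against the threads given to each job.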