Speeding up external tools by setting default number of CPUs?

ericmjl commented 2 years ago

Hi AF2 team!

I'm chiming in to see if you all might be open to modifying some of the Python wrappers to external tools such that they automatically use all CPUs available?

I did a quick audit of alphafold/data/tools/*.py, and found the following class definitions that use n_cpus in their init:

Instead of hard-coding the n_cpu argument, I was thinking it might be possible to automagically use all CPU cores by using:

import os
...
n_cpu = os.cpu_count()

The primary motivation here is to speed up MSA calculations, which are relatively time consuming.

Would this be something you'd be amenable to accepting as a PR? If so, I'm happy to make a PR here. If not, happy to close out this issue too!

BastianKalcher commented 2 years ago

hello ericmjl

I had a similar idea and wanted to increase the number of CPUs. Unfortunately the adjustment of the variable n_cpus has no effect on the command that is finally executed. I have also made similar adjustments to hhsearch.py https://github.com/deepmind/alphafold/blob/624a44966619218f546852863f0f9220fc9c2849/alphafold/data/tools/hmmsearch.py#L92 without success :/

Either I have a basic thinking error in my attempts to increase the CPU count, or the number of CPUs is still influenced by other factors.

But it's nice to know that there are others who think that 4 or 8 cores are a bit low ^^.

Update After I recreated the docker image, the changed variables were applied.

lucajovine commented 2 years ago

Update After I recreated the docker image, the changed variables were applied.

I confirm that this indeed does the job. After:

perl -p -i -e 's/n_cpu: int = \d+,/n_cpu: int = 15,/' alphafold/data/tools/hhblits.py
perl -p -i -e "s/'--cpu', '\d+'/'--cpu', '15'/" alphafold/data/tools/hmmsearch.py
perl -p -i -e "s/n_cpu: int = \d+/n_cpu: int = 15/" alphafold/data/tools/jackhmmer.py

followed by rebuilding of the docker image, the hhblits/hmmsearch/jackhmmer steps can use a larger number of CPUs (in this case, 15) than the default 4/8/8.

Running inferences on a few targets, I can't say that this saves an enormous amount of time (on average ~7%), but of course one could use more than 15 CPUs (and anyway, every little bit helps!).

@ericmjl @BastianKalcher I guess this issue can be closed?

ericmjl commented 2 years ago

@lucajovine thanks for confirming that the extra CPU cores help! I’m wondering if we might be able to keep the issue open to get a second opinion from the AF2 maintainers? I have a hunch that the code changes I’ve proposed will bring benefits for everyone who uses AF2’s source code, without needing to rely on an on-the-fly hard-coded patch, though I do want to get the maintainers’ thoughts before deciding.

lucajovine commented 2 years ago

@ericmjl absolutely! Ideally one would want something builtin, which - considering that one cannot split a single prediction over multiple GPUs - could be something like the following:

import multiprocessing
import subprocess

def af2_external_tool_cpus():
    cpus = multiprocessing.cpu_count()
    try:
        gpus = len(subprocess.check_output(['nvidia-smi','-L']).decode('utf-8').strip().split('\n'))
    except OSError:
        gpus = 0
    try:
        return int((cpus-2)/gpus) 
    except ZeroDivisionError:
        return "ERROR: No GPU detected"

This returns a CPU number per GPU that leaves a couple of processors free for other jobs and could be assigned to n_cpu/--cpu/n_cpu in hhblits.py/hmmsearch.py/jackhmmer.py ...

bernt-matthias commented 2 years ago

I have a similar problem. For our HPC system I have to specify how many CPUs are used for a job. If the job underutilises the requested CPUs our admins will kill the job.

Currently the tasks executed by alphafold use an inconsistent number of CPUs. So it would be nice if there would be a global --cpus X parameter that would apply to all tasks.

Maybe using the defaults as they are at the moment of the argument is not given and all CPUs if --cpus 0.

I guess withe the suggestions made in https://github.com/deepmind/alphafold/issues/527#issuecomment-1186494772 and https://github.com/deepmind/alphafold/issues/527#issuecomment-1186799272 it should be relatively easy to create a PR ...?

claczny commented 1 year ago

I second that suggestion.

One thing I am not sure about though is whether the auto-detect is necessary or would be a "stretch goal". But it for sure it seems that there is, for some tools, no runtime way of specifying the number of CPUs that should be used, which is ... unexpected IMHO.

So how about having a command-line parameter that first sets the CPUs to a specified value?

google-deepmind / alphafold

Speeding up external tools by setting default number of CPUs? #527