Closed: OlgaGKononova closed this issue 11 months ago.
Dear @OlgaGKononova, thanks for reporting. This is not expected behavior :) We will look into it.
Dear @isayev, thank you for the reply. Do you have an estimate of how long it may take on your side to fix this? We would like to decide whether it is worth waiting for the fix or proceeding with the current version.
Also, if I had enough SMILES to fill one GPU, would the code automatically occupy another GPU with the rest of the SMILES, or would it still wait for the busy GPU to finish? In other words, is the code able to parallelize over multiple GPUs?
Thank you.
Dear @OlgaGKononova,
Thank you for your follow-up.
I would recommend proceeding with the latest version (2.1.0). I was unable to reproduce this issue on our local cluster with 3 GPUs, so it may be a hardware-related problem, and it will likely take some time to pinpoint the exact cause.
Based on your screenshot, no memory or compute resources were consumed on GPUs 2, 3, and 4, so this issue probably won't affect any existing processes on those GPUs.
Currently, Auto3D does not support parallelization across multiple GPUs. It utilizes a single GPU and divides the input SMI file into smaller jobs, running each one concurrently on that GPU. You can specify which GPU to use with the gpu_idx argument:
python Auto3D_pkg/auto3D.py tests/input_corrected.smi --k=5 --enumerate_isomer false --capacity 1 --gpu_idx=4
In the above case, the GPU at index 4 will be used; otherwise the command does the same job as your previous one. By default, tautomers are not enumerated, so I dropped the redundant --enumerate_tautomer false.
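For completeness: if you prefer calling Auto3D from Python rather than the CLI, the same run would look roughly like the sketch below. The options/main entry points and keyword names are assumed to mirror the CLI flags; please check the package README for the exact signature.

from Auto3D.auto3D import options, main

if __name__ == "__main__":
    path = "tests/input_corrected.smi"
    # keyword names assumed to mirror the CLI flags above
    args = options(path, k=5, enumerate_isomer=False, capacity=1,
                   use_gpu=True, gpu_idx=4)
    out = main(args)  # returns the path to the output SDF
    print(out)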
Thank you for the update!
UPD: I also found that the --gpu_idx flag is somehow ignored: no matter what I put there, the computations run on GPU 0.
That is surprising, since the value of gpu_idx goes directly into torch.device(f"cuda:{gpu_idx}"). To check whether it is an Auto3D issue or something hardware-related, could you try running a generic PyTorch script and see if you can control which GPU is used?
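For example, a minimal check along these lines (the index 4 is just an example; watch nvidia-smi in another terminal to see which GPU fills up):

import torch

gpu_idx = 4  # example index; change to the GPU you want to test
device = torch.device(f"cuda:{gpu_idx}")
x = torch.randn(4096, 4096, device=device)  # allocates memory on that GPU
y = x @ x                                   # runs a kernel on that GPU
print(y.device)                             # should print cuda:4
print(torch.cuda.memory_allocated(device))  # nonzero if the allocation landed there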
One workaround is to set the CUDA_VISIBLE_DEVICES environment variable in front of your command: CUDA_VISIBLE_DEVICES="your_gpu_idx" python Auto3D_pkg/auto3D.py tests/input_corrected.smi --k=5 --enumerate_isomer false
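One caveat worth noting (this is standard CUDA behavior, not specific to Auto3D): masking with CUDA_VISIBLE_DEVICES renumbers the visible devices, so the selected GPU appears as cuda:0 inside the process. A quick way to see this:

CUDA_VISIBLE_DEVICES=4 python -c "import torch; print(torch.cuda.device_count(), torch.cuda.current_device())"
# prints "1 0": only one device is visible, and it is indexed as cuda:0

So when combining this workaround with Auto3D, leave gpu_idx at 0.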
Hello @OlgaGKononova, Auto3D now supports running jobs with multiple GPUs. To use multiple GPUs, you just need to pass the GPU indexes to the gpu_idx parameter as a comma-separated string. For example, --gpu_idx=0,1,2 will use the GPUs at indexes 0, 1, and 2.
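For instance, re-running the earlier command on three GPUs would look like this (same CLI entry point as above, assuming your installed version includes the multi-GPU update):

python Auto3D_pkg/auto3D.py tests/input_corrected.smi --k=5 --enumerate_isomer false --gpu_idx=0,1,2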
I will close it for now. Please let us know if you have additional questions.
Hi there!
I am running Auto3D on 200 SMILES with the --use_gpu flag set to True, and I found that it blocks all the available GPUs I have on the machine but runs calculations on only one of them.
For the run: (command screenshot omitted)
There is the following output in the log: (log screenshot omitted)
And nvidia-smi output (see the cocoa_env/bin/python processes): (screenshot omitted)
So, it only considers one GPU but blocks 5. Is this an intended behavior? Is the code able to parallelize over multiple GPUs? I tried tuning the --capacity option, but it seems to give the same result.