atomistic-machine-learning / schnetpack

SchNetPack - Deep Neural Networks for Atomistic Systems
791 stars · 215 forks

Question about training/running on several GPUs #647

Closed carlosbornes closed 4 months ago

carlosbornes commented 4 months ago

Dear devs,

Have you benchmarked the behaviour of training a model or running an MD simulation in parallel on several GPUs? I found some discussion here, but it is rather mixed, or comes from old issues mentioning that SchNetPack should eventually run in parallel via PyTorch Lightning.

I'm currently writing an HPC application and would like to cite something on the ability (or inability) of SchNetPack to train and run in parallel on GPUs.

Best, Carlos

jnsLs commented 4 months ago

Dear Carlos,

we have not benchmarked the performance of SchNetPack running on several GPUs. Nevertheless, you can run SchNetPack on multiple GPUs in parallel by simply adapting some arguments of the PyTorch Lightning Trainer.
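For reference, the Trainer arguments mentioned here would look roughly like the sketch below. It uses standard PyTorch Lightning argument names, not schnetpack-specific options; the rest of the setup (task, datamodule) stays the same as for a single GPU.

```python
# Sketch of the multi-GPU part of a PyTorch Lightning Trainer setup.
# Only these arguments differ from a single-GPU run; the values shown
# (2 GPUs, DDP) are example assumptions.
trainer_kwargs = dict(
    accelerator="gpu",  # use CUDA devices
    devices=2,          # number of GPUs per node
    strategy="ddp",     # DistributedDataParallel across those GPUs
    max_epochs=100,
)

# With pytorch_lightning installed, this would be passed on as:
# import pytorch_lightning as pl
# trainer = pl.Trainer(**trainer_kwargs)
# trainer.fit(task, datamodule=datamodule)
```

With schnetpack's Hydra-based `spktrain` CLI, the same settings can typically be supplied as config overrides (e.g. `trainer.devices=2` and a DDP strategy); the exact keys depend on the installed config tree.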

Best, Jonas

carlosbornes commented 4 months ago

Thanks, I will try to do it @jnsLs

SyntaxSmith commented 3 months ago

> Dear Carlos, we have not benchmarked the performance of SchNetPack running on several GPUs. Nevertheless, you can run SchNetPack on multiple GPUs in parallel by simply adapting some arguments of the PyTorch Lightning Trainer. Best, Jonas

Multi-GPU inference with LAMMPS is not usable for me: I want to run a system so large that a single GPU goes OOM.


Is there any way to overcome this?

jnsLs commented 3 months ago

That's right, at the current state our LAMMPS interface does not support the use of multiple GPUs in parallel. Training, however, does. If you want to run MD on multiple GPUs, you could use the SchNetPack MD package; slight adaptations in the code might be needed to make it run on multiple GPUs.

SyntaxSmith commented 3 months ago

> That's right, at the current state our LAMMPS interface does not support the use of multiple GPUs in parallel. Training, however, does. If you want to run MD on multiple GPUs, you could use the SchNetPack MD package; slight adaptations in the code might be needed to make it run on multiple GPUs.

OK, thanks!