Multi-gpu simulation and simulation speed attenuation of MACE-OPENMM

ACEsuit / mace

MACE - Fast and accurate machine learning interatomic potentials with higher order equivariant message passing.

Multi-gpu simulation and simulation speed attenuation of MACE-OPENMM #181

Open clecust opened 11 months ago

clecust commented 11 months ago

Hi, I found two problems when performing molecular dynamics simulations using MACE-OPENMM.

1. Multiple GPUs can be requested without error, but only one GPU actually runs the computation.
2. As the simulation progresses, its speed gradually decreases, falling to about half of the initial speed after 24 h. This is independent of fluctuations in density (the test system is an organic system with about 10,000 atoms).

davkovacs commented 11 months ago

The OpenMM implementation only supports single-GPU MD. For multi-GPU runs, you can try the LAMMPS MACE calculator.

@jharrymoore any idea what might be causing the slowdown?

jharrymoore commented 11 months ago

Hi @clecust, is your simulation writing out a NetCDF file? I believe the issue you are seeing is related to a memory leak in the NetCDF library, which is why this reporter was removed. Could I check that your version of openmmtools is up to date with the main branch?
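One way to check whether a host-side memory leak accompanies the slowdown is to record the process's peak resident memory between blocks of MD steps. The following is a minimal sketch using only the Python standard library (Unix only). The helper names `peak_rss_mib` and `track_peak_memory` and the `run_block` callable are hypothetical and introduced here for illustration; they are not part of OpenMM or openmmtools.

```python
import resource
import sys


def peak_rss_mib():
    """Return this process's peak resident set size in MiB (Unix only)."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


def track_peak_memory(run_block, n_blocks):
    """Call run_block() n_blocks times and record peak RSS after each call.

    run_block is any zero-argument callable, assumed here to advance the
    simulation by a fixed number of steps.
    """
    history = []
    for _ in range(n_blocks):
        run_block()
        history.append(peak_rss_mib())
    return history


if __name__ == "__main__":
    # Stand-in workload that deliberately retains memory on every call.
    retained = []
    for i, mib in enumerate(track_peak_memory(lambda: retained.append(bytearray(10**7)), 5)):
        print(f"block {i}: peak RSS {mib:.1f} MiB")
```

With an existing OpenMM `Simulation` object named `simulation`, the callable could be `lambda: simulation.step(1000)`. Peak RSS never decreases, so a flat trace argues against a host memory leak, while a steady climb over many blocks is consistent with one. This measures host memory only and says nothing about GPU memory.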

clecust commented 11 months ago

> Hi @clecust, is your simulation writing out a netcdf file? I believe the issue you are seeing is related to a memory leak in the netcdf library, hence this reporter was removed. Could I check that your version of openmmtools is up to date with the main branch?

Sorry, I'm not very familiar with NetCDF files. I only save PDB and CSV files, using PDBReporter and StateDataReporter respectively. PDB files are saved every 100 steps. Here is the version information for my openmm and openmm-torch packages:

openmm        8.0.0rc2  py310h2996cf7_0         conda-forge/label/openmm_rc
openmm-torch  1.0rc1    cuda112py310h93f1983_0  conda-forge/label/openmm-torch_rc
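Since the run already writes a StateDataReporter CSV, the slowdown could be quantified from that log. The sketch below assumes the reporter was created with `step=True` and `elapsedTime=True`, so that the file contains `Step` and `Elapsed Time (s)` columns. The function names `block_rates` and `slowdown_ratio` are hypothetical, and the sample data is synthetic, not taken from the run described in this issue.

```python
import csv
import io


def block_rates(csv_text, step_col="Step", time_col="Elapsed Time (s)"):
    """Return steps per second for each interval between consecutive rows."""
    lines = csv_text.strip().splitlines()
    # StateDataReporter prefixes its header row with '#'; strip it first.
    lines[0] = lines[0].lstrip("#")
    rows = list(csv.DictReader(io.StringIO("\n".join(lines))))
    rates = []
    for prev, cur in zip(rows, rows[1:]):
        d_steps = int(cur[step_col]) - int(prev[step_col])
        d_time = float(cur[time_col]) - float(prev[time_col])
        if d_time > 0:  # skip intervals with no measurable wall-clock time
            rates.append(d_steps / d_time)
    return rates


def slowdown_ratio(rates, window=2):
    """Mean rate of the last `window` intervals divided by that of the first."""
    if len(rates) < 2 * window:
        raise ValueError("not enough intervals for the requested window")
    head = sum(rates[:window]) / window
    tail = sum(rates[-window:]) / window
    return tail / head


# Synthetic log in which throughput halves partway through the run.
sample = '''#"Step","Elapsed Time (s)"
100,1.0
200,2.0
300,3.0
400,5.0
500,7.0
'''

rates = block_rates(sample)
print(rates)                  # [100.0, 100.0, 50.0, 50.0]
print(slowdown_ratio(rates))  # 0.5
```

Working from step and elapsed-time differences gives the throughput of each interval, which shows when the slowdown sets in more directly than a single averaged speed figure would.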