GPU-enabled ANI-2x potential for MD simulations in ASE

shengjie-tang commented 1 year ago

Dear Gao and TorchANI developers,

I ran into an issue running an MD simulation with ANI on the GPU. My code is attached in case you want to have a look. My question is how to attach the GPU to the ANI-2x calculator correctly for the MD simulation: I did not see any GPU usage in Task Manager, even though I moved the model to the GPU as follows:

device = torch.device('cuda:0')
calculator = torchani.models.ANI2x().to(device).ase()  # import ANI-2x
atoms.set_calculator(calculator)  # Specify ANI-2x as the calculator

My laptop has an AMD Ryzen 9 5900HX CPU and an NVIDIA GeForce RTX 3060 Laptop GPU. Here is the complete version of my simulation code. The data file is attached as well.

from ase.io.lammpsdata import read_lammps_data
from ase.md.npt import NPT
from ase.io.lammpsdata import write_lammps_data
from ase.md.velocitydistribution import MaxwellBoltzmannDistribution
from ase.md import MDLogger
from ase import units
import torchani
from ase import *
import torch

atoms = read_lammps_data('PolyBMIM_TFSI_1_npt2.data', style='full', units='real', Z_of_type={1: 6, 2: 6, 3: 6, 4: 6, 5: 6, 6: 6, 7: 6, 8: 6, 9: 6, 10: 6, 11: 1, 12: 1, 13: 1, 14: 1, 15: 1, 16: 1, 17: 1, 18: 1, 19: 1, 20: 1, 21: 7, 22: 7, 23: 6, 24: 9, 25: 7, 26: 8, 27: 16})

print(len(atoms), "atoms in the cell")  # number of atoms in the data file
atoms.set_pbc((True, True, True))  # impose the periodic boundary conditions
print(atoms.get_cell().volume, 'Angstrom^3')  # size of the box

device = torch.device('cuda:0')
calculator = torchani.models.ANI2x().to(device).ase()  # import ANI-2x
atoms.set_calculator(calculator)  # Specify ANI-2x as the calculator

T = 300  # temperature in Kelvin
timestep = 1 * units.fs  # timestep set to 1 fs
pressure = 1 * units.bar  # pressure = 1 bar during NPT
interval = 100  # number of timesteps between printed rows of data
tpro = 100000  # number of timesteps in the NPT production run

MaxwellBoltzmannDistribution(atoms, temperature_K=T, communicator=None, force_temp=True, rng=None) # initialize the velocity of the atoms

call_count = 0 # global variable to keep track of the function printenergy

def printenergy(a=atoms, file_name="PolyBMIM_TFSI_1_ANI_self_defined.log"):
    global call_count  # we need to declare the variable as global to change it
    call_count += 1  # increment the call count
    epot = a.get_potential_energy()
    ekin = a.get_kinetic_energy() / len(a)  # kinetic energy per atom, used for T
    ekin_all = a.get_kinetic_energy()  # total kinetic energy
    volume = a.get_volume()
    mass = a.get_masses().sum()
    density = mass / volume  # amu / Angstrom^3
    energy_info = ('System Info: Epot = %.3f eV Ekin = %.3f eV (T=%3.0f K) '
                   'Etot = %.3f eV Volume = %.3f Angstrom^3 Density = %.3f g/cm^3'
                   % (epot, ekin_all, ekin / (1.5 * units.kB), epot + ekin_all, volume,
                      density * 1.66053906660e-24 / (1e-8)**3))  # amu/A^3 -> g/cm^3
    print(energy_info)
    with open(file_name, "a") as f:  # "a" means append mode
        f.write(f'{call_count * interval}: {energy_info}\n')  # write call_count and energy_info to the file

print("Beginning NPT production run...")

dyn = NPT(atoms, timestep=1 * units.fs,
          externalstress=(-1 * units.bar, -1 * units.bar, -1 * units.bar, 0, 0, 0),
          ttime=2 * units.fs, pfactor=2 * units.fs, temperature_K=T,
          trajectory='PolyBMIM_TFSI_1_ANI.traj',
          mask=([1, 0, 0], [0, 1, 0], [0, 0, 1]), append_trajectory=True)

logger_production = MDLogger(dyn, atoms, 'PolyBMIM_TFSI_1_ANI.log', header=True, stress=True, mode="a")
dyn.attach(logger_production, interval=interval)
dyn.attach(printenergy, interval=interval)
printenergy()
dyn.run(tpro)

write_lammps_data('data.PolyBMIM_TFSI_1_ANI.final', atoms, velocities=False, specorder=None, force_skew=False, prismobj=None, units='real', atom_style='full')  # write final structure to data file

I am not sure how to check whether the GPU is being used while this runs. Could you please also provide some information on that? Thank you so much for your help; I look forward to your reply. Much appreciated! (Change the attached file's extension to .data before running the code.)

PolyBMIM_TFSI_1_npt2.data.txt

isayev commented 1 year ago

Dear @shengjie-tang, thank you for using torchani and the ANI models. This is primarily a limitation of ASE: the integrator, thermostat, barostat, etc. are written in plain Python and run on the CPU. When you run your dynamics, the ANI model takes ~10 ms to evaluate forces on the GPU, and the rest of each step is bottlenecked by ASE calls. This is why you don't see any GPU load. We are working on a LAMMPS plugin; alternatively, you could try using ANI with OpenMM.
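One way to see where the time goes is to time the force evaluation separately from the full MD step. The following is a minimal sketch using only the Python standard library. `TimedCall` is a hypothetical helper, not part of ASE or TorchANI, and `fake_forces` plus the extra `sleep` are stand-ins (assumptions) for the model call and the Python-side integrator work.

```python
import time


class TimedCall:
    """Wrap a callable and accumulate the wall-clock time spent inside it.

    Hypothetical helper for illustration; not part of ASE or TorchANI.
    """

    def __init__(self, func):
        self.func = func
        self.total = 0.0  # seconds spent inside func
        self.calls = 0    # number of invocations

    def __call__(self, *args, **kwargs):
        start = time.perf_counter()
        try:
            return self.func(*args, **kwargs)
        finally:
            self.total += time.perf_counter() - start
            self.calls += 1


def fake_forces():
    """Stand-in for a force evaluation (assumed to take about 10 ms)."""
    time.sleep(0.01)


timed_forces = TimedCall(fake_forces)

run_start = time.perf_counter()
for _ in range(5):
    timed_forces()    # "model" part of the step
    time.sleep(0.02)  # stand-in for integrator/thermostat/barostat work
run_total = time.perf_counter() - run_start

fraction = timed_forces.total / run_total
print(f"{timed_forces.calls} calls, {fraction:.0%} of wall time in force evaluation")
```

With a real GPU calculator, CUDA kernels are launched asynchronously, so calling `torch.cuda.synchronize()` before reading the clock should give more reliable timings. A small fraction here would be consistent with the low GPU load described above.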

shengjie-tang commented 1 year ago

Thank you very much, Professor Isayev. I think I can run the MD simulation in ASE in multiple runs and append the trajectories to form a longer trajectory for analysis. Maybe I missed it, but does the OpenMM-ANI plugin support the ANI-2x model? I ask because my system contains F and S atoms. I do know there is an ANI plugin for OpenMM. Much appreciated!

UnixJunkie commented 9 months ago

In bash shell:

function gpu-watch () {
    watch nvidia-smi -a --display=utilization
}
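The same check can be logged from Python while the simulation runs. This is a sketch that assumes `nvidia-smi` is on the PATH; `parse_gpu_line` and `query_gpu` are hypothetical helper names, and the sample string only imitates the shape of the CSV line that the query returns.

```python
import shutil
import subprocess


def parse_gpu_line(line):
    """Parse one 'utilization, memory' CSV line into two integers."""
    util, mem = (field.strip() for field in line.split(","))
    return int(util), int(mem)


def query_gpu():
    """Return [(utilization %, memory used in MiB), ...], one tuple per GPU.

    Returns None when nvidia-smi is missing, fails, or reports
    non-numeric values.
    """
    if shutil.which("nvidia-smi") is None:
        return None
    try:
        out = subprocess.run(
            ["nvidia-smi",
             "--query-gpu=utilization.gpu,memory.used",
             "--format=csv,noheader,nounits"],
            capture_output=True, text=True, check=True,
        ).stdout
        return [parse_gpu_line(ln) for ln in out.splitlines() if ln.strip()]
    except (OSError, subprocess.CalledProcessError, ValueError):
        return None


# Illustrative line in the same shape as the nvidia-smi CSV output.
print(parse_gpu_line("37, 1024"))
print(query_gpu())
```

Calling `query_gpu()` from a function attached with `dyn.attach(..., interval=interval)` would record GPU utilization next to the energies, which makes it easier to tell whether the model is actually running on the GPU.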

aiqm / torchani

GPU-enabled ANI-2x potential for MD simulations in ASE #631