Hi, I am trying to run Grand Canonical Monte Carlo (GCMC) for water adsorption in silica using MACE. The issue is that I get a CUDA out-of-memory error after a few thousand steps (~4000 with one GPU). I also tried using more than one GPU, but I still run out of CUDA memory. I am using A100 GPUs with 64 GB of memory.
Input
This is the input file that I am using. Note that I specify the bond and angle styles as harmonic but set their coefficients to 0. This is only so that the MC moves can be run with the water molecule; in the end, the energies that I get are purely MACE-predicted energies.
units metal
boundary p p p
atom_style full
neighbor 1.0 bin
neigh_modify delay 1
pair_style mace no_domain_decomposition
atom_modify map yes
newton on
bond_style harmonic
angle_style harmonic
read_data ../part1/1_SiOwithwater.data
molecule h2omol ../H2O.mol
lattice sc 3
create_atoms 0 box mol h2omol 45585
lattice none 1
group SiO type 1 2
group H2O type 3 4
pair_coeff * * ./MACE_MPtrj_2022.9.model-lammps.pt Si O H O
bond_coeff * 0.0 0.0
angle_coeff * 0.0 0.0
delete_atoms overlap 2 H2O SiO mol yes
# Next 4 lines to count the number of water molecules
variable oxygen atom "type==3"
group oxygen dynamic all var oxygen
variable nO equal count(oxygen)
fix myat1 all ave/time 100 10 1000 v_nO file numbermolecule.dat
## the GCMC step
variable tfac equal 5.0/3.0
variable xlo equal xlo+0.1
variable xhi equal xhi-0.1
variable ylo equal ylo+0.1
variable yhi equal yhi-0.1
variable zlo equal zlo+0.1
variable zhi equal zhi-0.1
region system block ${xlo} ${xhi} ${ylo} ${yhi} ${zlo} ${zhi}
fix fgcmc H2O gcmc 100 100 0 0 65899 300 -0.5 0.1 &
mol h2omol tfac_insert ${tfac} group H2O &
full_energy pressure 10000 region system
run 45000
write_data SiOwithwater.data
write_dump all atom dump.lammpstrj
Running environment
I used the following specifications to build lammps-mace:
When I run, I load these modules:
Error message
This is the error message that I get:
RuntimeError: CUDA out of memory. Tried to allocate 6.46 GiB. GPU 0 has a total capacity of 63.42 GiB of which 6.42 GiB is free. Including non-PyTorch memory, this process has 57.00 GiB memory in use. Of the allocated memory 55.10 GiB is allocated by PyTorch, and 302.98 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
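For reference, here is a quick sanity check of the figures in the error message. It is only a minimal sketch: the numbers are copied from the message, and the variable names are my own, not anything from PyTorch or MACE.

```python
# Figures copied from the CUDA out-of-memory message, all in GiB.
total_capacity = 63.42          # "GPU 0 has a total capacity of 63.42 GiB"
free = 6.42                     # "of which 6.42 GiB is free"
requested = 6.46                # "Tried to allocate 6.46 GiB"
allocated_by_pytorch = 55.10    # "55.10 GiB is allocated by PyTorch"
reserved_unallocated = 302.98 / 1024  # 302.98 MiB converted to GiB

# Memory in use by the process, and how far the request overshoots free memory.
in_use = total_capacity - free
shortfall = requested - free

print(f"in use: {in_use:.2f} GiB")
print(f"shortfall of the failed request: {shortfall:.2f} GiB")
print(f"reserved but unallocated: {reserved_unallocated:.2f} GiB")
print(f"PyTorch-allocated share of capacity: {allocated_by_pytorch / total_capacity:.0%}")
```

So the process holds about 57.00 GiB, which matches the figure in the message, and the failed 6.46 GiB request is only about 0.04 GiB larger than the free memory. The reserved-but-unallocated pool is about 0.30 GiB, which is small next to the 55.10 GiB that PyTorch has actually allocated, so as far as I can tell most of the memory is held by live allocations rather than lost to fragmentation.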
I am not sure why I get a CUDA out-of-memory error even though my system is small. I would appreciate any insights, thanks!