potus28 opened this issue 1 year ago
Hi @potus28, for now this out-of-memory error can only be solved by switching to a node with more memory. We have an MPI version of the SGP implemented, which allows scaling to multiple nodes, but it is only supported for offline training and has not yet been integrated with on-the-fly training. We are still exploring other options, but none of them are ready for user usage yet.

I think you can try a few more fresh starts with different conditions/initializations and collect some data. Afterwards, you can use offline training on all the data you have collected from the multiple on-the-fly runs. Then you can either try the offline training as shown in our Colab tutorial or, if needed, our beta version of the MPI SGP.
Hi @YuuuXie, thank you so much for the reply and your suggestions. Just to make sure I understand: the offline training you are referring to is from Section 4 of this tutorial. For this, I would create an extxyz file that contains the ground-truth coordinates and forces for each frame I want to include in the SGP, and then run the training as instructed in the tutorial. Is this correct? Thanks!
@potus28 yes exactly.
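As a side note for anyone following this thread: because each extxyz frame is self-delimiting (an atom-count line, a comment/properties line, then one line per atom), the per-run files from several on-the-fly trainings can be pooled by simple concatenation. The helpers below are an illustrative sketch, not part of FLARE, and the file names in the usage example are hypothetical:

```python
def count_frames(path):
    """Count frames in an extxyz file by walking its atom-count headers."""
    n = 0
    with open(path) as f:
        line = f.readline()
        while line.strip():
            natoms = int(line.split()[0])
            f.readline()                  # comment / properties line
            for _ in range(natoms):
                f.readline()              # one line per atom
            n += 1
            line = f.readline()
    return n

def combine_runs(run_files, out_file):
    """Concatenate the extxyz outputs of several on-the-fly runs into one
    file for offline training; extxyz frames are self-delimiting, so
    plain concatenation yields a valid multi-frame file."""
    with open(out_file, "w") as out:
        for fname in run_files:
            with open(fname) as f:
                text = f.read()
            out.write(text if text.endswith("\n") else text + "\n")
    return count_frames(out_file)
```

Usage would look like `combine_runs(["run1_dft.xyz", "run2_dft.xyz"], "training_data.xyz")`; the same pooling can of course also be done with `ase.io.read`/`ase.io.write`.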
Awesome! Thank you so much and have a great day!
Hi @YuuuXie,
In Section 4 of the tutorial (mentioned above by @potus28), how do we optimize the hyperparameters after the offline training step with large values of train_hyps? Much appreciated!
**Describe the bug**
The system I am trying to simulate is quite large: 516 atoms in total of furfural, furfural derivatives, surface-adsorbed hydrogen, and molecular hydrogen over a molybdenum carbide surface. While I can run the on-the-fly simulation for 2432 steps with 180 calls to DFT, after the 180th call I get this error message when the program tries to update the SGP:
I'm a bit stuck on what I can do to simulate this system for longer. Any help or suggestions for simulating this system for at least 10000 frames (5 ps) without running out of memory would be much appreciated. Thanks!
**To Reproduce**
Steps to reproduce the behavior:
```python
import numpy as np  # missing in the original script, needed for np.random below

from ase import units, Atoms
from ase.io import read, write
from ase.io.trajectory import Trajectory
from ase.spacegroup import crystal
from ase.build import surface, add_adsorbate
from ase.visualize import view
from ase.md.velocitydistribution import MaxwellBoltzmannDistribution, Stationary, ZeroRotation

from flare.bffs.gp import GaussianProcess
from flare.utils.parameter_helper import ParameterHelper
from flare.bffs.mgp import MappedGaussianProcess
from flare.bffs.gp.calculator import FLARE_Calculator
from flare.learners.otf import OTF

import flare.bffs.sgp._C_flare as flare_pp
from flare.bffs.sgp.sparse_gp import SGP_Wrapper
from flare.bffs.sgp.calculator import SGP_Calculator

from ase.calculators.espresso import Espresso, EspressoProfile
from ase.calculators.mycalculators import *

np.random.seed(12345)

dft_calc = CP2KCalculator(ecut=400, functional="rVV10", scf=30, orbital_transform=True)

# Create the system.
def add_molecules(atoms, adsorbate, n=1, tolerance=2.0):
    """Try to insert n number of probe molecules"""
    ...  # body elided in the original report
```
```python
# Create initial structure.
vac_size = 10.0
nx = 4
ny = 5
atoms = surface(read("../../../resources/mp-1221498-Mo2C.cif"),
                indices=(1, 0, 1), layers=6, vacuum=vac_size)
atoms.pbc = True

add_adsorbate(atoms, "H", 1.5, offset=(0.5, 0.5))

atoms_1 = atoms.repeat((nx, ny, 1))

add_adsorbate(atoms_1, read("../../../resources/FAL.xyz"), 4.0, offset=(0.5, 0.5))
add_molecules(atoms_1, read("../../../resources/FAL.xyz"), n=1, tolerance=3.0)
add_molecules(atoms_1, read("../../../resources/MF.xyz"), n=2, tolerance=3.0)
add_molecules(atoms_1, read("../../../resources/FOL.xyz"), n=2, tolerance=3.0)
add_molecules(atoms_1, read("../../../resources/H2.xyz"), n=32, tolerance=3.0)

# Make the cell upper triangular.
pos_1 = atoms_1.positions
cell_1 = np.array(atoms_1.cell)
symbols_1 = atoms_1.symbols

positions = np.zeros(pos_1.shape)
cell = np.zeros((3, 3))
n_atoms = len(atoms_1)

cell[0, 0] = cell_1[2, 2]
cell[1, 1] = cell_1[1, 1]
cell[1, 2] = cell_1[1, 0]
cell[2, 2] = cell_1[0, 0]

for n in range(n_atoms):
    pos = pos_1[n]
    new_pos = np.zeros(3)
    new_pos[0] = pos[2]
    new_pos[1] = pos[1]
    new_pos[2] = pos[0]
    positions[n] = new_pos

atoms = Atoms(symbols=symbols_1, positions=positions, cell=cell, pbc=True)
n_atoms = len(atoms)
species_map = {6: 0, 42: 1, 8: 2, 1: 3}  # Oxygen (atomic number 8) is species 2
single_atom_energies = {0: -146.53823039765, 1: -1850.7999738766,
                        2: -431.82189234201, 3: -12.731577079986}

# Randomly jitter the atoms to give nonzero forces in the first frame.
jitter_factor = 0.1
for atom_pos in atoms.positions:
    for coord in range(3):
        atom_pos[coord] += (2 * np.random.random() - 1) * jitter_factor
```
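An aside on the position loop above: it simply maps each atom's coordinates from (x, y, z) to (z, y, x), which NumPy can express as a single slice. A small self-contained example (the array values here are illustrative stand-ins for `pos_1`):

```python
import numpy as np

# Stand-in for pos_1; the loop in the script maps (x, y, z) -> (z, y, x).
pos_1 = np.array([[1.0, 2.0, 3.0],
                  [4.0, 5.0, 6.0]])

# Reverse the last axis: produces the same result as the per-atom loop.
positions = pos_1[:, ::-1]
```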
```python
# MD settings.
temperature = 1500
timestep = 0.0005  # ps
number_of_steps = 100000
externalstress = 0
ttime = 25 * units.fs
ptime = 75 * units.fs
bm = 368.75 * units.GPa  # O-Mo2C: 314.31 :: H-Mo2C: 368.75

if ptime is not None:
    pfactor = bm * ptime**2
else:
    pfactor = None

md_engine = "NPT"
md_dict = {
    "temperature_K": temperature,
    "externalstress": externalstress,
    "ttime": ttime,
    "pfactor": pfactor,
    "mask": (0, 1, 1),
}

MaxwellBoltzmannDistribution(atoms, temperature_K=temperature)
Stationary(atoms, preserve_temperature=False)
ZeroRotation(atoms, preserve_temperature=False)
```
```python
# Create sparse GP model.
# For hyperparameters, we need to optimize cutoffs, power, N, and lmax.
cutoff = 4.25  # in A
sigma = 2.0  # in eV
power = 2  # power of the dot product kernel
kernel = flare_pp.NormalizedDotProduct(sigma, power)
cutoff_function = "quadratic"
many_body_cutoffs = [cutoff]
radial_basis = "chebyshev"
radial_hyps = [0., cutoff]
cutoff_hyps = []
n_species = len(species_map)
N = 8
lmax = 3
descriptor_settings = [n_species, N, lmax]
descriptor_calculator = flare_pp.B2(
    radial_basis, cutoff_function, radial_hyps, cutoff_hyps, descriptor_settings
)

# Set the noise values.
sigma_e = 0.001 * n_atoms  # eV (1 meV/atom)
sigma_f = 0.05  # eV/A
sigma_s = 0.0006  # eV/A^3 (about 0.1 GPa)

# Choose uncertainty type.
# Other options are "DTC" (Deterministic Training Conditional) or
# "SOR" (Subset of Regressors).
variance_type = "local"  # Compute uncertainties on local energies (normalized)

# Choose settings for hyperparameter optimization.
max_iterations = 10  # Max number of BFGS iterations during optimization
opt_method = "L-BFGS-B"  # Method used for hyperparameter optimization

# Bounds for hyperparameter optimization.
# Keeps the energy noise from going to zero.
bounds = [(sigma, None), (sigma_e, None), (sigma_f, None), (sigma_s, None)]

# Create a model wrapper that is compatible with the flare code.
gp_model = SGP_Wrapper(
    [kernel],
    [descriptor_calculator],
    cutoff,
    sigma_e,
    sigma_f,
    sigma_s,
    species_map,
    variance_type=variance_type,
    energy_training=True,
    force_training=True,
    stress_training=True,
    single_atom_energies=single_atom_energies,
    opt_method=opt_method,
    bounds=bounds,
    max_iterations=max_iterations,
)
```
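To give a rough sense of why the SGP update eventually exhausts memory: a sparse GP carries dense kernel matrices that scale with the number of sparse environments m and the number of training labels f (roughly an m x m block and an m x f block), so memory grows quadratically as DFT calls accumulate. The back-of-the-envelope estimator below is an illustrative lower bound under those assumptions, not FLARE's exact footprint, and the sparse-set size used in the example is hypothetical:

```python
def sgp_matrix_gb(n_sparse, n_labels, bytes_per_float=8):
    """Approximate size in GB of the dense m x m and m x f kernel matrices
    of a sparse GP (m sparse environments, f training labels). Illustrative
    lower bound only; the actual code stores additional arrays."""
    elements = n_sparse * n_sparse + n_sparse * n_labels
    return elements * bytes_per_float / 1e9

# For the run in this report: 180 DFT frames of 516 atoms, each frame
# contributing 3 * 516 force labels plus 1 energy and 6 stress labels.
n_labels = 180 * (3 * 516 + 7)
print(f"{sgp_matrix_gb(20000, n_labels):.1f} GB for a hypothetical 20000 sparse points")
```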
```python
# Create an ASE calculator based on the GP model.
flare_calculator = SGP_Calculator(gp_model)

init_atoms = list(range(n_atoms))  # Initial environments to include in the sparse set
output_name = "flare"  # Name of the output file
std_tolerance_factor = -0.2  # Uncertainty tolerance for calling DFT (0.01 for 1 species, 0.05 for 2, 0.1 for 3, etc.)
train_hyps = (20, 30)  # Optimize hyperparameters within this range of DFT calls
min_steps_with_model = 2  # Min number of steps between DFT calls
update_style = "threshold"  # Strategy for adding sparse environments
update_threshold = 0.02  # Threshold for determining which sparse environments to add
force_only = False  # When False, also train on energies and stresses (default is True)

otf_params = {
    "init_atoms": init_atoms,
    "output_name": output_name,
    "train_hyps": train_hyps,
    "std_tolerance_factor": std_tolerance_factor,
    "min_steps_with_model": min_steps_with_model,
    "update_style": update_style,
    "force_only": force_only,
    "update_threshold": update_threshold,
    "write_model": 3,
}

otf = OTF(
    atoms,
    dt=timestep,
    number_of_steps=number_of_steps,
    flare_calc=flare_calculator,
    dft_calc=dft_calc,
    md_engine=md_engine,
    md_kwargs=md_dict,
    **otf_params,
)

otf.run()
```
```bash
#!/bin/bash
#SBATCH --job-name=flare
#SBATCH --output=flare.o%j
#SBATCH --error=flare.e%j
#SBATCH --nodes=10
#SBATCH --ntasks-per-node=40
#SBATCH --time=48:00:00
#SBATCH --partition=400p48h
#SBATCH --qos=funded
#SBATCH --exclusive

. "$CONDA_PREFIX/etc/profile.d/conda.sh"
ulimit -s unlimited
export OMP_NUM_THREADS=1
echo "SLURM TASKS: $SLURM_NTASKS"

module purge
module load gcc/11.3.0
module load openmpi/4.0.4
module load contrib/0.1
module load cp2k/2023.1
module load mkl/2020.2

export ASE_CP2K_COMMAND="srun cp2k_shell.psmp"

export LD_PRELOAD=$MKLROOT/lib/intel64/libmkl_core.so
export LD_PRELOAD=$LD_PRELOAD:$MKLROOT/lib/intel64/libmkl_sequential.so
export LD_PRELOAD=$LD_PRELOAD:$MKLROOT/lib/intel64/libmkl_def.so
export LD_PRELOAD=$LD_PRELOAD:$MKLROOT/lib/intel64/libmkl_avx2.so
export LD_PRELOAD=$LD_PRELOAD:$MKLROOT/lib/intel64/libmkl_intel_lp64.so
export LD_PRELOAD=$LD_PRELOAD:$MKLROOT/lib/intel64/libmkl_intel_thread.so

python run.py
```
**Expected behavior**
I expect the simulation to run for at least 10000 frames with no memory issues.