FAIR-Chem / fairchem

FAIR Chemistry's library of machine learning methods for chemistry
https://opencatalystproject.org/
Other
843 stars 252 forks source link

GemNet-OC pre-trained checkpoints failing with gas molecules #426

Closed zulissi closed 6 months ago

zulissi commented 1 year ago

If loading the gemnet-OC OC20+OC22 checkpoint from https://dl.fbaipublicfiles.com/opencatalystproject/models/2022_09/oc22/s2ef/gnoc_oc22_oc20_all_s2ef.pt (and maybe others) and using it for a gas-phase relaxation:

from ase.build import molecule
from ase.optimize import LBFGS
from ocpmodels.common.relaxation.ase_utils import OCPCalculator

calc = OCPCalculator(checkpoint="gnoc_oc22_oc20_all_s2ef.pt")

# Energy for H2O
H2O_atoms = molecule("H2O")
H2O_atoms.set_pbc(True)
H2O_atoms.set_cell([20, 20, 20])
H2O_atoms.set_calculator(calc)
dyn = LBFGS(H2O_atoms)
dyn.run(fmax=0.03, steps=100)
H2O_energy = H2O_atoms.get_potential_energy()
print(f"H2O energy = {H2O_energy:.2f} eV")

there is a RunetimeError in the spherical basis function:

File /opt/ocp/ocpmodels/models/gemnet_oc/layers/spherical_basis.py:136, in SphericalBasisLayer.forward(self, D_ca, cosφ_cab, θ_cabd)
    134 def forward(self, D_ca, cosφ_cab, θ_cabd):
    135     rad_basis = self.radial_basis(D_ca)
--> 136     sph_basis = self.spherical_basis(cosφ_cab, θ_cabd)
    137     # (num_quadruplets, num_spherical**2)
    139     if self.scale_basis:

File /opt/ocp/ocpmodels/models/gemnet_oc/layers/spherical_basis.py:117, in SphericalBasisLayer.__init__.<locals>.<lambda>(cosφ, θ)
    113 elif sbf_name == "legendre_outer":
    114     circular_basis = get_sph_harm_basis(
    115         num_spherical, zero_m_only=True
    116     )
--> 117     self.spherical_basis = lambda cosφ, ϑ: (
    118         circular_basis(cosφ)[:, :, None]
    119         * circular_basis(torch.cos(ϑ))[:, None, :]
    120     ).reshape(cosφ.shape[0], -1)
    122 elif sbf_name == "gaussian_outer":
    123     self.circular_basis = GaussianBasis(
    124         start=-1, stop=1, num_gaussians=num_spherical, **sbf_hparams
    125     )

RuntimeError: cannot reshape tensor of 0 elements into shape [0, -1] because the unspecified dimension size -1 can be any value and is ambiguous

Other checkpoints seem to run fine. For example using the gemnet-t checkpoint gemnet_t_direct_h512_all.pt works fine:

from ase.build import molecule
from ase.optimize import LBFGS
from ocpmodels.common.relaxation.ase_utils import OCPCalculator

calc = OCPCalculator(checkpoint="gemnet_t_direct_h512_all.pt")

# Energy for H2O
H2O_atoms = molecule("H2O")
H2O_atoms.set_pbc(True)
H2O_atoms.set_cell([20, 20, 20])
H2O_atoms.set_calculator(calc)
dyn = LBFGS(H2O_atoms)
dyn.run(fmax=0.03, steps=100)
H2O_energy = H2O_atoms.get_potential_energy()
print(f"H2O energy = {H2O_energy:.2f} eV")

and yields

       Step     Time          Energy         fmax
LBFGS:    0 18:36:46       -0.003679        0.2646
LBFGS:    1 18:36:46       -0.010316        0.0730
LBFGS:    2 18:36:46       -0.010441        0.0246
H2O energy = -0.01 eV
AdeeshKolluru commented 1 year ago

I think it's because the pre-trained checkpoint is trained with quadruplets and we can't calculate that for H20. So this checkpoint would not work for systems with number of atoms < 4

jkitchin commented 1 year ago

I see a similar error with bulk systems:

from ocpmodels.common.relaxation.ase_utils import OCPCalculator
from ase import Atom, Atoms
import numpy as np

cp = "gemnet_oc_large_s2ef_all_md.pt"
calc = OCPCalculator(checkpoint=cp, trainer='energy')

# fcc
a = 3.6
atoms = Atoms([Atom('Pt', (0, 0, 0))],
                      cell=0.5 * a * np.array([[1.0, 1.0, 0.0],
                                               [0.0, 1.0, 1.0],
                                               [1.0, 0.0, 1.0]]),
                 pbc=True)
atoms = atoms.repeat((3, 3, 3))

atoms.calc = calc

print(atoms.get_potential_energy())

eventually errors out with RuntimeError: cannot reshape tensor of 0 elements into shape [0, -1] because the unspecified dimension size -1 can be any value and is ambiguous

that unit cell has 27 atoms in it.

AdeeshKolluru commented 1 year ago

Since you didn't set the tags for this system, by default it assumes it to be 0 and doesn't calculate quadruplets. If you could do something like

tags = np.ones(len(atoms))
atoms.set_tags(tags)

it should give you the required result

wakamiya0315 commented 1 year ago

That the GemNet-OC pre-trained checkpoints cannot be used with gas molecules with <4 atoms, Does this mean that the adsorption energies(between bulk and gas) cannot be calculated with the OCP Calculator using the published OC22 trained models (gemnet-OC, OC20+OC22 and OC20->OC22)?

AdeeshKolluru commented 1 year ago

We calculate gas phase energies by using per-atomic energy contributions which we get from a linear combination of energies of N2, H2O, CO, and H2. These pre-calculated per-atomic energies can be found in the supplementary of both papers. Here are the ones for OC22 - image

wakamiya0315 commented 1 year ago

Based on what you have told me, I have created a code to get the adsorption energy with CO on a Pt slab.

If there are any mistakes, please correct or let me know.

# import

import numpy as np
from ase import Atoms
from ase.build import fcc111, add_adsorbate
from ase.constraints import FixAtoms
from ase.build import fcc111, molecule, add_adsorbate
from ase.optimize import LBFGS
from ocpmodels.common.relaxation.ase_utils import OCPCalculator

# calculator setting

!wget -q https://dl.fbaipublicfiles.com/opencatalystproject/models/2022_09/oc22/s2ef/gnoc_oc22_oc20_all_s2ef.pt
checkpoint_path = "/content/ocp/gnoc_oc22_oc20_all_s2ef.pt"
config_yml_path = "configs/oc22/s2ef/gemnet-oc/gemnet_oc_oc20_oc22.yml"
calculator = OCPCalculator(config_yml=config_yml_path, checkpoint=checkpoint_path)

# slab(Pt)

slab =  fcc111("Pt", size=(4, 4, 4), vacuum=40.0, periodic=True)
slab.calc = calculator
opt = LBFGS(slab)
opt.run(fmax=0.01, steps=100)
E_slab = slab.get_potential_energy()

# gas(CO)

E_gas = (-7.332)+(-7.459)

# slab+gas(Pt + CO)

adslab = fcc111("Pt", size=(4, 4, 4))
adsorbate = molecule("CO")
add_adsorbate(adslab, adsorbate, 3, "bridge")
tags = np.zeros(len(adslab))
tags[48:64] = 1
tags[64:] = 2
adslab.set_tags(tags)
cons= FixAtoms(indices=[atom.index for atom in adslab if (atom.tag == 0)])
adslab.set_constraint(cons)
adslab.center(vacuum=40.0, axis=2)
adslab.set_pbc(True)

adslab.calc = calculator
opt = LBFGS(adslab)
opt.run(fmax=0.01, steps=100)
E_slab_gas = adslab.get_potential_energy()

# calculate E_adsorp

E_adsorp = E_slab_gas - (E_slab + E_gas)
print(f"E_adsorp {E_adsorp}, E_slab_gas {E_slab_gas}, E_slab {E_slab}, E_gas {E_gas}")
AdeeshKolluru commented 1 year ago

The logic looks fine to me. Few minor details to keep in mind to get improved accuracies and/or to stay more consistent with OC20: