ACEsuit / mace

MACE - Fast and accurate machine learning interatomic potentials with higher order equivariant message passing.

Change the config_from_atoms function such that you can also run .traj files and other trajectory file types #106

Closed mahpe closed 1 year ago

mahpe commented 1 year ago

Hi, I would like to use MACE with .traj files instead of xyz files. I think this would be possible if the forces, stress, energies and charges were extracted from the Atoms object using its getter methods (for example atoms.get_forces()) instead of the arrays and info dictionaries. I suggest changing the function to this:

def config_from_atoms(
    atoms: ase.Atoms,
    energy_key="energy",
    forces_key="forces",
    stress_key="stress",
    virials_key="virials",
    dipole_key="dipole",
    charges_key="charges",
    config_type_weights: Dict[str, float] = None,
) -> Configuration:
    """Convert ase.Atoms to Configuration"""
    if config_type_weights is None:
        config_type_weights = DEFAULT_CONFIG_TYPE_WEIGHTS

    energy = atoms.info.get(energy_key, atoms.get_potential_energy())  # eV
    forces = atoms.arrays.get(forces_key, atoms.get_forces())  # eV / Ang
    stress = atoms.info.get(stress_key, atoms.get_stress())  # eV / Ang
    virials = atoms.info.get(virials_key, None)
    dipole = atoms.info.get(dipole_key, atoms.get_dipole_moment())  # Debye
    # Charges default to 0 instead of None if not found
    try:
        charges = atoms.arrays.get(charges_key, atoms.get_charges())  # atomic unit
    except:
        charges = np.zeros(len(atoms))
    atomic_numbers = np.array(
        [ase.data.atomic_numbers[symbol] for symbol in atoms.symbols]
    )
    pbc = tuple(atoms.get_pbc())
    cell = np.array(atoms.get_cell())
    config_type = atoms.info.get("config_type", "Default")
    weight = atoms.info.get("config_weight", 1.0) * config_type_weights.get(
        config_type, 1.0
    )
    energy_weight = atoms.info.get("config_energy_weight", 1.0)
    forces_weight = atoms.info.get("config_forces_weight", 1.0)
    stress_weight = atoms.info.get("config_stress_weight", 1.0)
    virials_weight = atoms.info.get("config_virials_weight", 1.0)
    # fill in missing quantities but set their weight to 0.0
    if energy is None:
        energy = 0.0
        energy_weight = 0.0
    if forces is None:
        forces = np.zeros(np.shape(atoms.positions))
        forces_weight = 0.0
    if stress is None:
        stress = np.zeros(6)
        stress_weight = 0.0
    if virials is None:
        virials = np.zeros((3, 3))
        virials_weight = 0.0
    return Configuration(
        atomic_numbers=atomic_numbers,
        positions=atoms.get_positions(),
        energy=energy,
        forces=forces,
        stress=stress,
        virials=virials,
        dipole=dipole,
        charges=charges,
        weight=weight,
        energy_weight=energy_weight,
        forces_weight=forces_weight,
        stress_weight=stress_weight,
        virials_weight=virials_weight,
        config_type=config_type,
        pbc=pbc,
        cell=cell,
    )

davkovacs commented 1 year ago

Hi! Sorry, I am not sure I understand what you changed. Could you perhaps explain a little more?

mahpe commented 1 year ago

Actually I think there might be a mistake in the function I wrote. It works fine for training, but in the MD prediction it gives an error.

The whole idea of my implementation was to be able to use ASE trajectory files :) Do you think that can be made possible?
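
One possible cause of the error, not confirmed anywhere in this thread: Python evaluates the default argument of dict.get() before the lookup happens, so a call such as atoms.info.get(energy_key, atoms.get_potential_energy()) always runs the getter, even when the key is present. The sketch below illustrates this in plain Python. FakeAtoms, eager_energy and lazy_energy are hypothetical names made up for the illustration; they are not part of ASE or MACE.

```python
class FakeAtoms:
    """Hypothetical stand-in for ase.Atoms with no calculator attached."""

    def __init__(self, info):
        self.info = info
        self.calls = 0  # counts how often the getter was invoked

    def get_potential_energy(self):
        self.calls += 1
        raise RuntimeError("no calculator attached")


def eager_energy(atoms, key="energy"):
    # The second argument is evaluated first, so the getter always runs
    # (and raises here), even if `key` is stored in atoms.info.
    return atoms.info.get(key, atoms.get_potential_energy())


def lazy_energy(atoms, key="energy"):
    # Only fall back to the getter when the key is really missing.
    if key in atoms.info:
        return atoms.info[key]
    try:
        return atoms.get_potential_energy()
    except RuntimeError:
        return None  # nothing stored and nothing to compute
```

With the lazy variant, a stored value is returned without touching the calculator, and a missing value becomes None instead of an exception.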

ilyes319 commented 1 year ago

Could you post the error you get when trying to load your traj file? If ase.io.read works the same way on them, I am not sure where the problem might be.

mahpe commented 1 year ago

The energy and forces are not loaded, because atoms read from a .traj file do not carry them in the atoms.info and atoms.arrays dictionaries :) This does not raise an error by itself, but the model does not train.

ilyes319 commented 1 year ago

The problem with extracting values through the getter commands is that it can be unreliable. If a calculator is attached to the Atoms object, it can give inconsistent results. For this reason, storing your data in info with a custom key is highly recommended. In particular, .traj files are usually associated with the calculator that generated them. Is there an easy way to convert the .traj to .xyz? We could add this pre-processing.

bernstei commented 1 year ago

I think traj files store those quantities in a SinglePointCalculator, which is an essentially static storage that fills in the return values for get_potential_energy() etc. In fact, it's probably safer than quantities from extxyz, because I think the calculator is invalidated if you modify positions etc. It's just not possible to do things like save multiple sets of results from different calculators. It would, of course, be trivial to write a little loop that reads the calculator and stores the results in info/arrays. The ase.io extxyz reader probably already does that, I think.
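
The "little loop" described above could look roughly like the sketch below. It assumes each frame has its results cached in atoms.calc.results, as a SinglePointCalculator does. The helper names copy_calc_results and traj_to_extxyz, the "REF_" key prefix, and the list of per-atom keys are choices made for this illustration, not something MACE or ASE prescribes.

```python
# Quantities stored once per atom go to atoms.arrays, the rest to atoms.info.
# This tuple is an assumption for the sketch.
PER_ATOM_KEYS = ("forces", "charges", "magmoms", "energies", "stresses")


def copy_calc_results(atoms, prefix="REF_"):
    """Copy results cached on atoms.calc into atoms.info / atoms.arrays."""
    calc = getattr(atoms, "calc", None)
    results = dict(getattr(calc, "results", None) or {})
    for name, value in results.items():
        target = atoms.arrays if name in PER_ATOM_KEYS else atoms.info
        target[prefix + name] = value
    return atoms


def traj_to_extxyz(traj_path, xyz_path, prefix="REF_"):
    """Read every frame of a .traj file and write it as extended xyz."""
    from ase.io import read, write  # lazy import: only needed for file I/O

    frames = read(traj_path, index=":")  # list with all frames
    for atoms in frames:
        copy_calc_results(atoms, prefix)
    write(xyz_path, frames, format="extxyz")
```

The prefixed names could then be passed as energy_key and forces_key (for example "REF_energy" and "REF_forces") so that the values are picked up from info and arrays rather than from an attached calculator.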

mahpe commented 1 year ago

I think converting traj to xyz files will require more space, since you will have the same data stored twice. What bernstei suggests would be a great solution!