lab-cosmo / metatensor

Self-describing sparse tensor data format for atomistic machine learning and beyond
https://docs.metatensor.org
BSD 3-Clause "New" or "Revised" License
44 stars 14 forks source link

ASE interface re-runs the model multiple times at each step #655

Open frostedoyster opened 2 weeks ago

frostedoyster commented 2 weeks ago

This is how ASE calls calculators to get properties:

    def get_property(self, name, atoms=None, allow_calculation=True):
        if name not in self.implemented_properties:
            raise PropertyNotImplementedError('{} property not implemented'
                                              .format(name))

        if atoms is None:
            atoms = self.atoms
            system_changes = []
        else:
            system_changes = self.check_state(atoms)
            if system_changes:
                self.reset()
        if name not in self.results:
            if not allow_calculation:
                return None
            self.calculate(atoms, [name], system_changes)

        if name not in self.results:
            # For some reason the calculator was not able to do what we want,
            # and that is OK.
            raise PropertyNotImplementedError('{} not present in this '
                                              'calculation'.format(name))

        result = self.results[name]
        if isinstance(result, np.ndarray):
            result = result.copy()
        return result

Essentially one property at a time. This means that our model will be called once to get the forces at each MD step. If the user wants to know the potential energy, the calculator will be called again. If the simulation is NPT and a stress is required, this generates another call. Dipoles would be yet another call and so on.

This is a consequence of the fact that our calculator stores as results only the requested property. Therefore no caching happens and the calculator has to be queried again.

frostedoyster commented 2 weeks ago

A custom get_property for our calculator might be one way to solve the issue

Luthaf commented 2 weeks ago

We could also store the energy/energies in results whenever we are computing forces/stress. We could also always store the stress when computing forces (and vice versa). The only one I would not do here would be to computed forces/stress if the user only asked for energy (since this is quite a bit more expensive).

Overriding get_property might work, but I would rather only implement what ASE wants us to implement, to prevent the code from breaking in the next release.

sirmarcel commented 2 weeks ago

Heya, I've been called in to comment on this. What's the problem here? As far as I'm aware the expected way to deal with this is to write everything you possibly can into results (in the calculate method) so it doesn't have to be recomputed. So a get_forces should populate results with the potential energy (and maybe the stress also).