stfc / janus-core

Tools for machine learnt interatomic potentials
https://stfc.github.io/janus-core/
BSD 3-Clause "New" or "Revised" License
9 stars 7 forks source link

Add correlator with stress correlations #189

Closed harveydevereux closed 1 month ago

harveydevereux commented 2 months ago

This PR adds an online correlation algorithm (multi-tau), with stress correlations as user observables. The correlations are,

$$C{ab}(t) = \frac{1}{T-t}\sum{t'=1}^{T-t}a(t)b(t+t'),$$

for some values $a$ and $b$, and time lags $t$. The lags (and timestep) are implied by the frequency the correlator is updated in run controled by correlation_parameters[.][3] (which could probably use being some TypedDict actually...).

A user may ask for a sequence of correlations by,

correlation_kwargs={
            "correlations": [("s_xy", "s_xy")],
            "correlation_parameters": [(1, 10, 1, 1)],
        },

where s_xy is parsed into an Observable that can poll MolecularDynamics::dyn with a get method.

Correlations are written in yaml format for now a list of correlations (name, value, and lags).

New correlations can be added by providing an extension to Observable as is done for stress which connects user input to a function to get values from Union[Langevin, VelocityVerlet, ASE_NPT],

class Stress(Observable):
    def __init__(self, component: str):
        self.component = component
        self._index = self._component_to_index(component, voigt=False)

    def get(self, md: Union[Langevin, VelocityVerlet, ASE_NPT]) -> float:
        return (
            md.atoms.get_stress(include_ideal_gas=True, voigt=False).flatten()[
                self._index
            ]
            / units.bar
        )

    def __str__(self) -> str:
    ...

any combination of observables is a "valid" correlation.

harveydevereux commented 1 month ago

Thanks a lot for this @harveydevereux! It looks really useful, and I really appreciate the detailed description too. Apologies for taking so long to take a proper look.

I'm in the process of a longer code review, but a couple of initial thoughts:

Is there a way of generalising the observables through the SinglePoint class (or otherwise)? md.atoms.get_stress should just be using the MLIP calculator attached, so I think it ought to be able to slot in, and it would mean we could easily, if not immediately, add additional observables. Perhaps the additional options/flattening etc. makes this less straightforward than I'm imagining though.

Going slightly further, if we were to use this for velocity autocorrelation, would we expect this to match the post processing VAF we can already calculate? It would be interesting to compare, if so, but that could be something to come back to.

Thanks @ElliottKasoar I will take a look at the SinglePoint class, but it seems like a good way to go for observables.

On VAF this won't get the same results as VAF currently (I think) as it uses np.correlate with same rather than full. Full gets the $C_{ab}$ I wrote above averaging across all points in the series considered as origins for the correlation.

np.correlate(momenta[:, j, i], momenta[:, j, i], "same")
ElliottKasoar commented 1 month ago

We should also think about how best to document this, although it may be best to iron out the details a bit first.

Ideally this would probably go in the full user guide, but for now perhaps @harveydevereux you could add a brief example of its use to the MD section of the README, since that'll be the starting point for the full docs.

It would also be great to have something for how you'd go about extending this to additional observables in the developer guide, but that can probably be spun off into a new issue (potentially through a worked example, similar to adding ALIGNN in #194)

harveydevereux commented 1 month ago

Looking good!

The other comments I've added are pretty minor and all optional.

A few things that would be useful to clarify:

  1. Do we not want a CLI interface for this?
  2. Did we not want some built-in observables (e.g. user_observable_a/stress), which is probably a requirement for (1)
  3. Can the use of this be documented in the README in some form (if we do (1) and (2), an example of the CLI with a description of the kwargs is probably sufficient)

Maybe this can all be a separate PR, but I don't think a user currently would discover or easily figure out how to use this as it is.

Definitely we document in this pr, I wanted to be sure the api/features were settled first, I think we are approaching that point but where do the (2) built-in ones go? I thought something like

import stress from janus_core.helpers.correlator.observables

would not seem insane, but @alinelena has opinions, it would mean correlator becoming a folder

For CLI its interesting, do we want the user to be able to specify just built-ins? Or maybe have arbitrary code evaluated from a eval("some code"), or somehow import python files

oerc0122 commented 1 month ago

Looking good! The other comments I've added are pretty minor and all optional. A few things that would be useful to clarify:

  1. Do we not want a CLI interface for this?
  2. Did we not want some built-in observables (e.g. user_observable_a/stress), which is probably a requirement for (1)
  3. Can the use of this be documented in the README in some form (if we do (1) and (2), an example of the CLI with a description of the kwargs is probably sufficient)

Maybe this can all be a separate PR, but I don't think a user currently would discover or easily figure out how to use this as it is.

Definitely we document in this pr, I wanted to be sure the api/features were settled first, I think we are approaching that point but where do the (2) built-in ones go? I thought something like

import stress from janus_core.helpers.correlator.observables

would not seem insane, but @alinelena has opinions, it would mean correlator becoming a folder

For CLI its interesting, do we want the user to be able to specify just built-ins? Or maybe have arbitrary code evaluated from a eval("some code"), or somehow import python files

evaling code is not a good plan as if you're writing that on the command line why aren't you writing it in a python script? CLI will always be more restrictive than direct interfaces. We may want to build a correlation function factory which imports and constructs a standard Correlator class from a string by importing the appropriate libraries allowing a user to define their Correlator class or correlation function in that folder and import it dynamically.

In general they should be able to specify arguments in a way consistent with the rest of the code using the typer interfaces.

I should note I would consider this discussion would be more useful after we've got even the basic correlation functionality merged and plan a restructure at a later date.

harveydevereux commented 1 month ago

Looking good! The other comments I've added are pretty minor and all optional. A few things that would be useful to clarify:

  1. Do we not want a CLI interface for this?
  2. Did we not want some built-in observables (e.g. user_observable_a/stress), which is probably a requirement for (1)
  3. Can the use of this be documented in the README in some form (if we do (1) and (2), an example of the CLI with a description of the kwargs is probably sufficient)

Maybe this can all be a separate PR, but I don't think a user currently would discover or easily figure out how to use this as it is.

Definitely we document in this pr, I wanted to be sure the api/features were settled first, I think we are approaching that point but where do the (2) built-in ones go? I thought something like

import stress from janus_core.helpers.correlator.observables

would not seem insane, but @alinelena has opinions, it would mean correlator becoming a folder For CLI its interesting, do we want the user to be able to specify just built-ins? Or maybe have arbitrary code evaluated from a eval("some code"), or somehow import python files

evaling code is not a good plan as if you're writing that on the command line why aren't you writing it in a python script? CLI will always be more restrictive than direct interfaces. We may want to build a correlation function factory which imports and constructs a standard Correlator class from a string by importing the appropriate libraries allowing a user to define their Correlator class or correlation function in that folder and import it dynamically.

In general they should be able to specify arguments in a way consistent with the rest of the code using the typer interfaces.

I'm aware it is a bad idea to do it that way, my point is this PR ill become quite large if we want to figure out all of that out here as well as implementing a correlator. The factory sounds like a good idea to do it but as I think you agree we move it to later work.

oerc0122 commented 1 month ago

as I think you agree we move it to later work.

@harveydevereux Indeed, see the edit I made between you "quote replying" and posting.

ElliottKasoar commented 1 month ago

Definitely we document in this pr, I wanted to be sure the api/features were settled first, I think we are approaching that point but where do the (2) built-in ones go? I thought something like

import stress from janus_core.helpers.correlator.observables

would not seem insane, but @alinelena has opinions, it would mean correlator becoming a folder

For CLI its interesting, do we want the user to be able to specify just built-ins? Or maybe have arbitrary code evaluated from a eval("some code"), or somehow import python files

I would suggest a janus_core.helpers.observables module is probably fine for now. Helpers needs splitting up anyway (#144) so we can worry about subdirectories later if it makes sense.

I definitely agree that user-defined observables doesn't need to be addressed in this PR, and it may never be possible via the CLI.

How I (and I think @alinelena) imagined the CLI was probably similar to what we do for functions in geom_opt.py, where we call getattr(module, function_name) to import a function based on its name from a module (e.g. observables).

For the CLI, I'd imagine this would be passed through kwargs (--correlation-kwargs "{'obs_1': 'function_name', ...}", since otherwise it pollutes the main options too much.

We can come back to that though.

The main point was that without pre-defined functions, and the CLI that uses them, I don't think it's realistic to write simple documentation, because using this won't be simple.

If we want to get this merged and out of the way, perhaps @harveydevereux you could add something to the dev tutorial for how you approached writing the custom functions, which will remain useful even with additions that make getting started easier.

harveydevereux commented 1 month ago

Looks great, thanks @harveydevereux!

Just a couple of minor typos/formatting suggestions for the docs

Thanks for catching these @ElliottKasoar, I forgot to do the proof read...