Machine-Learning-Dynamical-Systems / kooplearn

A Python package to learn the Koopman operator.
https://kooplearn.readthedocs.io
MIT License

Deprecated ContextWindowDataset.observables attribute? #10

Open Danfoa opened 5 months ago

Danfoa commented 5 months ago

I noticed that the BaseModel class makes reference to an observables attribute from the ContextWindowDataset class, which seems to be undefined and undocumented within that class. For example, this issue can be observed here: https://github.com/Machine-Learning-Dynamical-Systems/kooplearn/blob/fe49ab7c24d75c9a45ec301e2397dd99c61c6371/kooplearn/abc.py#L21-L40

The data.observables attribute is neither defined nor documented in the ContextWindowDataset, which leads to confusion. Similarly, the purpose of the predict_observables attribute in the base model’s predict method is unclear.

I suggest that this attribute should either be clearly documented and defined within the ContextWindowDataset class or removed from the function call and documentation to avoid confusion. Generally, dynamically adding attributes to class instances without prior definition or documentation is not a recommended practice.

Additionally, it would be beneficial to establish a clear vocabulary for the types of observable functions. Currently, it's not clear whether "observables" refer to state observables, which are scalar/vector-valued functions analytically computed/measured from the dynamical system, or to state latent observables/features, which are scalar/vector-valued functions we learn or define based on state observables.

Does kooplearn have an established nomenclature for distinguishing these types of observables? @pietronvll @vladi-iit @g-turri Should we make this distinction?

pietronvll commented 5 months ago

Hi, @Danfoa! Thanks for opening this issue. Indeed, the observables attribute from ContextWindowDataset is work-in-progress and undocumented.

The idea of the API is that upon creating an instance of ContextWindowDataset, you can also specify a dictionary of observables associated with the states in the context windows.

As of now, the way to create an observables attribute is to add it dynamically to an instantiated ContextWindowDataset, as in

ctxs = ContextWindowDataset(data)
ctxs.observables = {
  'obs_1': data_obs_1,
  'obs_2': data_obs_2
}

which, as you say, is not good practice. To address this issue, we can start by modifying TensorContextWindow, https://github.com/Machine-Learning-Dynamical-Systems/kooplearn/blob/3568b403f2cdda34551a575e01ae6a621135521e/kooplearn/data.py#L9 by adding an observables property with getter and setter methods, where the setter checks that the first two dimensions of ctxs.data and of each observable (e.g. ctxs.observables['obs_1']) coincide. Recall that the first two dimensions of ctxs.data have shape (number of contexts, context length).

It is reasonable to initialize the observables property to None whenever observables are not needed. See also this code snippet, which parses the observables dict to perform observables forecasting in the Linear, Nonlinear and Kernel models. https://github.com/Machine-Learning-Dynamical-Systems/kooplearn/blob/3568b403f2cdda34551a575e01ae6a621135521e/kooplearn/_src/operator_regression/utils.py#L7
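A minimal sketch of the property described above, for discussion only. The class name ContextWindows is a hypothetical stand-in, not kooplearn's TensorContextWindow, and the exact validation rules are my assumption of what the setter could check:

```python
import numpy as np


class ContextWindows:
    """Hypothetical stand-in for TensorContextWindow (not kooplearn's class)."""

    def __init__(self, data, observables=None):
        # Assumed layout: (n_contexts, context_length, *features)
        self.data = np.asarray(data)
        self._observables = None
        self.observables = observables  # routed through the setter below

    @property
    def observables(self):
        return self._observables

    @observables.setter
    def observables(self, value):
        # None means "no observables attached".
        if value is None:
            self._observables = None
            return
        expected = self.data.shape[:2]
        checked = {}
        for name, obs in value.items():
            obs = np.asarray(obs)
            # Each observable must be evaluated on the same states as the contexts.
            if obs.shape[:2] != expected:
                raise ValueError(
                    f"Observable '{name}' has leading shape {obs.shape[:2]}, "
                    f"expected {expected}."
                )
            checked[name] = obs
        self._observables = checked
```

With this, `ctxs.observables = {...}` keeps working as before, but a mismatched array fails at assignment time rather than later inside predict or modes.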

I am assigning this issue to myself and @Danfoa. Let's create a branch out of dev and work from it. @GregoirePacreau might be interested in it as well.

GregoirePacreau commented 5 months ago

Just to verify, if I have a series

(X_i, Y_i)_{i \in [T]}

and want to compute

E(f(Y_{T+1}) \vert X_{T+1})

I need to have in the observable dictionary an array containing the $(f(X_i), f(Y_i))_{i \in [T]}$ (in context form)?

What would be the behaviour of the modes method when several observables are provided? Will it give a sequence of modes, one for each observable?

And finally, should we allow for functionals in the observables dictionary, or at least have a method that creates the correct array given a functional?
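To make the last question concrete, here is a rough sketch of what such a helper could look like. The function name evaluate_observable and the assumed array layout (number of contexts, context length, features) are mine and not part of kooplearn:

```python
import numpy as np


def evaluate_observable(f, contexts):
    """Apply f to every state of an array shaped (n_contexts, context_length, *features).

    Hypothetical helper: returns an array shaped (n_contexts, context_length, *f_output).
    """
    contexts = np.asarray(contexts)
    n_ctx, ctx_len = contexts.shape[:2]
    # Flatten the two leading axes so that f sees one state at a time.
    flat_states = contexts.reshape(n_ctx * ctx_len, *contexts.shape[2:])
    values = np.stack([np.asarray(f(x)) for x in flat_states])
    # Restore the context structure in front of whatever shape f returns.
    return values.reshape(n_ctx, ctx_len, *values.shape[1:])
```

The result has the same two leading dimensions as the context data, so it could be stored directly under a key of the observables dictionary.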

Danfoa commented 5 months ago

Establishing a shared nomenclature for these concepts is crucial from the outset, especially since the term "observables" remains somewhat ambiguous to me within our context.

In the Kooplearn framework, to ensure clarity both among ourselves and for our users, it's essential to have well-defined terminology and documentation differentiating between:

While the terms I've used are suggestions, our aim should be to clearly delineate these categories in our documentation and nomenclature for transparency and ease of understanding.

pietronvll commented 5 months ago

My definitions, which I try to use throughout kooplearn and also follow the definitions in our papers, are:

States

A state of the dynamical system/stochastic process is usually denoted $x_{t}$ (deterministic dynamics) or $X_{t}$ (stochastic process). States are defined on the state space $\mathcal{X}$, which is usually $\mathbb{R}^{d}$ or a subset thereof. As the name suggests, a state provides the full knowledge of the system at a given instant in time.

@Danfoa, I know that according to the previous description, states are not defined uniquely. For example, any bijective transformation of a state (which in turn is an observable, see below) is again another state. As the algorithms in kooplearn assume perfect observability of the system, they require context windows of states upon fitting. Apart from being states, however, kooplearn does not impose any restriction on their representation.

In short: a state in kooplearn is any variable giving a full description of the system. Context windows are sequences of observed states, and they are used for fitting kooplearn models and for running inference with them.

Observables

Observables are arbitrary functions of states $f : \mathcal{X} \to \mathbb{R}^{p}$. Observables may or may not describe the dynamical system/stochastic process completely. Observables can be used as states if they give a full description. If they do not provide a complete description, however, they should not be used as states (that is, to fit kooplearn models), as it is known that partially observed Markov processes are not Markovian in general.

Given these definitions, we can further distinguish between

  1. Learned observables: functions of the state provided by data-driven algorithms. These can be e.g. the eigenfunctions of a kooplearn.models.Kernel model or the learned DPNets feature map kooplearn.models.feature_maps.NNFeatureMap.
  2. Measurements: functions of the state which can be measured experimentally. These can be e.g. the COM velocity of a robot, the volatility of a stock, or the average energy of a molecule.

To answer @GregoirePacreau's question:

  1. Yes, the observable dictionary should contain a tensor with the evaluation of $f$ on the same states contained in the context window.
  2. The modes function now returns a tuple (modes_dict, eigs) containing a dictionary with the same structure as ctxs.observables and an array with the eigenvalues corresponding to each of the modes.

https://github.com/Machine-Learning-Dynamical-Systems/kooplearn/blob/3568b403f2cdda34551a575e01ae6a621135521e/kooplearn/models/kernel.py#L395

To take the $i$-th mode of the observable f you should use

ctxs = TensorContextWindow( ... )
ctxs.observables = {
    'f': data_obs_f
}
modes, eigs = model.modes(ctxs)
modes_f = modes['f']

# i-th mode of f
modes_f[i]

pietronvll commented 5 months ago

Something to add to this PR I just noticed:

Handling observables is a bit awkward, as calling predict(test_data, ...) (or modes) looks for a dict-like observables attribute within test_data. This dict-like attribute should contain raw numpy arrays of the observables evaluated on the train data.

This should be fixed: observables should be defined on the train data from the get-go, and the dict-like attribute should contain TensorContextDataset instances of the observables instead of raw numpy arrays.

Danfoa commented 5 months ago

Have a look at this DataClass, which I use to store observations (including state) from a given Markov dynamical system.

The key idea is to keep observables separated and named, for instance:

If you want all/some stacked observables, you get a view of the data. In general, this added structure of knowing which parts of the state are vectors, scalars, etc. is needed to handle the system's symmetries. Adding this structure to a ContextWindowLike class can enable us to log relevant additional metrics, like the errors of each of these observables independently (if requested).
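As a rough illustration of the idea (not the actual DataClass mentioned above), the sketch below stores all observables stacked in a single array and hands out per-name views through slices. The class name NamedObservations, its fields, and the per_observable_mse method are hypothetical:

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class NamedObservations:
    """Hypothetical container: one stacked array plus a name -> slice lookup."""

    data: np.ndarray  # (n_samples, total_dim), all observables stacked on the last axis
    slices: dict      # name -> slice along the last axis

    @classmethod
    def from_dict(cls, named):
        # Record where each named observable lands in the stacked array.
        slices, start = {}, 0
        for name, arr in named.items():
            dim = np.asarray(arr).shape[-1]
            slices[name] = slice(start, start + dim)
            start += dim
        data = np.concatenate([np.asarray(a) for a in named.values()], axis=-1)
        return cls(data=data, slices=slices)

    def __getitem__(self, name):
        # Basic slicing returns a view of the stacked data, not a copy.
        return self.data[..., self.slices[name]]

    def per_observable_mse(self, prediction):
        # Error of each named observable, reported independently.
        return {
            name: float(np.mean((self[name] - prediction[name]) ** 2))
            for name in self.slices
        }
```

Keeping the names next to the stacked array is what makes per-observable metrics cheap: the stacked array can be fed to a model as is, while each error term is computed on a view.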

That DataClass has become quite helpful for me; maybe there is something there we can adapt to kooplearn.

In general, to solve the problem of the 5 inheritance classes, I propose introducing a single ContextWindowLike class that either wraps or directly inherits from Tensor/ndarray, similar in spirit to GeometricTensor, which wraps a tensor and offers some additional structural attributes and methods. From what I can see, the idea behind introducing the TensorContextDataset class, which is the main point of interaction between torch and kooplearn's data paradigm, is to:

  1. Allow the user to get the past_horizon/preset/lookback and future/prediction_horizon/lookforward without handcrafting the time indexing at every point.
  2. Design pipelines that are based on processing "trajectories/sequences of observables" instead of individual states.
  3. Keep track of the context_len, and (ideally) the features_shape.
  4. Handle automatically the backend.

In practice, when using torch, the TensorContextDataset is already being used as a wrapper for a "trajectory" Tensor/ndarray (correct me if I'm wrong). I think we can design a Tensor wrapper class, in the spirit of GeometricTensor, which covers these 4 features while still being processed as a tensor by the myriad of torch native functions, which expect Tensors instead of TensorContextDataset.
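A minimal sketch of what I have in mind, written against NumPy only to keep it short. The class name ContextWindowLike, the lookback_len argument, and the method names are placeholders; backend handling (point 4) is not covered, and a torch version would need torch's own extension mechanism rather than `__array__`:

```python
import numpy as np


class ContextWindowLike:
    """Hypothetical wrapper around an array shaped (n_contexts, context_length, *features)."""

    def __init__(self, data, lookback_len):
        self.data = np.asarray(data)
        if not 0 < lookback_len <= self.data.shape[1]:
            raise ValueError("lookback_len must be in [1, context_length].")
        self.lookback_len = lookback_len

    @property
    def context_length(self):
        return self.data.shape[1]

    @property
    def features_shape(self):
        return self.data.shape[2:]

    def lookback(self):
        # Past part of every context window.
        return self.data[:, : self.lookback_len]

    def lookforward(self):
        # Future part of every context window (may be empty).
        return self.data[:, self.lookback_len :]

    def __array__(self, dtype=None, copy=None):
        # Lets np.asarray(...) consume the wrapper as a plain array.
        return self.data if dtype is None else self.data.astype(dtype)
```

For example, `ContextWindowLike(x, lookback_len=3).lookback()` replaces hand-written time indexing, while `np.asarray(ctx)` recovers the underlying array for code that expects one.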

I don't see the need for the 5 levels of abstraction, which in practice are/will become problematic.