dynamicslab / pysindy

A package for the sparse identification of nonlinear dynamical systems from data
https://pysindy.readthedocs.io/en/latest/
Other
1.4k stars 308 forks source link

[BUG] Ensemble Models with Constrained SR3 #267

Open akashyap13 opened 1 year ago

akashyap13 commented 1 year ago

Hi PYSINDy,

I am having trouble with getting the ensemble optimizer working with the constrained sr3 module. When I include library ensemble it errors out, and when I use bagging it solves the system without using the constraints.

I have attached an example code consisting of the examples provided with pysindy to illustrate these two points. I might have made an error, so please let me know if I missed an easy fix.

Thanks for your help Amrit

python 3.9.13 pysindy 1.7.2

CODE

import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
import numpy as np
from scipy.integrate import solve_ivp
from sklearn.linear_model import Lasso

from pysindy.utils import enzyme
from pysindy.utils import lorenz
from pysindy.utils import lorenz_control
import pysindy as ps

# Initialize integrator keywords for solve_ivp to replicate the odeint defaults
integrator_keywords = {}
integrator_keywords['rtol'] = 1e-12
integrator_keywords['method'] = 'LSODA'
integrator_keywords['atol'] = 1e-12

dt = .002

t_train = np.arange(0, 10, dt)
x0_train = [-8, 8, 27]
t_train_span = (t_train[0], t_train[-1])
x_train = solve_ivp(lorenz, t_train_span, x0_train,
                    t_eval=t_train, **integrator_keywords).y.T

feature_names = ['x', 'y', 'z']
library = ps.PolynomialLibrary()
library.fit([ps.AxesArray(x_train,{"ax_sample":0,"ax_coord":1})])
n_features = library.n_output_features_

# Set constraints
n_targets = x_train.shape[1]
constraint_rhs = np.array([0, 28])

# One row per constraint, one column per coefficient
constraint_lhs = np.zeros((2, n_targets * n_features))

# 1 * (x0 coefficient) + 1 * (x1 coefficient) = 0
constraint_lhs[0, 1] = 1
constraint_lhs[0, 2] = 1

# 1 * (x0 coefficient) = 28
constraint_lhs[1, 1 + n_features] = 1

sr3_ensemble = ps.EnsembleOptimizer(ps.ConstrainedSR3(constraint_rhs=constraint_rhs, constraint_lhs=constraint_lhs), library_ensemble=True)

model = ps.SINDy(optimizer=sr3_ensemble, feature_names=feature_names).fit(x_train, t=dt)
model.print()
##### THIS ERRORS OUT 
######################################################################################
#### SECOND EXAMPLE RUNS FINE BUT DOESNT HAVE THE CONSTRAINTS
sr3_ensemble = ps.EnsembleOptimizer(ps.ConstrainedSR3(constraint_rhs=constraint_rhs, constraint_lhs=constraint_lhs),  bagging=True,n_subset=int(0.6*x_train.shape[0]))

model = ps.SINDy(optimizer=sr3_ensemble, feature_names=feature_names).fit(x_train, t=dt)
model.print()
akaptano commented 1 year ago

Great question -- I am not sure what you are doing is sensible though. The constraint matrices for ConstrainedSR3 are defined assuming the original size of the FULL library, while you are asking it to build constrained SR3 models using a subset of the library, so the constraints have the wrong dimensions for this sub-library! This is probably worth implementing a fix at some point, or at least a warning in the code that this is not allowed behavior.

Also, could you post the error message you are seeing?

akashyap13 commented 1 year ago

Hey @akaptano,

Thank you so much for your response. Yes I think for the sub-library it might not make sense to reduce terms when having a constraint. The error I get for that is "ValueError: cannot reshape array of size 30 into shape (3,9)".

However, for the bagging, I feel like it should work since its just running on a subset and then taking the average coefficients. Currently when I run it, it just ignoring the constraints I am passing to it and I get the solution if I didn't have any constraints.

akaptano commented 1 year ago

Okay sounds like we have diagnosed the library bagging issue at least. I can reproduce the issue you mentioned with the bagging, so this is likely a real error on our side. Will try to take a look this week.

akashyap13 commented 1 year ago

Incredible!!! Thank you so much. This would be very helpful for my code.

akashyap13 commented 1 year ago

Hey @akaptano, Just wondering if you all were able to do a quick fix on this bug or it was going to take some time longer, thanks happy holidays!

akaptano commented 1 year ago

Sorry for the very slow response here -- we have not gotten around to implementing this, although if it is important for your research, it is straightforward to edit the PySINDy python source code to do as your hoping here.

Jacob-Stevens-Haas commented 1 month ago

@himkwtn here's another slightly-mathy issue if you want a meatier issue

Jacob-Stevens-Haas commented 1 month ago

The question of how to disallow library bagging of constrained optimizers is a thorn. It can be handled in type checking if we parametrize the BaseOptimizer type based upon whether it is constrained. Here's an example of how that works

from typing import Literal, Generic, overload, TypeVar
from typing_extensions import assert_type

T = TypeVar("T", bound=bool) #Constrained flag

class Optimizer(Generic[T]):
    def __new__(cls) -> Optimizer[Literal[False]]: ...

assert_type(Optimizer(), Optimizer[Literal[False]])

class ConstrainedSR3(Optimizer[T]):
    @overload
    def __new__(cls, constraints: None) -> ConstrainedSR3[Literal[False]]: ...
    @overload
    def __new__(cls, constraints: str) -> ConstrainedSR3[Literal[True]]: ...

assert_type(ConstrainedSR3(None), ConstrainedSR3[Literal[False]])
assert_type(ConstrainedSR3("a"), ConstrainedSR3[Literal[True]])

@overload
def ensemble(opt: Optimizer[Literal[False]], bag_lib: Literal[True]) -> None: ...
@overload
def ensemble(opt: Optimizer[Literal[True]], bag_lib: Literal[False]) -> None: ...

ensemble(ConstrainedSR3(None), True)
ensemble(ConstrainedSR3("a"), True) # Error
ensemble(ConstrainedSR3("a"), False)

The alternative would be to have a subclass for Constrained optimizers, but then we couldn't handle situations when a constrained optimizer isn't being used with constraints (e.g. MIOSR).

In execution, it's a bit trickier, but we can check if the ensembled optimizer is of a constrainable type, and then check constraint_lhs. We also want to disallow the SBR optimizer, or any other that result in a distribution (e.g. ensembling an ensembler). So maybe we're at a point to split the BaseOptimizer into StatisticalOptimizer and DiscreteOptimizer, and ensembling only accepts the latter.