dynamicslab / pysindy

A package for the sparse identification of nonlinear dynamical systems from data
https://pysindy.readthedocs.io/en/latest/

[BUG] SSR optimizer always picks the first model it generates #532

Open nehalsinghmangat opened 1 month ago

nehalsinghmangat commented 1 month ago

Description

When running the SSR algorithm, I noticed that it always returns the first candidate model it generates. This is in stark contrast to the intended optimization behavior described in the paper that introduced the method: Sparse learning of stochastic dynamical equations
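
Roughly, this is the stepwise behavior I would expect, written as a plain NumPy sketch of my reading of the method (the function name ssr_sketch and the held-out-data scoring are just illustrative choices here, not pysindy's implementation):

import numpy as np

def ssr_sketch(theta_train, xdot_train, theta_val, xdot_val):
    # Backward elimination: start from the full library and repeatedly
    # drop the smallest-magnitude active coefficient, refitting each time.
    n_features = theta_train.shape[1]
    support = np.arange(n_features)
    candidates = []
    while support.size > 0:
        coef_active, *_ = np.linalg.lstsq(
            theta_train[:, support], xdot_train, rcond=None
        )
        coef = np.zeros(n_features)
        coef[support] = coef_active
        # Score every candidate (here on held-out data) and remember it
        err = np.mean((xdot_val - theta_val @ coef) ** 2)
        candidates.append((err, coef))
        support = np.delete(support, np.argmin(np.abs(coef_active)))
    # Return the best-scoring candidate, not simply the first model generated
    return min(candidates, key=lambda c: c[0])[1]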

Reproducing code example:

import numpy as np
import pysindy as ps
from scipy.integrate import solve_ivp

# a simple one-dimensional dynamical system for testing
def exponential_growth(t, x):
    return 2 * x

# variables for ivp
t_sim = np.arange(0, 1, 0.1)  # [0.  0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9]
t_span = [t_sim[0], t_sim[-1]] # [0.0, 0.9]
initial_condition = [10]
solution = solve_ivp(exponential_growth, t_span, initial_condition, t_eval=t_sim)

# extract training data
x_train = solution.y.T # (10,1)
x_noisy = x_train + np.random.normal(size=x_train.shape, scale=5e-2)

# initialize nonlinear library
library = ps.feature_library.PolynomialLibrary(include_bias=False, degree=4)
library.fit(x_noisy)  # ['x0', 'x0^2', 'x0^3', 'x0^4']

# initialize pySINDy object with SSR optimizer
opt_SSR = ps.SINDy(optimizer=ps.SSR(), feature_library=library)
opt_SSR.fit(x_noisy)
print("SSR coeff:" + str(opt_SSR.coefficients()))

# initialize pySINDy object with STLSQ optimizer for comparison
opt_STLSQ = ps.SINDy(optimizer=ps.STLSQ(), feature_library=library)
opt_STLSQ.fit(x_noisy)
print("STLSQ coeff:" + str(opt_STLSQ.coefficients()))

# print the true coefficients
true_coeff = [2,0,0,0]
print("True coeff:" + str(true_coeff))

Screenshot of reproducing code output:

[screenshot of the printed SSR, STLSQ, and true coefficients]

PySINDy/Python version information:

PySINDy 1.7.6.dev325+gf2dfe3e.d20240602, Python 3.10.12 [GCC 11.4.0]

Jacob-Stevens-Haas commented 1 month ago

Great, thanks Nehal! I think you'll need to change the exit criterion of the loop. See if there's a way to do this elegantly while maintaining the current behavior (I think the current selection criterion is AIC, and that should be made explicit).
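
To sketch what I have in mind (not a proposed implementation: the aic helper and the synthetic theta/target data are made up for illustration, and it assumes the fitted optimizer's history_ holds one coefficient array per elimination step), an explicit AIC-based selection over SSR's candidates could look like:

import numpy as np
import pysindy as ps

def aic(theta, target, coef):
    # Akaike information criterion for the linear model target ~ theta @ coef.T
    # (hypothetical helper, written only for this example)
    n = target.shape[0]
    k = np.count_nonzero(coef)
    rss = np.sum((target - theta @ coef.T) ** 2)
    return n * np.log(rss / n + 1e-12) + 2 * k

rng = np.random.default_rng(0)
theta = rng.normal(size=(100, 3))   # synthetic candidate-library matrix
target = 2.0 * theta[:, [0]] + rng.normal(scale=1e-2, size=(100, 1))

opt = ps.SSR()
opt.fit(theta, target)

# history_ should hold one coefficient array per elimination step;
# keep the candidate with the lowest AIC rather than the first (or last) one.
scores = [aic(theta, target, np.atleast_2d(c)) for c in opt.history_]
best = opt.history_[int(np.argmin(scores))]
print("best candidate by AIC:", np.atleast_2d(best))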

If you go about doing a PR, a few other things might make sense if you have time:


Just as a heads up, in your example you need to pass t=t_sim to the SINDy.fit calls - that's why you're finding 0.2 instead of 2.0 as the coefficient. This is what I was getting at about paring the MWE down to the level of the optimizer rather than the SINDy object. Here's another example, in addition to the one via email, that you might consider as a test in this PR:

import numpy as np
import pysindy as ps

x = np.zeros((10, 2))
y = np.ones((10,))
x[:, 0] = y
x += np.random.normal(size=(10, 2), scale=1e-2)
print("SSR:", ps.SSR().fit(x, y).coef_)
print("STLSQ:", ps.STLSQ().fit(x, y).coef_)

Output:

SSR: [[ 1.02626566 -0.04611932]]
STLSQ: [[1.02683949 0.        ]]
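
For the record, the fix in the original MWE is just to pass the sample times through to SINDy.fit. Reusing x_noisy, t_sim, and the two SINDy objects from the reproducing script above, the change would look something like this (a sketch of the two lines that change, nothing else moves):

# Pass the sample times so derivatives are estimated with dt = 0.1 rather
# than the default timestep of 1, which is what scales the recovered
# coefficient from 2.0 down to 0.2.
opt_SSR.fit(x_noisy, t=t_sim)
opt_STLSQ.fit(x_noisy, t=t_sim)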