dynamicslab / pysindy

A package for the sparse identification of nonlinear dynamical systems from data
https://pysindy.readthedocs.io/en/latest/

[BUG] SSR optimizer always picks the first model it generates #532

Open nehalsinghmangat opened 1 month ago

nehalsinghmangat commented 1 month ago

Description

When running the SSR algorithm, I noticed that it always returns the first candidate model it generates. This is in stark contrast to the intended optimization behavior described in the paper that introduced the method: Sparse learning of stochastic dynamical equations
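
Roughly, this is the stepwise behavior I would expect, written as a plain NumPy sketch of my reading of the method (the function name ssr_sketch and the held-out-data scoring are just illustrative choices here, not pysindy's implementation):

import numpy as np

def ssr_sketch(theta_train, xdot_train, theta_val, xdot_val):
    # Backward elimination: start from the full library and repeatedly
    # drop the smallest-magnitude active coefficient, refitting each time.
    n_features = theta_train.shape[1]
    support = np.arange(n_features)
    candidates = []
    while support.size > 0:
        coef_active, *_ = np.linalg.lstsq(
            theta_train[:, support], xdot_train, rcond=None
        )
        coef = np.zeros(n_features)
        coef[support] = coef_active
        # Score every candidate (here on held-out data) and remember it
        err = np.mean((xdot_val - theta_val @ coef) ** 2)
        candidates.append((err, coef))
        support = np.delete(support, np.argmin(np.abs(coef_active)))
    # Return the best-scoring candidate, not simply the first model generated
    return min(candidates, key=lambda c: c[0])[1]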

Reproducing code example:

import numpy as np
import pysindy as ps
from scipy.integrate import solve_ivp

# a simple one-dimensional dynamical system for testing
def exponential_growth(t, x):
    return 2 * x

# variables for ivp
t_sim = np.arange(0, 1, 0.1)  # [0.  0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9]
t_span = [t_sim[0], t_sim[-1]] # [0.0, 0.9]
initial_condition = [10]
solution = solve_ivp(exponential_growth, t_span, initial_condition, t_eval=t_sim)

# extract training data
x_train = solution.y.T # (10,1)
x_noisy = x_train + np.random.normal(size=x_train.shape, scale=5e-2)

# initialize nonlinear library
library = ps.feature_library.PolynomialLibrary(include_bias=False, degree=4)
library.fit(x_noisy)  # ['x0', 'x0^2', 'x0^3', 'x0^4']

# initialize pySINDy object with SSR optimizer
opt_SSR = ps.SINDy(optimizer=ps.SSR(), feature_library=library)
opt_SSR.fit(x_noisy)
print("SSR coeff:" + str(opt_SSR.coefficients()))

# initialize pySINDy object with STLSQ optimizer for comparison
opt_STLSQ = ps.SINDy(optimizer=ps.STLSQ(), feature_library=library)
opt_STLSQ.fit(x_noisy)
print("STLSQ coeff:" + str(opt_STLSQ.coefficients()))

# print the true coefficients
true_coeff = [2,0,0,0]
print("True coeff:" + str(true_coeff))

Screenshot of reproducing code output:

[screenshot of the printed SSR, STLSQ, and true coefficients]

PySINDy/Python version information:

PySINDy 1.7.6.dev325+gf2dfe3e.d20240602, Python 3.10.12 [GCC 11.4.0]

Jacob-Stevens-Haas commented 1 month ago

Great, thanks Nehal! I think you'll need to change the exit criterion of the loop. See if there's a way to do this elegantly while maintaining the current behavior (I think the current selection criterion is AIC, and that should be made explicit).
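
To sketch what I have in mind (not a proposed implementation: the aic helper and the synthetic theta/target data are made up for illustration, and it assumes the fitted optimizer's history_ holds one coefficient array per elimination step), an explicit AIC-based selection over SSR's candidates could look like:

import numpy as np
import pysindy as ps

def aic(theta, target, coef):
    # Akaike information criterion for the linear model target ~ theta @ coef.T
    # (hypothetical helper, written only for this example)
    n = target.shape[0]
    k = np.count_nonzero(coef)
    rss = np.sum((target - theta @ coef.T) ** 2)
    return n * np.log(rss / n + 1e-12) + 2 * k

rng = np.random.default_rng(0)
theta = rng.normal(size=(100, 3))   # synthetic candidate-library matrix
target = 2.0 * theta[:, [0]] + rng.normal(scale=1e-2, size=(100, 1))

opt = ps.SSR()
opt.fit(theta, target)

# history_ should hold one coefficient array per elimination step;
# keep the candidate with the lowest AIC rather than the first (or last) one.
scores = [aic(theta, target, np.atleast_2d(c)) for c in opt.history_]
best = opt.history_[int(np.argmin(scores))]
print("best candidate by AIC:", np.atleast_2d(best))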

If you go about doing a PR, a few other things might make sense if you have time:


Just as a heads up, in your example you need to pass t=t_sim to the SINDy.fit calls - that's why you're finding 0.2 instead of 2.0 as the coefficient. This is what I was getting at about paring the MWE down to the level of the optimizer rather than the SINDy object. Here's another example, in addition to the one via email, that you might consider as a test in this PR:

import numpy as np
import pysindy as ps

x = np.zeros((10, 2))
y = np.ones((10,))
x[:, 0] = y
x += np.random.normal(size=(10, 2), scale=1e-2)
print("SSR:", ps.SSR().fit(x, y).coef_)
print("STLSQ:", ps.STLSQ().fit(x, y).coef_)

Output:

SSR: [[ 1.02626566 -0.04611932]]
STLSQ: [[1.02683949 0.        ]]
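
For the record, the fix in the original MWE is just to pass the sample times through to SINDy.fit. Reusing x_noisy, t_sim, and the two SINDy objects from the reproducing script above, the change would look something like this (a sketch of the two lines that change, nothing else moves):

# Pass the sample times so derivatives are estimated with dt = 0.1 rather
# than the default timestep of 1, which is what scales the recovered
# coefficient from 2.0 down to 0.2.
opt_SSR.fit(x_noisy, t=t_sim)
opt_STLSQ.fit(x_noisy, t=t_sim)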