dynamicslab / pysindy

A package for the sparse identification of nonlinear dynamical systems from data
https://pysindy.readthedocs.io/en/latest/

Modeling Issues with Using Ensemble Methods #520

Open oskarmue opened 5 months ago

oskarmue commented 5 months ago

I am currently trying to model the magnetic force of a proportional magnet. The data we record on our test bench are the electric current i, the voltage u, the armature position x, and the magnetic force F. When the model is used later, i, u, and x will also be available as measured variables, which is why I pass them to pySINDy as control inputs. We have recorded 6 data sets that differ in the rate of change of the current; in one test data set the magnet is excited with a triangular current instead of a sinusoidal one.

The force is not predicted accurately when the magnet is excited with the new current dynamics. Given the poor generalization, I assume that I have overfitted the model. Since reducing the data set did not improve the result, I wanted to try the ensemble methods provided. They show a pattern that I am unsure about: either it is expected, or I am using the method incorrectly.

When I call model.fit() and then simulate with the model, it is stable like the previous models, but also overfitted. If I then test the other models stored in coef_list, every single one of them is unstable or much less accurate. Since ensemble methods only use about 60% of the data for training, I enlarged the training data set, but this did not lead to any improvement either. The code looks like this:

import numpy as np
import pysindy as ps

opt = ps.SR3(
    threshold=0.00001,
    thresholder='L2',  # options: CAD, L0, L1, L2
    # normalize_columns=True,
)
model = ps.SINDy(
    optimizer = opt,
    feature_library = poly_lib,
    feature_names = feature_names,
    discrete_time = True
)
model.fit(
    x = features_train,
    u = controls_train,
    t = data_time_train,
    ensemble = True,
    n_models = 20,
    #replace = False,
)

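def integration_metric(coef_list, optimizer, arrays):
    """Simulate each ensemble member; zero the coefficients of unstable ones."""
    for i in range(np.shape(coef_list)[0]):
        try:
            # Overwrite the fitted coefficients in place with ensemble member i.
            model.coefficients()[0, :] = coef_list[i][0]
            # With discrete_time=True the second argument is the number of steps.
            x_test_sim = model.simulate([0], 6000, u=controls_test[36000:])
            arrays.append(x_test_sim)
            if np.any(np.abs(x_test_sim) > 5000):
                print('unstable model!')
                coef_list[i, :, :] = 0.0
        except Exception as exc:
            # Report the actual failure instead of silently hiding it.
            print(f'simulation of model {i} failed: {exc}')
    return coef_list, arrays

arrays = []  # simulated trajectories, one per ensemble member
stable_ensemble_coefs, liste = integration_metric(
    np.asarray(model.coef_list), opt, arrays
)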

Jacob-Stevens-Haas commented 5 months ago

There is no guarantee that generic SINDy discovers stable models. Look into trapping SINDy (the TrappingSR3 optimizer in pySINDy).
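
Short of a stability guarantee, one cheap check is to simulate every ensemble member on held-out inputs and keep only those that stay bounded. The following is a minimal NumPy sketch of that idea, independent of pySINDy: the four-term library `[1, x, u, x*u]`, the helper names, and the toy coefficients are all hypothetical, and the bound of 5000 simply mirrors the threshold used in the question.

```python
import numpy as np


def library(x, u):
    """Hypothetical feature library [1, x, u, x*u] for a scalar state and input."""
    return np.array([1.0, x, u, x * u])


def simulate_discrete(coef, x0, u_seq, bound=5000.0):
    """Iterate x[k+1] = library(x[k], u[k]) @ coef; stop early on divergence."""
    x = np.empty(len(u_seq) + 1)
    x[0] = x0
    for k, u_k in enumerate(u_seq):
        x[k + 1] = library(x[k], u_k) @ coef
        if not np.isfinite(x[k + 1]) or abs(x[k + 1]) > bound:
            return x[: k + 2], False  # trajectory left the bound: unstable
    return x, True


def screen_ensemble(coef_list, x0, u_seq, bound=5000.0):
    """Return a boolean stability mask and the median of the stable members."""
    coef_list = np.asarray(coef_list, dtype=float)
    stable = np.array(
        [simulate_discrete(c, x0, u_seq, bound)[1] for c in coef_list]
    )
    median_coef = np.median(coef_list[stable], axis=0) if stable.any() else None
    return stable, median_coef


# Toy ensemble: two contracting models and one expanding model (x-coefficient > 1).
u_seq = np.sin(np.linspace(0.0, 20.0, 500))
coef_list = [
    [0.0, 0.90, 0.5, 0.0],
    [0.0, 0.80, 0.6, 0.0],
    [0.0, 1.50, 0.5, 0.0],
]
stable, median_coef = screen_ensemble(coef_list, x0=0.0, u_seq=u_seq)
print(stable, median_coef)
```

Because each member is simulated from its own coefficient vector, nothing is written back into a fitted model, so the screening cannot leave the model holding the coefficients of whichever member happened to be tested last.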

Also, for these "how do I get better results from this problem" questions, it helps to have a little bit of LaTeX to describe the equations you're simulating/trying to discover.
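
For illustration only, and assuming (from the fit call in the question) that the force F is the single state while the current i, the voltage u, and the armature position x enter as control inputs, the discrete-time model being fitted could be written as follows. Here Θ denotes the row vector of polynomial library features and ξ the sparse coefficient vector; both symbols are introduced for this sketch and do not come from the question.

```latex
F_{k+1} = f(F_k, i_k, u_k, x_k) \approx \Theta(F_k, i_k, u_k, x_k)\,\xi
```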