dynamicslab / pysindy

A package for the sparse identification of nonlinear dynamical systems from data
https://pysindy.readthedocs.io/en/latest/
Other
1.46k stars 324 forks source link

Feature library does not obey the set inputs_per_library [BUG] #540

Open MuhammadUsamaSattar opened 4 months ago

MuhammadUsamaSattar commented 4 months ago

Hey! I am trying to fit a model for solenoids attracting pieces of ferromagnetic particles placed on water surface for nano/micro robotic manipulation. I have two custom libraries and one Polynomial library tensored together. Since no movement occurs without a current value, I believe, the current needs to be multiplied with each term (hence the Polynomial library). To achieve this, I am using the inputs_per_library argument for the Generalized Library. This gives the correct terms for combinations that consists of tensoring only two libraries, but for three libraries, the terms are incorrect.

Reproducing code example:

feature_data, x_test, t_data, t_test = train_test_split(feature_data, t_data, test_size=0.20, random_state=42)

differentiation_method = ps.SmoothedFiniteDifference(order=2, d=1)

library_functions_theta = [lambda x: (x*16/720)/np.sqrt((x*16/720)**2 + 4.5**2)]
library_function_names_theta = [lambda x: 'cos(theta)']
custom_library_theta = ps.CustomLibrary(library_functions=library_functions_theta, function_names=library_function_names_theta)

library_functions_negative_power = [lambda x: 1/x**1,
                                    lambda x: 1/x**2,
                                    lambda x: 1/x**3]
library_function_names_negative_power = [lambda x: '1/' + x, 
                                         lambda x: '1/' + x + '^2',
                                         lambda x: '1/' + x + '^3']
custom_library_negative_power = ps.CustomLibrary(library_functions=library_functions_negative_power, function_names=library_function_names_negative_power)

inputs_per_library = np.array([[1, 1], [1, 1], [0, 0]])
tensor_array = [[1, 0, 1], [0, 1, 1], [1, 1, 1]]

feature_library = ps.GeneralizedLibrary([custom_library_theta, custom_library_negative_power, 
                                                ps.PolynomialLibrary(degree=3, include_bias=False)], 
                                                tensor_array=tensor_array,
                                                inputs_per_library=inputs_per_library,
                                                )
optimizer = ps.STLSQ(threshold=0.1)

model = ps.SINDy(differentiation_method=differentiation_method,
                 feature_library=feature_library,
                 optimizer=optimizer,
                 feature_names=["I", "r"],
                 discrete_time=False)

model.fit(feature_data, t = t_data, multiple_trajectories = True, ensemble=False)

print(model.get_feature_names())
model.print()

print("R2 score: ", model.score(x_test, t_test, multiple_trajectories=True))

This gives me the following feature_names:

['cos(theta)', '1/r', '1/r^2', '1/r^3', 'I', 'I^2', 'I^3', 'cos(theta) I', 'cos(theta) I^2', 'cos(theta) I^3', '1/r I', '1/r I^2', '1/r I^3', '1/r^2 I', '1/r^2 I^2', '1/r^2 I^3', '1/r^3 I', '1/r^3 I^2', '1/r^3 I^3', 'cos(theta) 1/r r', 'cos(theta) 1/r r^2', 'cos(theta) 1/r r^3', 'cos(theta) 1/r^2 r', 'cos(theta) 1/r^2 r^2', 'cos(theta) 1/r^2 r^3', 'cos(theta) 1/r^3 r', 'cos(theta) 1/r^3 r^2', 'cos(theta) 1/r^3 r^3']

According to the set inputs_per_library, the last term in each term should be I/I^2/I^3 which is correct for all libraries generated by tensoring with only 2 libraries but for libraries generated with tensoring three libraries, 'r' is used instead of 'I'. You can see this issue for features from 'cos(theta) 1/r r' to 'cos(theta) 1/r^3 r^3'.

'cos(theta) 1/r^3 r^3' should be 'cos(theta) 1/r^3 I^3' instead, for example.

Jacob-Stevens-Haas commented 3 months ago

Thanks for finding this! I can recreate this on master. Here's a simpler example

from pysindy import CustomLibrary, GeneralizedLibrary, PolynomialLibrary
import numpy as np

f_list = [lambda x: x]
lib_names_a = [lambda x: f'a({x})']
lib_names_b = [lambda x: f'b({x})']
lib_names_c = [lambda x: f'c({x})']

lib_a = CustomLibrary(f_list, lib_names_a)
lib_b = CustomLibrary(f_list, lib_names_b)
lib_c = CustomLibrary(f_list, lib_names_c)

inputs_per_library = [[1], [1], [0]]
tensor_array = [[1, 1, 1]]

feature_library = GeneralizedLibrary(
    [lib_a, lib_b, lib_c],
    tensor_array=tensor_array,
    inputs_per_library=inputs_per_library,
)

feature_library.fit(np.array([[1, 2]])).get_feature_names(input_features = ["x", "y"])

Expected:

['a(y)', 'b(y)', 'c(x)', 'a(y) b(y) c(x)']

Actual:

['a(y)', 'b(y)', 'c(x)', 'a(y) b(y) c(y)']

And testing that the functions are applied as written:

feature_library.fit_transform(np.array([[1, 2]]))

Expected:

AxesArray([[ 2,  2,  1, 4]])

Actual

AxesArray([[ 2,  2,  1, 8]])