dynamicslab / pysindy

A package for the sparse identification of nonlinear dynamical systems from data
https://pysindy.readthedocs.io/en/latest/
Other
1.4k stars 308 forks source link

Implementing Autograd instead of numeric differentiation methods #268

Closed juanit0o closed 1 year ago

juanit0o commented 1 year ago

I'm doing some tests with SINDy with both first and second order derivatives and fortunately, applying SINDy with Weak Library to discover 1st order PDEs works perfectly. However, I am having a lot of difficulties (impossible until now) on finding the correct 2nd order derivatives for the data I'm using. Due to my understanding, I believe this might be related to the fact that SINDy only uses numeric based differentiation methods.

Firstly, I tried the various numerical differentiation methods (spectral, smoothed finite diff, spline..) that are available with SINDy but the results didn't show any particular improvements. My idea was to implement another differentiation method using autograd. However, when trying to do so, I'm confused because the _differentiate method is never called (for any of the methods) and I'm not sure how can I do so.

I have already added to __init__.py the new AutogradDerivative class (which extends BaseDifferentiation) and implemented the _differentiate method as obliged but as it isn't being called anywhere, there isn't much use to this. I have also looked into the code of weak_pde_library.py where I noticed it was calling FiniteDifference and tried to change it to my newly created class but nothing happened.

I'm not sure if this is the best place to ask this but I've searched everywhere and didn't find anything that could help. I would be greatly appreciated with any help on how to implement this new differentiation_method. Thank you.

akaptano commented 1 year ago

Great question. A few points:

  1. You are correct about needing numeric based differentiation methods, although I believe there is no problem using the weak form for second-order derivative terms. If this is fine for your application, we can help with that.
  2. I believe _differentiate is called implicitly when the sklearn pipeline is invoked. You are correct that this makes it impossible to use autograd, as explained below.

I think there is a bigger issue with trying Autograd -- I think you need to do SINDy symbolically so that you can compute all the derivatives symbolically. So you would need to build a fair bit of extra functionality, including symbolic libraries rather than numeric ones. Kadierdan Kaheman has done some work using symbolic SINDy. This would be really cool but might be a lot of work.

juanit0o commented 1 year ago

First of all, thank you so much for your time answering, it really is something that has been bugging me for quite a bit of time. Considering that SINDy with the weak form should be able to find second order PDEs with minimal problems, my use case doesn't really need autograd :). I will try to be as concise as possible on explaining my problem and what I have done to try to solve it to see if there is anything I'm doing wrong which might affect negatively the results.

I'm working with a dataset with the positions and velocities of sattelites that consists of 6 columns (3 for the position in X,Y,Z and 3 for the velocities in X,Y,Z) and an extra column with the corresponding times where it has been measured. This dataset has 537 measurements (1 measurement per day).

Here is what I tried to do. Due to knowing the terms that will have to be on the equations, I defined the function library with the appropriate candidate functions. I've read all of the optimization-related examples on this github as well as the pysindy-doc website and tried to apply all that would make sense here with this kind of data as well as tinkering with their respective hyperparameters but unfortunately it didn't result in anything. Given the code bellow, am I possibly making something wrong that might affect the results? Considering I'm already using basically "perfect" data generated by odeint, I don't really understand how is SINDy not able to identify those equations. I'm sorry for the wall of text and I appreciate any help, thank you once again for your attention.

Generating "artificial" data with odeint


def functionToIntegrate(valuesList, time):
    posX = valuesList[0]
    posY = valuesList[1]
    posZ = valuesList[2]
    velX = valuesList[3]
    velY = valuesList[4]
    velZ = valuesList[5]
    standardGravitationalParameter = 3.986004418 * 10**(14)
    return [velX, 
            velY,
            velZ, 
            -standardGravitationalParameter * posX/(posX**2 + posY**2 + posZ**2)**(3/2), 
            -standardGravitationalParameter * posY/(posX**2 + posY**2 + posZ**2)**(3/2), 
            -standardGravitationalParameter * posZ/(posX**2 + posY**2 + posZ**2)**(3/2)]

#first line of the dataset (positions and velocities)
intialState = dataSindy.iloc[0].to_list()

#Time generated from the first timestamp onwards (during 3 hours)
time = t_train.copy().flatten().tolist()[0] + np.linspace(0, 10800, 573)

#rows = timesteps
#columns = 6 variables
sol = odeint(functionToIntegrate, y0=intialState, t = time, tfirst=False)

Feeding that artificial data to weak SINDy

#positionX, positionY, positionZ (573 rows)
dataSindy = sol.copy()[:,:3]

#time for each measurement
t_train = time.copy()

############## Functions ################                
library_functions = [lambda a, b, c : a/(a**2 + b**2 + c**2)**(3/2),
                    lambda a, b, c : b/(a**2 + b**2 + c**2)**(3/2),
                    lambda a, b, c : c/(a**2 + b**2 + c**2)**(3/2)]

library_function_names = [lambda a, b, c: a + "/((" + a + "**2 + " + b + "**2 + " + c + "**2)**(3/2))",
                        lambda a, b, c: b + "/((" + a + "**2 + " + b + "**2 + " + c + "**2)**(3/2))",
                        lambda a, b, c: c + "/((" + a + "**2 + " + b + "**2 + " + c + "**2)**(3/2))"]

#####################################
pde_lib = ps.WeakPDELibrary(
    library_functions=library_functions,
    function_names=library_function_names,
    spatiotemporal_grid=t_train,
    derivative_order=2,
    is_uniform=True,
    K=1000
)

############# Optimizers ################
optimizerTest = ps.SSR() 
model2nd = ps.SINDy(feature_names=["posX", "posY", "posZ"], optimizer= optimizerTest, 
                feature_library=pde_lib)
model2nd.fit(dataSindy, t=t_train, ensemble=True, quiet = True)
model2nd.print()
OUTPUT
(posX)' = 14754721393892263251450789888.000 posX/((posX**2 + posY**2 + posZ**2)**(3/2)) + -20511467901215643540140326912.000 posY/((posX**2 + posY**2 + posZ**2)**(3/2)) + -3554508483856439226157498368.000 posZ/((posX**2 + posY**2 + posZ**2)**(3/2))
(posY)' = 11399607816129243415742251008.000 posX/((posX**2 + posY**2 + posZ**2)**(3/2)) + -15847313111951032907328389120.000 posY/((posX**2 + posY**2 + posZ**2)**(3/2)) + -2746239770531404375132209152.000 posZ/((posX**2 + posY**2 + posZ**2)**(3/2))
(posZ)' = -4535334905196806730346921984.000 posX/((posX**2 + posY**2 + posZ**2)**(3/2)) + 6304854822750467758405713920.000 posY/((posX**2 + posY**2 + posZ**2)**(3/2)) + 1092591718067720505161416704.000 posZ/((posX**2 + posY**2 + posZ**2)**(3/2))

EXPECTED OUTPUT
(velX)' = -398600441595072.188 posX/((posX**2 + posY**2 + posZ**2)**(3/2))
(velY)' = -398600440967606.688 posY/((posX**2 + posY**2 + posZ**2)**(3/2))
(velZ)' = -398600441014742.500 posZ/((posX**2 + posY**2 + posZ**2)**(3/2))
akaptano commented 1 year ago

Going to take some time to work through this, but first error I think is that "derivative_order=2" means that the library will have derivative terms (integrated into the weak formulation) up to second order (which I don't think you are interested in), NOT that you are trying to fit a second order differential equation! To use second-order derivatives on the left-hand-side of the SINDy fit, you need to pass them during "model2nd.fit(dataSindy, x_dot=dataSINDy_second_derivative, t=t_train, ensemble=True, quiet = True)." I recommend trying this first with the regular CustomLibrary before trying it with the WeakPDELibrary, if this is perfect data, since the WeakPDELibrary solution is more difficult and you don't seem to be using derivative terms in your equations.

This is because I believe (though I'm not sure) you will need to pass the weak-form version of the second derivative to model.fit, which must be computed and passed to model.fit as the 'x_dot' parameter. Unfortunately, we need @znicolaou for this because I'm not entirely sure how to compute higher order weak form terms in the new weak form notation we have been using.

Another thing to consider: the gravitational constant is ginormous, so large that it might be affecting numerical precision in various places. I recommend rescaling this toy example so that it is equal to 1 or something.

znicolaou commented 1 year ago

Hey @juanit0o, just to chime in quickly now, I agree with @akaptano that you ought to try the CustomLibrary first, if only just to learn about how to implement things before trying the WeakPDELibrary. It won't be very hard to do the second-order model in either case, but you may as well try simpler first.

That being said, there is both a general approach and a specialized approach for the WeakPDELibrary.

For the general approach, you need to augment your system with auxiliary variables, as one does with momentum say in Hamiltonian mechanics. So if your data is in a matrix $m\times n$ matrix $X$ (with $m$ time samples and $n$ coordinates) and you expect a model like $\ddot{X} = \Theta \cdot X$, then define a new $m \times 2n$ matrix $Y=(X,\dot{X})$, and look for a model $\dot{Y}=\Psi \cdot Y$. The first $n$ equations in the SINDy model should be trivial $\dot{Y} i = Y {2i}$, and the last $n$ will encode the second-order model.

For the specific solution, you can use implicit_terms=True in the WeakPDELibrary, which will then include second-order temporal derivatives when you use derivative_order=2. That solution is more technical and there may be pitfalls, so be careful in interpreting results if you try it.

We can give more detailed feedback if you have trouble implementing these approaches.

Jacob-Stevens-Haas commented 1 year ago

For what it's worth, _differentiate is called here

Basically, it allows any object inheriting BaseDifferentiation to be used as a function.

One reason WeakPDELibrary sticks to finite difference for spatial derivatives rather than other differentiators is that most other methods smooth the coordinates implicitly. Smoothing in multiple directions breaks the assumptions of the derivative method.

In general, consider adding a new differentiation class if you have an example of a data trajectory - leaving out the SINDy part - that the existing derivative methods fail on.

juanit0o commented 1 year ago

Thank you all so much. I was really mistaken on the meaning of the derivative_order argument in the library used. I also understand the difficulty of calculating the second order derivative with its weak form to pass it as an argument for the fit function. I will try to see if I can discover the equations I want with implicit_terms as True. Concerning that _differentiate call, I now understand that it didn't work with the WeakPDE because the finite difference _differentiate is always called but using the CustomLibrary I should be able to invoke a new differentiation class I define. Concerning the high order magnitude coefficient, I will also try to reduce its magnitude for testing purposes. I will try to implement it but once again thank you for your help and guidance.

juanit0o commented 1 year ago

Trying to calculate the second order derivative I'm facing an error that I'm not sure where it might arise from. Making use of the x_dot argument (https://pysindy.readthedocs.io/en/latest/tips.html#numerical-differentiation) in the fit() function, I am getting a shape error that I'm not sure where it might come from as the shapes seem correct. How is it possible to overcome this problem? Thank you once again.


pde_lib2ndDeriv = ps.WeakPDELibrary(
    library_functions=library_functions2ndDeriv,
    function_names=library_function_names2ndDeriv,
    spatiotemporal_grid=time,
    derivative_order=2,
    implicit_terms=True,
    is_uniform=True,
    K=1000
)

optimizerTest = ps.SSR()
model2nd = ps.SINDy(feature_names=["posX", "posY", "posZ"], optimizer= optimizerTest, feature_library=pde_lib2ndDeriv)
secondOrderDeriv = ps.FiniteDifference(order=1)._differentiate(firstOrderDeriv, t=time)

#fit raises error: "operands could not be broadcast together with shapes (1000,) (573,) "
#but np.shape(dataSindy) = (573, 3), np.shape(secondOrderDeriv) = (573, 3), np.shape(time) = (573, )
# I dont understand where 1000 comes from

model2nd.fit(dataSindy, t=time, ensemble=True, quiet = True, x_dot = secondOrderDeriv)
model2nd.print()

I also noticed that there is a function for simulating with the equations given by SINDy which can be used directly with odeint but applying it to the first derivative case where everything is correct I'm getting an index out bounds error.

optimizerKms = ps.FROLS(kappa=1e-29)
model = ps.SINDy(feature_names=["posX", "posY", "posZ", "velX", "velY", "velZ"], optimizer= optimizerKms, feature_library=pde_lib1stDeriv)
model.fit(sol1stDerivativeKm, t=time, ensemble=True, quiet=True)
model.print()

#OUTPUT
#(posX)' = 1.000 *velX
#(posY)' = 1.000 *velY
#(posZ)' = 1.000 *velZ
#(velX)' = -398600.442 *posX/((posX**2 + posY**2 + posZ**2)**(3/2))
#(velY)' = -398600.441 *posY/((posX**2 + posY**2 + posZ**2)**(3/2))
#(velZ)' = -398600.441 *posZ/((posX**2 + posY**2 + posZ**2)**(3/2))

#ERROR
#--> 842             self.x_k = [x[np.ix_(*self.inds_k[k])] for k in range(self.K)]
#    843 
#    844             # library function terms
#IndexError: index 280 is out of bounds for axis 0 with size 1
model.simulate(x0=sol1stDerivativeKm[0], t=time, integrator='odeint')

#x0 = [-9.06200977e+02 -1.80616337e+03  6.66092629e+03 -6.09630503e+00 -4.05100447e+00 -1.92916516e+00]
#time = float array of times with length 573
Jacob-Stevens-Haas commented 1 year ago

Can you provide the full traceback?

Also as a hint: if you're worried about conciseness & posting long messages, there's a spoiler tag that may be helpful:


<details>
<summary>
Title
</summary>

Hidden long traceback (can be formatted with backticks, but need newlines beforehand)
</details>

Appears as:

Title Hidden long traceback (can be formatted with backticks, but need newlines beforehand)
juanit0o commented 1 year ago

Surely. I am not sure why the problem arises but if I remove the feature library (from using weak to not using that extra argument) the code works (although giving unpleasant equations) so it has something to do with the weak library. Both the .simulate and extra x_dot argument errors disappear when I don't use the feature library argument. Below is the traceback for when I try to use the x_dot argument. Thank you.

```python pde_lib1stDeriv = ps.WeakPDELibrary( library_functions=library_functions1stDeriv, function_names=library_function_names1stDeriv, spatiotemporal_grid=time, derivative_order=1, is_uniform=True, K=500) optimizerTest = ps.SSR() #raises error model3rd = ps.SINDy(feature_names=["posX", "posY", "posZ", "velX", "velY", "velZ"], optimizer= optimizerTest, feature_library=pde_lib1stDeriv) #does not raise error #model3rd = ps.SINDy(feature_names=["posX", "posY", "posZ", "velX", "velY", "velZ"], optimizer= optimizerTest) model3rd.fit(sol1stDerivative, t=time, ensemble=True, quiet = True, x_dot=np.array(x_dotCalculated)) model3rd.print() ```
```python Output exceeds the [size limit](command:workbench.action.openSettings?[). Open the full output data [in a text editor](command:workbench.action.openLargeOutput?3a374387-e6d0-4e96-8146-456279b942eb) --------------------------------------------------------------------------- ValueError Traceback (most recent call last) in 77 optimizerTest = ps.SSR() 78 model3rd = ps.SINDy(feature_names=["posX", "posY", "posZ", "velX", "velY", "velZ"], optimizer= optimizerTest, feature_library=pde_lib1stDeriv) ---> 79 model3rd.fit(sol1stDerivative, t=time, ensemble=True, quiet = True, x_dot=np.array(x_dotCalculated)) 80 model3rd.print() 81 # data3rdSimulated = model3rd.simulate(sol1stDerivative.iloc[0].to_numpy(), t=time) s:\Anaconda\Anaconda\lib\site-packages\pysindy\pysindy.py in fit(self, x, t, x_dot, u, multiple_trajectories, unbias, quiet, ensemble, library_ensemble, replace, n_candidates_to_drop, n_subset, n_models, ensemble_aggregator) 412 warnings.filterwarnings(action, category=LinAlgWarning) 413 warnings.filterwarnings(action, category=UserWarning) --> 414 self.model.fit(x, x_dot) 415 416 # New version of sklearn changes attribute name s:\Anaconda\Anaconda\lib\site-packages\sklearn\pipeline.py in fit(self, X, y, **fit_params) 333 if self._final_estimator != 'passthrough': 334 fit_params_last_step = fit_params_steps[self.steps[-1][0]] --> 335 self._final_estimator.fit(Xt, y, **fit_params_last_step) 336 337 return self s:\Anaconda\Anaconda\lib\site-packages\pysindy\optimizers\sindy_optimizer.py in fit(self, x, y) 53 def fit(self, x, y): 54 ---> 55 x, y = drop_nan_samples( 56 AxesArray(x, {"ax_sample": 0, "ax_coord": 1}), 57 AxesArray(y, {"ax_sample": 0, "ax_coord": 1}), s:\Anaconda\Anaconda\lib\site-packages\pysindy\utils\base.py in drop_nan_samples(x, y) 127 x_good_samples = (~np.isnan(x)).any(axis=x_non_sample_axes) 128 y_good_samples = (~np.isnan(y)).any(axis=y_non_sample_axes) --> 129 good_sample_ind = np.nonzero(x_good_samples & y_good_samples)[0] 130 x = x.take(good_sample_ind, axis=x.ax_sample) 131 y = y.take(good_sample_ind, axis=y.ax_sample) s:\Anaconda\Anaconda\lib\site-packages\numpy\lib\mixins.py in func(self, other) 19 if _disables_array_ufunc(other): 20 return NotImplemented ---> 21 return ufunc(self, other) 22 func.__name__ = '__{}__'.format(name) 23 return func s:\Anaconda\Anaconda\lib\site-packages\pysindy\utils\axes.py in __array_ufunc__(self, ufunc, method, out, *inputs, **kwargs) ... ---> 83 results = super().__array_ufunc__(ufunc, method, *args, **kwargs) 84 if results is NotImplemented: 85 return NotImplemented ValueError: operands could not be broadcast together with shapes (500,) (573,) ```
znicolaou commented 1 year ago

Hi @juanit0o, I think you need to refer to some of the publications on the weak form, such as Reinbold, Patrick AK, Daniel R. Gurevich, and Roman O. Grigoriev. "Using noisy or incomplete data to discover models of spatiotemporal dynamics." Physical Review E 101, no. 1 (2020): 010203. The reason you get shape errors is because in the weak formulation, the supplied x_dot needs to be calculated as the weak features, e.g. using the calc_trajectory function from a WeakPDELibrary object. The weak features are integrals of the input data over K randomly placed spatiotemporal subdomains, so there are only K rows in the weak features rather than the number of spatiotemporal points in the differential formulation. If you really delve into the details of the library, you can edit the convert_u_dot_integral function to calculate second-order temporal derivative weak feature if you want, but you'll have to understand how the weak formulation works first.

Jacob-Stevens-Haas commented 1 year ago

Thanks for posting the complete traceback - it helped knowing which function was raising the error. I also made a mistake - the summary tags should be inside the details tag.