Error of big-sparsity parameter

yewalenikhil65 commented 4 years ago

I have an x_train array of shape 801 x 7 (7 state variables, 801 time instances), and the derivative array Dx_train has the same shape. I am getting the following big-sparsity-parameter warning even with a very small threshold value, along with a warning about an ill-conditioned matrix. I am new to PySINDy and would appreciate help understanding the errors in more detail.

briandesilva commented 4 years ago

Requesting help to understand the error in more detail.

The UserWarning is telling you that some of the learned equations are all 0's. With the STLSQ optimizer, the threshold you set will determine the smallest coefficient that is retained. You probably shouldn't need to set it as low as 0.00005. See the suggestions below for ways of dealing with this issue.

The LinAlgWarning is saying that the matrix of library functions, Theta(X), probably has columns that are very close to linearly dependent. This can occur if your state variables are highly correlated or can be expressed as polynomial functions of other state variables (since you're using a polynomial library). I've seen the latter come up when the exact solution of the underlying differential equations involved sines and cosines due to trig identities (e.g. if x1(t) = sin(t) and x2(t) = cos(t) are solutions, then x1^2 + x2^2 = 1, so the columns of Theta(X) corresponding to 1, x1^2, and x2^2 will be linearly dependent).
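The sin/cos example can be checked numerically. The snippet below is a minimal NumPy sketch (it does not call PySINDy, and the time grid is an arbitrary choice for illustration) that builds a degree-2 polynomial library by hand and looks at its rank and condition number.

```python
import numpy as np

# Illustrative data: x1 = sin(t), x2 = cos(t) on an arbitrary time grid.
t = np.linspace(0, 10, 801)
x1, x2 = np.sin(t), np.cos(t)

# Hand-built degree-2 polynomial library: 1, x1, x2, x1^2, x1*x2, x2^2.
theta = np.column_stack([np.ones_like(t), x1, x2, x1**2, x1 * x2, x2**2])

# Because x1^2 + x2^2 = 1, the columns 1, x1^2 and x2^2 are linearly dependent,
# so the numerical rank is lower than the number of columns.
rank = np.linalg.matrix_rank(theta)
cond = np.linalg.cond(theta)
print(f"columns: {theta.shape[1]}, numerical rank: {rank}, condition number: {cond:.2e}")
```

A very large condition number on your own Theta(X) would point to the same kind of near-dependence.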

A few things worth trying:

Normalize the features in the feature library with ps.STLSQ(normalize=True). Different scales in input features can affect the relative sizes of coefficients.
For the poor conditioning problem: increase the regularization strength, alpha, with ps.STLSQ(alpha=0.1). You can try different values of alpha and see what works best.
Experiment with different optimizers: SR3 or maybe a Scikit-learn method.
You may need to tweak the way you're computing Dx_train. If the data are noisy, be sure to use a differentiation method that is robust to noise.
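To make the roles of threshold and alpha concrete, here is a rough NumPy sketch of sequentially thresholded ridge regression for a single state variable. It is a simplified illustration, not PySINDy's STLSQ implementation; the function name stlsq_sketch and the synthetic data are made up for this example.

```python
import numpy as np

def stlsq_sketch(theta, dxdt, threshold=0.1, alpha=0.05, max_iter=20):
    """Sequentially thresholded ridge regression (illustrative sketch only)."""
    theta = np.asarray(theta, dtype=float)
    dxdt = np.asarray(dxdt, dtype=float)
    n_features = theta.shape[1]
    active = np.ones(n_features, dtype=bool)
    coef = np.zeros(n_features)
    for _ in range(max_iter):
        if not active.any():
            break  # every term was thresholded away: the "all zeros" case
        a = theta[:, active]
        # Ridge step: alpha adds to the diagonal and improves conditioning.
        gram = a.T @ a + alpha * np.eye(a.shape[1])
        coef = np.zeros(n_features)
        coef[active] = np.linalg.solve(gram, a.T @ dxdt)
        # Thresholding step: drop coefficients smaller than the threshold.
        keep = np.abs(coef) >= threshold
        coef[~keep] = 0.0
        if np.array_equal(keep, active):
            break  # support stopped changing
        active = keep
    return coef

# Synthetic example: dx/dt = 2*x0 - 3*x2 with library [1, x0, x1, x2].
rng = np.random.default_rng(0)
x = rng.normal(size=(200, 3))
theta_demo = np.column_stack([np.ones(200), x])
dxdt_demo = 2.0 * x[:, 0] - 3.0 * x[:, 2]

coef_ok = stlsq_sketch(theta_demo, dxdt_demo, threshold=0.5, alpha=0.05)
coef_zero = stlsq_sketch(theta_demo, dxdt_demo, threshold=10.0, alpha=0.05)
print("moderate threshold:", coef_ok)
print("threshold too large:", coef_zero)
```

When the threshold exceeds every fitted coefficient, the whole equation is zeroed out, which is the situation the UserWarning describes.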

yewalenikhil65 commented 4 years ago

The Dx_train derivatives of the state variables are obtained from the ODE system model itself (which is stiff) by plugging in the solution and evaluating the RHS. I am trying to recover the ODE system with SINDy, but have been unable to do so. I notice that there are indeed some highly correlated state variables in my x_train (they have nearly the same qualitative variation with time). I have 7 states in x_train = [x1(t),x2(t),x3(t),x4(t),x5(t),x6(t),x7(t)]. The following code, I guess, includes all pairwise products x*y of the 7 state variables (some of which are again correlated).

library_functions = [
    lambda x: x,
    lambda x, y: x * y,
]
library_function_names = [
    lambda x: x,
    lambda x, y: '(' + x + '*' + y + ')',
]

But suppose that, of all the combinations, I wish to select only 2 or 3 specific ones, such as x2*x3 or x3*x4, which are not strongly correlated. How do I proceed to avoid including all the combinations?
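One way to picture a library restricted to hand-picked products is to assemble the matrix directly. This is a plain NumPy sketch that does not use PySINDy's library classes; the helper selected_product_library and the random demo data are hypothetical and only illustrate the idea.

```python
import numpy as np

def selected_product_library(x, pairs):
    """Build [x1, ..., xn, selected products] from 1-based index pairs.

    Hypothetical helper for illustration; not part of PySINDy.
    """
    x = np.asarray(x, dtype=float)
    n_states = x.shape[1]
    columns = [x[:, i] for i in range(n_states)]
    names = [f"x{i + 1}" for i in range(n_states)]
    for i, j in pairs:
        # Convert 1-based state labels (x2, x3, ...) to 0-based columns.
        columns.append(x[:, i - 1] * x[:, j - 1])
        names.append(f"(x{i}*x{j})")
    return np.column_stack(columns), names

# Random stand-in for x_train with the same shape (801 samples, 7 states).
x_demo = np.random.default_rng(0).random((801, 7))
theta, names = selected_product_library(x_demo, pairs=[(2, 3), (3, 4)])
print(theta.shape, names)
```

The resulting matrix has the 7 linear terms plus only the chosen products, and could be handed to whichever sparse regression routine is being used.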

briandesilva commented 4 years ago

But suppose that, of all the combinations, I wish to select only 2 or 3 specific ones, such as x2*x3 or x3*x4, which are not strongly correlated. How do I proceed to avoid including all the combinations?

Did anything I suggested help at all? Increasing the regularization parameter, alpha, should help. If the system is stiff, it is likely that there are at least two different timescales coming into play, so you may be able to improve your results by using a non-uniform time sampling scheme. See this paper for further information. You may also be able to get away with simply increasing the sampling rate.

yewalenikhil65 commented 4 years ago

Hello sir, Thanks for pointing me to the paper, I will go through it asap. Also will check the effect of increasing sampling rate.

Based on the earlier suggestions, here is what I found.

Increasing alpha and setting normalize=True produced some non-zero (but inaccurate) terms for a couple of state variables, but the equations for the other state variables are still all zero.
SR3 (and also STLSQ with large alpha) does not give any error or warning, but gives results that cannot be explained physically (the terms printed by the model are not possible).
Lasso also produced some non-zero (but inaccurate) terms for a couple of state variables, but it warns about increasing the maximum number of iterations even after setting it to 2000.

Here is an explanation of the model. [x1(t), x2(t),...,x7(t)] are state variables taken from a high-dimensional full chemical-reaction model that has many more state variables. It has been shown in the literature that a reduced-order model can be constructed using only these 7 state variables. The equivalent reduced-order model, with only 4 parameters, is given below (it only approximately represents the full model).

[k1, k2, k3, k4] = [6.93938e+20 , 3.49444e+03 ,1.48266e+08, 2.82183e-02] 
# inaccurate parameters for reduced order model, obtained from optimization methods like curve fitting

[x1 ,x2,x3,x4,x5,x6,x7] @(t =0) = [100e-12, 0.0, 1.6e-07 , 2.0e-08 , 0.0 , 1.4e-06, 0.0 ] 
# units (M)

t_span = 0:800;   # time span for ODE system, 800 s

x1' = −k1 (x1 x2 x3 x4)
x2' = −k1 (x1 x2 x3 x4) + k2 (x1 x3 x6) + k3 (x5 x6) − k4 x2
x3' = −k1 (x1 x2 x3 x4)
x4' = −k1 (x1 x2 x3 x4)
x5' = k1 (x1 x2 x3 x4)
x6' = −k2(x1 x3 x6) − k3 (x5 x6)
x7' = k4 x2

I was trying to show that the above equations hold using SINDy, which might also produce more accurate values of the parameters [k1,k2,k3,k4], so that the reduced-order model represents the full model accurately.
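For reference, the reduced-order model above can be written as a right-hand-side function. This is a NumPy transcription of the seven equations and the parameter and initial values quoted in this comment; the function name reduced_rhs and the (t, x) argument order are choices made for this sketch.

```python
import numpy as np

# Parameter values and initial condition as quoted above (units: M).
K = np.array([6.93938e+20, 3.49444e+03, 1.48266e+08, 2.82183e-02])
X0 = np.array([100e-12, 0.0, 1.6e-07, 2.0e-08, 0.0, 1.4e-06, 0.0])

def reduced_rhs(t, x, k=K):
    """Right-hand side of the 7-state reduced-order model."""
    x1, x2, x3, x4, x5, x6, x7 = x
    k1, k2, k3, k4 = k
    r1 = k1 * x1 * x2 * x3 * x4  # fourth-degree term
    r2 = k2 * x1 * x3 * x6       # third-degree term
    r3 = k3 * x5 * x6            # second-degree term
    r4 = k4 * x2                 # linear term
    return np.array([-r1, -r1 + r2 + r3 - r4, -r1, -r1, r1, -r2 - r3, r4])

dx0 = reduced_rhs(0.0, X0)
print(dx0)
```

Written this way, it is easy to see that x1', x3' and x4' are identical and equal to -x5', so those four derivative columns carry the same information.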

briandesilva commented 4 years ago

Thanks for the detailed explanation. A couple of things stand out to me here:

Many of the terms you're hoping to learn involve fourth degree terms, so you should use a polynomial library of degree four.
I think you will see a good deal of improvement if you can increase the sampling rate. Even using a naive uniform time sampling strategy, you should be able to get better results. I suspect that one second between samples might be too long for PySINDy to compute an accurate derivative.
If the coefficients k1 - k4 really do span 22 orders of magnitude, SINDy is going to have a hard time picking up the smaller ones. You might consider trying the ConstrainedSR3 optimizer on the SR3Enhanced_variableThresholding branch with custom thresholds for different library terms (#78). This should allow you to set, for example, a small threshold for the x2 term, which you expect to have a smaller coefficient value and a larger threshold for, say, (x1 x2 x3 x4).
Sometimes Orthogonal matching pursuit gives better results than LASSO.
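The effect of the sampling interval on a numerically computed derivative can be illustrated with a toy signal. The sketch below assumes the derivative is estimated by finite differences (np.gradient here, which is not necessarily what PySINDy uses by default) and uses a made-up decaying exponential rather than the chemical model.

```python
import numpy as np

def max_derivative_error(dt, rate=2.0, t_end=10.0):
    """Largest finite-difference error for x(t) = exp(-rate * t) sampled every dt."""
    t = np.arange(0.0, t_end + dt / 2, dt)
    x = np.exp(-rate * t)
    dx_true = -rate * x
    dx_fd = np.gradient(x, t)  # central differences inside, one-sided at the ends
    return np.max(np.abs(dx_fd - dx_true))

err_coarse = max_derivative_error(dt=1.0)
err_fine = max_derivative_error(dt=0.1)
print(f"dt = 1.0 s: max error {err_coarse:.3e}")
print(f"dt = 0.1 s: max error {err_fine:.3e}")
```

If the derivatives are instead taken from the known RHS, as described earlier in this thread, this source of error does not apply, although a finer grid still gives the regression more samples of the fast dynamics.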

yewalenikhil65 commented 4 years ago

Hi, a polynomial library of degree four was used, but I was unable to get the correct result. It really was because the parameters k span many orders of magnitude. Increasing the sampling rate did not help. I also tried tuning the ConstrainedSR3 optimizer with different thresholds for different library terms, but was unable to recover the equations.

Recently, I came across a nice paper that tackles this issue of parameters spanning many orders of magnitude (as in chemical reaction networks). The key is to scale the linear system before minimizing. I am attaching the link here for reference and will check how this fares with SINDy.

Rapid data-driven model reduction of nonlinear dynamical systems including chemical reaction networks using ℓ1-regularization
https://aip.scitation.org/doi/10.1063/1.5139463

briandesilva commented 4 years ago

Thanks for the reference. Setting normalize=True should rescale each of the columns, but you might have better luck doing the rescaling by hand via the approach in the paper you linked. I'm curious to hear whether it works!
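As a rough picture of what rescaling by hand can look like, the sketch below scales each library column to unit norm, solves the least-squares problem, and maps the coefficients back. It is a generic column-scaling example in NumPy, not the specific procedure from the linked paper; the two synthetic columns only mimic the magnitudes of an x2 term and an (x1 x2 x3 x4) term.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100

# Synthetic columns with very different magnitudes (illustrative only).
col_linear = 1e-8 * rng.uniform(0.5, 1.5, n)    # stands in for x2
col_quartic = 1e-30 * rng.uniform(0.5, 1.5, n)  # stands in for x1*x2*x3*x4
theta = np.column_stack([col_linear, col_quartic])

true_coef = np.array([2.82183e-02, 6.93938e+20])  # k4 and k1 from the thread
dxdt = theta @ true_coef

# Scale every column to unit Euclidean norm before solving.
norms = np.linalg.norm(theta, axis=0)
theta_scaled = theta / norms
coef_scaled, *_ = np.linalg.lstsq(theta_scaled, dxdt, rcond=None)

# Undo the scaling to get coefficients for the original columns.
coef = coef_scaled / norms

cond_raw = np.linalg.cond(theta)
cond_scaled = np.linalg.cond(theta_scaled)
print(f"condition number: raw {cond_raw:.2e}, scaled {cond_scaled:.2e}")
print("recovered coefficients:", coef)
```

Any thresholding would then be applied to the scaled coefficients, which are of comparable size, rather than to the raw ones that differ by about 22 orders of magnitude.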

dynamicslab / pysindy

Error of big-sparsity parameter #93