dynamicslab / pysindy

A package for the sparse identification of nonlinear dynamical systems from data
https://pysindy.readthedocs.io/en/latest/
Other
1.36k stars 304 forks source link

[Question] Modelling of closure models #412

Open JR-1991 opened 9 months ago

JR-1991 commented 9 months ago

Hello! I'm new to PySindy and I'm really impressed with the library so far.

I'm trying to recover a symbolic expression of a closure model that was found by a neural network. I have access to the rates for the original symbolic model $\dot{y}$, closure $\dot{y{nn}}$ and combined model $\dot{y{full}}$. My plan is to create a "biased" model in PySindy, where the objective is to find an expression that predicts the rates of the combined model $\dot{y_{full}}$ with the addition/bias of the original model rates $\dot{y}$.

I am not sure how to proceed with this approach as I don't have much experience using PySindy. Could you please let me know if there is a way or a feature library that I can use to incorporate these rates as zero-order terms? I have already looked into the identity library, but it hasn't helped me yet.

Thank you in advance for your assistance!

Jacob-Stevens-Haas commented 9 months ago

Hey JR, thanks for your question! Using pysindy to get an explainable counterpart to a NN is a cool use case! I am not super familiar with closure models, so it would help to share how you expect $\dot y$, $\dot y{nn}$, and $\dot y{full}$ to be related. Either in an equation, a description of what quantities they represent, or both. As a case in point, does "rate" refer to the underlying quantity $y$ or the derivative $\dot y$? I'm also unsure, when you say "rates for the original symbolic model", do you know the coefficients for that model and you have simulation data, or are you looking for coefficients and you merely have data?

IIUC, you're looking for the $f$ in $\dot y{full} = f(y{full}, y)$ with (a) $y{full} = y + y{nn}$ (b) known $\dot y=g(y)$, and (c) data for both $y{full}$ and $y$. If so, there's a few ways you can go about this: 1) Train on $y{full}$ and use $y$ as a control input. 2) Train on $y_{full}$ but add constraints to the known terms from $\dot y$ (See ConstrainedSR3 optimizer) 3) In general, as a NN is a function approximation and you're trying to use pysindy to fit an approximation of that approximation, could you not train pysindy directly on the data used in the NN training?

JR-1991 commented 9 months ago

Hi @Jacob-Stevens-Haas, thanks for the quick answer! Happy to elaborate the use case.

I've created a theoretical model of a chemo-enzymatic reaction network and used conventional fitting algorithms to achieve a good fit. However, there are still some residuals to the experimental data that I've addressed by modeling them with a neural network. The neural network has discovered something the theoretical model needs to include. My objective now is to use PySindy to recover these relationships symbolically. The closure model can be expressed as follows:

$$\frac{dy}{dt} = f(\theta, y, t) + f_{NN}(\zeta, y, t)$$

where I want to symbolically regress $f_{NN}$ and learn new relations. My approach was to supply $f(\theta, y, t)$ as a static zero-order term and use the $dy/dt$ as x_dot such that PySindy can figure out the residual $f_{NN}(\zeta, y, t)$. I hope that helped in explaining the use case.

In general, as a NN is a function approximation and you're trying to use pysindy to fit an approximation of that approximation, could you not train pysindy directly on the data used in the NN training?

The theoretical model includes latent states, which are not observable but can be expressed in terms of the features using Steady-State assumptions. I tried using PySindy directly, but the outcomes did not make physical sense or were hard to interpret. Thus, my thought was to stay with the model, which has a good fit already, and use the NN to find out what's missing as I imagine the results are less complex.

Thanks for the other suggestions! I tested the control input, but it seems that it is distributed across all features. For instance, I receive something like x0'(t) = 0.2.u0 + 0.1 u1 + 0.01 x0. However, I would like to impose a constraint on the process, allowing only one control input per feature. This would be a desired output:

x0'(t) = 0.2 u0 + 0.01 x0
x1'(t) = 0.2 u1 + 0.01 x0

Is it possible to use the 'ConstrainedS3' optimizer to achieve this? I have only found documentation for the input vector thus far and would appreciate learning how to extend it to the control input.

Thanks for the help already! :-)

Jacob-Stevens-Haas commented 9 months ago

Applying control additively

You can use a GeneralizedLibrary to combine libraries and only apply the libraries to certain features. To wit, if

Constraining control feature coefficients

ConstrainedSR3 uses constraint_lhs and constraint_rhs matrices to constrain terms with inequality constraints. e.g., in the equations $$\dot x1=\Theta{11} f_1(x1) + \Theta{12} f_2(x_2),~~\newline \dot x2=\Theta{21} f_1(x1) + \Theta{22} f_2(x2),$$ If you know $\Theta{11}=.1$ and $\Theta{22} = 2$, you would have WIP, posting to save since my laptop is dying