dynamicslab / pysindy

A package for the sparse identification of nonlinear dynamical systems from data
https://pysindy.readthedocs.io/en/latest/
Other
1.36k stars 304 forks source link

SINDy-PI for a system of ODEs #511

Open mathias-marques opened 1 month ago

mathias-marques commented 1 month ago

Hello, I am having some trouble using SINDy-PI. I used the following system of equations to generate data. I conducted 10 runs of this system of ODEs with different initial conditions

$$\begin{aligned} \frac{dX_v}{dt} &= (\mu - k_d) X_v \ \frac{dX_d}{dt} &= k_d Xv - K{\text{lysis}} Xd \ \frac{dG}{dt} &= \left( \frac{-\mu}{Y{xg}} - m_g \right) Xv \ \frac{dL}{dt} &= -Y{lg} \left( \frac{-\mu}{Y_{xg}} - m_g \right) Xv\ \ \ \text{With } \mu \text{ and kd equal to :} \ \mu &= \mu{\text{max}} \left( \frac{G}{K_g + G} \right) \left( \frac{K_l}{K_l + L} \right) \ kd &= k{d{\text{max}}} \left( \frac{k\mu}{\mu + k_\mu} \right) \ \end{aligned}$$

here is a plot of one of the runs :

image

After generating data from these 10 runs, my goal is to use SINDy-PI to recover the equations mentioned earlier. Below is the code utilized to achieve this task.

Reproducing code example:

import numpy as np
from scipy.integrate import solve_ivp
import matplotlib.pyplot as plt
import pysindy as ps
import random

"""
Data generation
"""

nbdata=10
# Parameters for CHO growth model
mumax  = 0.035          # h^-1
Klysis = 4*10**(-2)     # h^-1
Yxg    = 9.23*10**7     # cell/mmol
mg     = 8*10**(-13)    # mmol/cell*h 1.1*10**(-14)
Ylg    = 1.6            # ///
Kg     = 1              # mM
Kl     = 150            # mM
kdmax  = 0.01           # h^-1
kmu    = 0.01           # h^-1
parameter_cell_growth=(mumax, Klysis, Yxg, mg, Ylg, Kg, Kl, kdmax, kmu)
def CHO_growth(t, z, mumax, Klysis, Yxg, mg, Ylg, Kg, Kl, kdmax, kmu):
    Xv, Xd, G, L = z
    mu  = mumax * (G/(Kg+G)) * (Kl/(Kl+L))
    kd  = kdmax * (kmu/(mu+kmu))
    dXv = (mu-kd)*Xv
    dXd = kd*Xv - Klysis*Xd
    dG  = ((-mu/Yxg)-mg)*Xv
    dL  = -Ylg*((-mu/Yxg)-mg)*Xv 

    return [ dXv, dXd, dG, dL]

#Time
dt=1
T=180
t = np.arange(0, T + dt, dt)
t_span = (t[0], t[-1])

data=[]
timing=[]

for i in range(nbdata):

    Xvinit = random.uniform(1*10**8, 2*10**8)
    Xdinit = random.uniform(5*10**5, 5*10**6)
    Ginit  = random.uniform(20, 35) 
    Linit  = 0
    CI=[Xvinit, Xdinit, Ginit, Linit]
    sol = solve_ivp(CHO_growth, t_span, CI, t_eval=t, args=parameter_cell_growth, dense_output=True, method="Radau", max_step=1)
    data.append(sol.y.T)
    timing.append(sol.t)

"""
SINDy-PI
"""
# Initialize custom SINDy library so that we can have x_dot inside it. 
library_functions = [
    lambda x: x,
    lambda x, y: x * y,
    lambda x: x ** 2,
    lambda x, y, z: x * y * z,
    lambda x, y: x * y ** 2,
    lambda x: x ** 3,
    lambda x, y, z, w: x * y * z * w,
    lambda x, y, z: x * y * z ** 2,
    lambda x: x ** 4,
    lambda x, y: x ** 3 * y,
    lambda x, y: x ** 2 * y ** 2,
    lambda x, y: x * y ** 3,

]

# library function names includes both 
# the x_library_functions and x_dot_library_functions names
x_dot_library_functions = [lambda x: x]
function_names = [
    lambda x: x,
    lambda x, y: x + y,
    lambda x: x + x,
    lambda x, y, z: x + y + z,
    lambda x, y: x + y + y,
    lambda x: x + x + x,
    lambda x, y, z, w: x + y + z + w,
    lambda x, y, z: x + y + z + z,
    lambda x: x + x + x + x,
    lambda x, y: x + x + x + y,
    lambda x, y: x + x + y + y,
    lambda x, y: x + y + y + y,
    lambda x: x,
]
# Need to pass time base to the library so can build the x_dot library from x
sindy_library = ps.SINDyPILibrary(
    library_functions=library_functions,
    x_dot_library_functions=x_dot_library_functions,
    t=t,
    function_names=function_names,
    include_bias=False,
)

# I specify some index for the submodel [Xv, Xd, G, L, Xv_dot, Xd_dot, G_dot, L_dot]
sindy_opt = ps.SINDyPI(model_subset=[0,1,2,3,55,56,57,58],threshold=0.001,max_iter=200000,normalize_columns=True,tol=1*10**(-8))
model = ps.SINDy(
    optimizer=sindy_opt,
    feature_library=sindy_library,
    differentiation_method=ps.FiniteDifference(drop_endpoints=True),
    feature_names=["Xv", "Xd", "G", "L"],
)
model.fit(data, t=timing, multiple_trajectories=True)
model.print()

Error message:

My issue lies in SINDy's failure to retrieve my original equations. I acknowledge that some parameters are very close to zero, which may contribute to the problem. However, it also fails to identify a model that closely resembles the equations mentioned earlier. The results obtained from the code are as follows:

Xv = -0.011 G_dot + 0.007 L_dot Xd = 0.000 G = -0.004 G_dot + 0.002 L_dot L = 0.000 Xv_dot = -0.046 G_dot + 0.029 L_dot Xd_dot = -0.002 G_dot + 0.001 L_dot G_dot = -0.060 L_dot L_dot = -0.097 G_dot

And if I set the SINDy-PI optimizer flag normalize_columns to False the solver fail and set all the coef to 0.

I attempted to address the issue by reducing the order of my library, starting from a polynomial order of 6 and progressively lowering it to 4. However, this adjustment does not seem to have resolved the problem. I wanted to explore some examples of SINDy-PI usage, but I'm encountering challenges finding Python implementations for system of ODEs. Most available code examples are in MATLAB. Is there a particular reason I should consider using MATLAB code instead ?

PySINDy/Python version information:

PySINDy version : 1.7.5 Python version : 3.11.8