dynamicslab / pysindy

A package for the sparse identification of nonlinear dynamical systems from data
https://pysindy.readthedocs.io/en/latest/
Other
1.42k stars 310 forks source link

a question on how to modeling time dependent ODE #373

Closed penguinaugustus closed 1 year ago

penguinaugustus commented 1 year ago

Hi, i have a question on whether sindy can identify the following ode:

image

where its parameter is a step function that depends on time. I am thinking about perhaps we can approximate the step function with a Hill function. Then we can adapt our function to:

image

Here the larger n is, the closer the approximation would compared with the original function. To solve the problem that time can't exist explicitly on the right hand side of Sindy, i introduce another variable x, which equals to t. then my ground model become:

image

Then i can try to identify the ode sets with sindyPI. To test my idea, i start with n = 5: i generate data with ground model:

image image

which looks like the blue line in the picture:

image

Then i create a PDE library in this way:

library_functions = [
    lambda x: x,
    lambda x, y: x * y,
    lambda x: x ** 5,
    lambda x, y: y * x ** 5,
]

library_function_names = [
    lambda x: x,
    lambda x, y: x +"*" + y,
    lambda x: x + "^5",
    lambda x, y: y +"*"+ x + "^5",
]

sindy_library = ps.PDELibrary(
    library_functions=library_functions,
    temporal_grid=t,
    function_names=library_function_names,
    include_bias=True,
    implicit_terms=True,
    derivative_order=1,
)

in this library, our model can be directly expressed in this way:

image

Then I run the model:

sindy_opt = ps.SINDyPI(
    threshold=1e-6,
    tol=1e-8,
    thresholder="l1",
    max_iter=20000,
    normalize_columns=True,
)

model = ps.SINDy(
    optimizer=sindy_opt,
    feature_library=sindy_library,
    differentiation_method=ps.FiniteDifference(drop_endpoints=True),
    feature_names=["m","x"],
)
model.fit(X, t=t)
model.print()
sindy_library.get_feature_names()

and get the following results

Model  0
Model  1
Model  2
Model  3
Model  4
Model  5
/Users/yufu/opt/anaconda3/lib/python3.9/site-packages/cvxpy/problems/problem.py:1387: UserWarning: Solution may be inaccurate. Try another solver, adjusting the solver settings, or solve with verbose=True for more information.
  warnings.warn(
Model  6
Model  7
Model  8
Model  9
Model  10
Model  11
Model  12
Model  13
Model  14
Model  15
Model  16
Model  17
Model  18
Model  19
Model  20
1 = 0.085 x_t
m = 0.006 m^5m_t
x = 0.002 m^5x_t
m*x = 0.000
m^5 = 0.000
x^5 = 0.000
x*m^5 = 0.000
m_t = 0.052 mm_t + 0.012 mx_t + -0.002 xm_t
x_t = 0.085 1
mm_t = 0.257 m_t + -0.005 mx_t + 0.001 xm_t
mx_t = 0.321 m_t + -0.032 mm_t + 0.001 xm_t
xm_t = -0.212 m_t + 0.023 mm_t + 0.007 mx_t
xx_t = 0.079 m_t + -0.016 mx_t + 0.001 xm_t
m*xm_t = -0.005 1 + -0.012 m_t + -0.005 x_t + -0.003 mx_t
m*xx_t = -0.006 1 + 0.061 m_t + -0.006 x_t + -0.014 mm_t + 0.006 mx_t
m^5m_t = 0.006 m
m^5x_t = 0.002 x
x^5m_t = 0.000
x^5x_t = 0.000
x*m^5m_t = 0.000
x*m^5x_t = 0.000

It seems that the result is not ideal because model one is supposed to be 1 = x_t. The same thing happens after i increase n to 10. Is it because i use normalize_columns=True, or because the polynomial exponent shouldn't be that large?

Sorry for writing such a lengthy question! Any kind of instruction is appreciated. Here is a minimum working example:

import numpy as np
import matplotlib.pyplot as plt
import pysindy as ps
import gurobipy
from scipy.integrate import odeint
from scipy.integrate import solve_ivp

# Define the equations
def equations2(y, t, kt, kd, n, tr, tD):
    m, x = y
    dm_dt = kt + kt*(t**n / (tr**n + t**n)) - kd * m
    #dm_dt = kt  - kd * m
    dx_dt = 1
    return [dm_dt, dx_dt]

kt, kd, tD, tr = 1.26, 0.126, 70, 27

n = 5

m0 = (kt / kd) * (1 - (np.exp(-kd * (tD - tr))) / (2 - np.exp(-kd * tD)))
x0 = 0.0
y0 = [m0, x0]

t = np.linspace(0, 70, 140)

sol = odeint(equations2, y0, t, args=(kt, kd, n, tr, tD))
m = sol[:, 0]
x = sol[:, 1]

X = np.stack((m,t), axis=-1)

library_functions = [
    lambda x: x,
    lambda x, y: x * y,
    lambda x: x ** 5,
    lambda x, y: y * x ** 5,
]

library_function_names = [
    lambda x: x,
    lambda x, y: x +"*" + y,
    lambda x: x + "^5",
    lambda x, y: y +"*"+ x + "^5",
]

sindy_library = ps.PDELibrary(
    library_functions=library_functions,
    temporal_grid=t,
    function_names=library_function_names,
    include_bias=True,
    implicit_terms=True,
    derivative_order=1,
)

sindy_opt = ps.SINDyPI(
    threshold=1e-6,
    tol=1e-8,
    thresholder="l1",
    max_iter=20000,
    normalize_columns=True,
)

model = ps.SINDy(
    optimizer=sindy_opt,
    feature_library=sindy_library,
    differentiation_method=ps.FiniteDifference(drop_endpoints=True),
    feature_names=["m","x"],
)
model.fit(X, t=t)
model.print()
sindy_library.get_feature_names()
Jacob-Stevens-Haas commented 1 year ago

This seems like a case where SINDyPI is the wrong tool for the job. If you have a known value you want to pass with the data but not fit a differential equation for, set it as a control variable (the u in SINDy.fit()). Try a GeneralizedLibrary with libraries PolynomialLibrary(degree=1) for the m variable and a CustomLibrary with an indicator function for the time library. You can control which variables go to which library using the inputs_per_library argument. e.g.


funcs = [lambda t: t > t_r]
lib = GeneralizedLibrary(
    libraries = [PolynomialLibrary(degree=1), CustomLibrary(library_functions=funcs)],
    inputs_per_library = np.array([[0,0],[1,1]])
)
model = SINDy(library=lib)
model.fit(x=m, t=t, u=t)
penguinaugustus commented 1 year ago

Thank you so much for your reply here! I am still a little bit confused about what dose [lambda t: t > t_r] mean here. I run the code in this way:

funcs = [lambda t: t > tr]
lib = ps.GeneralizedLibrary(
    libraries = [ps.PolynomialLibrary(degree=1), ps.CustomLibrary(library_functions=funcs)],
    inputs_per_library = np.array([[0,0],[1,1]])
)
sindy_opt = ps.SINDyPI(
    threshold=1e-6,
    tol=1e-8,
    thresholder="l1",
    max_iter=20000,
    normalize_columns=True,   
)
model = ps.SINDy(optimizer=sindy_opt,feature_library=lib,feature_names=["m","t"])
model.fit(x=m, t=t, u=t)

and only to get:

Model  0
Model  1
Model  2
SINDy(differentiation_method=FiniteDifference(),
      feature_library=<pysindy.feature_library.generalized_library.GeneralizedLibrary object at 0x7f8fd1fd39d0>,
      feature_names=['m', 't'],
      optimizer=SINDyPI(max_iter=20000, model_subset=range(0, 3),
                        normalize_columns=True, threshold=1e-06, tol=1e-08))

here is a minimal working example:

import numpy as np
import matplotlib.pyplot as plt
import pysindy as ps
import gurobipy
from scipy.integrate import odeint
from scipy.integrate import solve_ivp

# Define the equations
def equations2(y, t, kt, kd, n, tr, tD):
    m, x = y
    dm_dt = kt + kt*(t**n / (tr**n + t**n)) - kd * m
    #dm_dt = kt  - kd * m
    dx_dt = 1
    return [dm_dt, dx_dt]
def m(t):
    kt, kd, tD, tr = 1.26, 0.126, 70, 27
    m = []
    for i in range(len(t)):
        if 0 <= t[i] < tr:
            m.append((kt/kd)*(1 - np.exp(-kd*(tD - tr)) / (2 - np.exp(-kd*tD)) * np.exp(-kd*t[i])))
        else:
            m.append((kt/kd)*(2 - (1 + np.exp(-kd*tD) / (2 - np.exp(-kd*tD))) * np.exp(-kd*(t[i] - tr))))
    return m

# Define the parameters and initial conditions
kt, kd, tD, tr = 1.26, 0.126, 70, 27

n = 3

#m0 = (kt / kd) * (1 - (np.exp(-kd * (tD - tr))) / (2 - np.exp(-kd * tD)))
m0 = 10
x0 = 0.0
y0 = [m0, x0]

# Define the time span
t = np.linspace(0, 70, 140)

mrna = m(t)

# Solve the equations
sol = odeint(equations2, y0, t, args=(kt, kd, n, tr, tD))
m = sol[:, 0]
x = sol[:, 1]

funcs = [lambda t: t > tr]
lib = ps.GeneralizedLibrary(
    libraries = [ps.PolynomialLibrary(degree=1), ps.CustomLibrary(library_functions=funcs)],
    inputs_per_library = np.array([[0,0],[1,1]])
)
sindy_opt = ps.SINDyPI(
    threshold=1e-6,
    tol=1e-8,
    thresholder="l1",
    max_iter=20000,
    normalize_columns=True,   
)
model = ps.SINDy(optimizer=sindy_opt,feature_library=lib,feature_names=["m","t"])
model.fit(x=m, t=t, u=t)
Jacob-Stevens-Haas commented 1 year ago

As I said, try it without SINDyPI. Try simulating it with the exact equations, rather than the parametrized approximation. In MWE, remove all the variables you're not using, e.g. x, mrna.

penguinaugustus commented 1 year ago

Problem solved. Thank you for your help!