SciML / DataDrivenDiffEq.jl

Data driven modeling and automated discovery of dynamical systems for the SciML Scientific Machine Learning organization
https://docs.sciml.ai/DataDrivenDiffEq/stable/
MIT License

STLSQ method results in different models from pySINDy? #475

Open. xk-y opened this issue 1 year ago

xk-y commented 1 year ago

Hi! I have started using the STLSQ method to create a SINDy-based model. I tried the example here, but removed the noise from the reference data. When I compare the model parameters, I find that the STLSQ method in Julia and in pySINDy gives different results, even though I tried to keep all hyperparameters the same. Do you have any idea why this happens? I would expect pySINDy and DataDrivenDiffEq.jl to give the same result, since the STLSQ algorithm is the same.

Julia code:

using DataDrivenDiffEq
using LinearAlgebra
using OrdinaryDiffEq
using DataDrivenSparse

function pendulum(u, p, t)
    x = u[2]
    y = -9.81sin(u[1]) - 0.3u[2]^3 - 3.0 * cos(u[1]) - 10.0 * exp(-((t - 5.0) / 5.0)^2)
    return [x; y]
end

u0 = [0.99π; -1.0]
tspan = (0.0, 15.0)
prob = ODEProblem(pendulum, u0, tspan)
sol = solve(prob, Tsit5(), saveat = 0.01);

X = sol[:, :]  # noise from the original example intentionally removed; was: X .+ 0.2 .* randn(rng, size(sol))
ts = sol.t;

using JLD2
JLD2.jldsave("data.jld"; X, ts)  # save the data for reading from Python

# Wrap the single trajectory in the NamedTuple layout expected by
# DataDrivenDiffEq.ContinuousDataset (i is unused here, but kept for
# the generic multi-trajectory pattern).
function create_prob(X, ts, i)
    (X = X, t = ts)
end

probnames = Tuple(Symbol.(["prob$i" for i in 1:1]));
probdata = Tuple([create_prob(X, ts, i) for i in 1:1]);
probtuple = NamedTuple{probnames}(probdata);
prob = DataDrivenDiffEq.ContinuousDataset(probtuple);

#prob = ContinuousDataDrivenProblem(X, ts)

@parameters t
@variables u(t)[1:2]

h = Num[polynomial_basis(u, 5);]

basis = Basis(h, u);

λ = 1e-5
opt = STLSQ(λ)
res = solve(prob, basis, opt)

system = get_basis(res)
params = get_parameter_map(system)
println(system) 
println(params)

The model I got from Julia is:

Model ##Basis#465 with 2 equations
States : (u(t))[1] (u(t))[2]
Parameters : 42
Independent variable: t
Equations
Differential(t)((u(t))[1]) = p₁ + p₁₂*((u(t))[2]^2) + p₁₆*((u(t))[2]^3) + p₁₉*((u(t))[2]^4) + p₂*(u(t))[1] + p₂₁*((u(t))[2]^5) + p₃*((u(t))[1]^2) + p₄*((u(t))[1]^3) + p₅*((u(t))[1]^4) + p₆*((u(t))[1]^5) + p₇*(u(t))[2] + p₁₃*((u(t))[2]^2)*(u(t))[1] + p₁₀*((u(t))[1]^3)*(u(t))[2] + p₁₄*((u(t))[1]^2)*((u(t))[2]^2) + p₁₅*((u(t))[1]^3)*((u(t))[2]^2) + p₁₈*((u(t))[1]^2)*((u(t))[2]^3) + p₁₇*((u(t))[2]^3)*(u(t))[1] + p₉*((u(t))[1]^2)*(u(t))[2] + p₁₁*((u(t))[1]^4)*(u(t))[2] + p₂₀*((u(t))[2]^4)*(u(t))[1] + p₈*(u(t))[1]*(u(t))[2]
Differential(t)((u(t))[2]) = p₂₂ + p₂₃*(u(t))[1] + p₂₄*((u(t))[1]^2) + p₂₅*((u(t))[1]^3) + p₂₆*((u(t))[1]^4) + p₂₇*((u(t))[1]^5) + p₂₈*(u(t))[2] + p₃₃*((u(t))[2]^2) + p₃₇*((u(t))[2]^3) + p₄₀*((u(t))[2]^4) + p₄₂*((u(t))[2]^5) + p₂₉*(u(t))[1]*(u(t))[2] + p₃₅*((u(t))[1]^2)*((u(t))[2]^2) + p₃₉*((u(t))[1]^2)*((u(t))[2]^3) + p₃₀*((u(t))[1]^2)*(u(t))[2] + p₃₁*((u(t))[1]^3)*(u(t))[2] + p₃₂*((u(t))[1]^4)*(u(t))[2] + p₃₆*((u(t))[1]^3)*((u(t))[2]^2) + p₃₄*((u(t))[2]^2)*(u(t))[1] + p₃₈*((u(t))[2]^3)*(u(t))[1] + p₄₁*((u(t))[2]^4)*(u(t))[1]

Pair{SymbolicUtils.BasicSymbolic{Real}, Float64}[p₁ => 0.0730131366, p₂ => 0.0834717146, p₃ => 0.0149782394, p₄ => -0.0059705709, p₅ => -0.0017038288, p₆ => -0.0001075822, p₇ => 1.0212420262, p₈ => 0.0228230317, p₉ => 0.0040984396, p₁₀ => 4.16304e-5, p₁₁ => -2.02122e-5, p₁₂ => 0.0037135607, p₁₃ => 0.0034812525, p₁₄ => 0.0006379738, p₁₅ => 1.52776e-5, p₁₆ => 0.0026753519, p₁₇ => 0.0005161771, p₁₈ => 5.01583e-5, p₁₉ => 0.0001611863, p₂₀ => 6.49419e-5, p₂₁ => -4.48301e-5, p₂₂ => -14.5630872419, p₂₃ => -16.616734041, p₂₄ => -2.9673310169, p₂₅ => 1.1849735762, p₂₆ => 0.3376741619, p₂₇ => 0.0213132928, p₂₈ => -3.9165878933, p₂₉ => -4.3162794295, p₃₀ => -0.819936348, p₃₁ => -0.0193343773, p₃₂ => 0.0031036009, p₃₃ => -0.551816404, p₃₄ => -0.5307240406, p₃₅ => -0.1191590027, p₃₆ => -0.0044727192, p₃₇ => -0.5195042192, p₃₈ => -0.0723909027, p₃₉ => -0.0066535181, p₄₀ => -0.0399888287, p₄₁ => -0.0133952208, p₄₂ => 0.0077426642]

Python code:

import numpy as np
import pysindy as ps
import h5py

d = h5py.File("data.jld", "r")  # JLD2 files are HDF5-based, so h5py can read them
X = d["X"][:]
times = d["ts"][:]
dt = 0.01

optimizer = ps.STLSQ(threshold=1e-5)
feature_library = ps.PolynomialLibrary(degree=5)
model = ps.SINDy(feature_names=["x", "y"], optimizer=optimizer, feature_library=feature_library)
model.fit([X], t=dt, multiple_trajectories=True)
model.print(precision=10)

The model I got from Python is:

(x)' = -0.0007809696 1 + -0.0008913440 x + 0.9991483762 y + -0.0001892916 x^2 + -0.0008836917 x y + -0.0002849399 y^2 + -0.0000113477 x^3 + -0.0002544056 x^2 y + -0.0001793808 x y^2 + -0.0000349313 y^3 + -0.0000207300 x^3 y + -0.0000688229 x^2 y^2 + 0.0000031815 x y^3 + -0.0000064892 x^3 y^2
(y)' = -14.5342725185 1 + -16.5809246400 x + -3.9699650420 y + -2.9592877785 x^2 + -4.3170609722 x y + -0.5520409493 y^2 + 1.1846939143 x^3 + -0.8008512434 x^2 y + -0.5160813109 x y^2 + -0.5070188768 y^3 + 0.3374568507 x^4 + -0.0137858781 x^3 y + -0.1128667395 x^2 y^2 + -0.0672355163 x y^3 + -0.0364527735 y^4 + 0.0212981524 x^5 + 0.0035045963 x^4 y + -0.0038914762 x^3 y^2 + -0.0060928204 x^2 y^3 + -0.0128738435 x y^4 + 0.0081038320 y^5

As shown above, the model parameters are not the same.

ChrisRackauckas commented 1 year ago

I'm not sure there's an expectation for them to be exactly the same, since IIRC STLSQ is quite numerically non-robust to floating point differences, so small differences in the factorization algorithm or convex optimization would lead to different rejections. I'm curious where the expectation is derived from.
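
For context, here is a minimal sketch of the sequentially thresholded least-squares loop (illustrative names, not the package API). A coefficient whose magnitude lands near the threshold can fall on either side of it after tiny floating point differences in the least-squares solve, which changes which terms survive the next refit:

using LinearAlgebra

# Θ: n_samples × n_features library matrix, Ẋ: measured derivatives,
# λ: sparsity threshold. Purely illustrative.
function stlsq_sketch(Θ::AbstractMatrix, Ẋ::AbstractMatrix, λ; maxiter = 10)
    Ξ = Θ \ Ẋ                        # initial least-squares fit
    for _ in 1:maxiter
        small = abs.(Ξ) .< λ         # prune coefficients below the threshold
        Ξ[small] .= 0
        for k in axes(Ẋ, 2)          # refit each equation on the surviving terms
            big = .!small[:, k]
            Ξ[big, k] = Θ[:, big] \ Ẋ[:, k]
        end
    end
    return Ξ
end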

ctessum commented 1 year ago

We're just trying to understand how it works. Would a reasonable summary be that if there's a single system of equations that exactly describes the data (like if we're trying to recover the Lotka-Volterra system or something like that), then both implementations should return the same result, but if there are multiple plausible explanations then we'd expect the two implementations to return different results?

ctessum commented 1 year ago

The next question would be: if we want a single implementation to give us all of the different plausible explanations, how would we do that? Just run it multiple times with different noise added each time?
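
One way to sketch that idea, reusing the hypothetical stlsq_sketch from the previous comment (ensemble/bagging variants of SINDy formalize this): refit on perturbed copies of the data and tally how often each library term survives.

# Illustrative only: σ and the number of runs are arbitrary choices.
function ensemble_supports(Θ, Ẋ, λ; runs = 100, σ = 1e-3)
    counts = zeros(Int, size(Θ, 2), size(Ẋ, 2))
    for _ in 1:runs
        Ξ = stlsq_sketch(Θ, Ẋ .+ σ .* randn(size(Ẋ)), λ)
        counts .+= Ξ .!= 0           # record which terms were active this run
    end
    return counts ./ runs            # inclusion frequency per library term
end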

ChrisRackauckas commented 11 months ago

This should be re-written to use a convex optimizer. That would then be more robust. I talked with @Vaibhavdixit02 about doing this.
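
For illustration, one form a convex relaxation could take is an ℓ1-penalized least-squares fit solved by proximal gradient descent (ISTA); this is just a sketch of the general idea, not the planned implementation:

using LinearAlgebra

soft(x, t) = sign(x) * max(abs(x) - t, 0)  # proximal operator of t*|x|

function lasso_ista(Θ, ẋ, λ; iters = 5_000)
    L = opnorm(Θ)^2                  # Lipschitz constant of the gradient
    ξ = zeros(size(Θ, 2))
    for _ in 1:iters
        g = Θ' * (Θ * ξ - ẋ)         # gradient of ½‖Θξ − ẋ‖²
        ξ = soft.(ξ - g / L, λ / L)  # gradient step followed by soft threshold
    end
    return ξ
end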