ChrisRackauckas / universal_differential_equations

Repository for the Universal Differential Equations for Scientific Machine Learning paper, describing a computational basis for high performance SciML
https://arxiv.org/abs/2001.04385
MIT License

issue with regression with SInDy #21

Closed matutenun closed 4 years ago

matutenun commented 4 years ago

https://github.com/ChrisRackauckas/universal_differential_equations/blob/dd836890e7e09923a0ae9ed8b6b49d2ee64b2e6a/SEIR_exposure/seir_exposure.jl#L228

Ψ = SInDy(X̂[:, 2:end], L̂[2:end], basis, thresholds, opt = opt, maxiter = 10000, normalize = true, denoise = true) # Succeed

The basis I get is different from the one I am supposed to obtain, and I am not sure why. If I run print_equations(Ψ) and println(parameters(Ψ)), I get:

1 dimensional basis in ["u₁", "u₂", "u₃"]
f_1 = p₁ * u₂ + u₁ * u₂ * p₂ + u₁ ^ 2 * u₂ * p₃
with p₁, p₂, p₃ = 0.021474172006328535, 0.023860980751708793, 0.026513078226048998

And I am supposed to get: (u₂ * 0.3065241445838003 + u₁ * 0.0011560597253354426)

ChrisRackauckas commented 4 years ago

Closing this as just a versioning issue. Since the real term is not in the dictionary (it's something to the 1126 power or something like that, see the equations), it's not well defined which term you will recover, even as data -> infinity. This is unlike the Lotka-Volterra example, where the true term is in the dictionary, so you should recover it once there is enough data. Here, you get a sparse approximating model. We noticed that after SInDy was updated to a more optimized optimizer, the recovered term changed a little, from the linear one to one with some nonlinear interactions (the two you show here). Both extrapolate fine, so I'm sure you can tune the parameters of SInDy to pull out slightly different models (but if you raise the threshold too high you get something not sparse of course).

So I wouldn't say this is a versioning issue but rather a modeling question: the sparse fits are not unique here, yet the different forms both extrapolate well. I don't know how to highlight this well, but hopefully it's clear from playing around with it.
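
To make the non-uniqueness point concrete, here is a minimal sketch of sequentially thresholded least squares in plain Julia. It is a stand-in for illustration only, not the SInDy implementation from DataDrivenDiffEq. The target `x .^ 0.5`, the monomial dictionary `Θ`, the helper name `stlsq`, and the threshold values are all assumptions, chosen so that the true term is deliberately absent from the dictionary.

```julia
using LinearAlgebra

# Sequentially thresholded least squares (illustrative helper, not the library routine):
# fit, zero out coefficients smaller than λ in magnitude, refit on the survivors, repeat.
function stlsq(Θ, y, λ; maxiter = 10)
    ξ = Θ \ y                      # initial dense least-squares fit
    for _ in 1:maxiter
        small = abs.(ξ) .< λ       # coefficients below the threshold
        ξ[small] .= 0
        big = .!small
        any(big) || break          # nothing survived the threshold
        ξ[big] = Θ[:, big] \ y     # refit using only the surviving columns
    end
    return ξ
end

# Hypothetical data: the target is a fractional power, which is not in the dictionary.
x = collect(range(0.1, 2.0, length = 50))
y = x .^ 0.5
Θ = [x x .^ 2 x .^ 3]              # dictionary columns: x, x², x³

# Different thresholds can select different approximating models.
for λ in (0.0, 0.05, 0.2)
    ξ = stlsq(Θ, y, λ)
    println("λ = ", λ, "  coefficients = ", ξ, "  residual = ", norm(Θ * ξ - y))
end
```

Printing the coefficients next to the residual shows which dictionary terms survive at each threshold and how well each resulting model fits the same data.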

matutenun commented 4 years ago

Thanks for your answer, Chris. If I understand correctly, both bases should extrapolate in a similar way... but that does not seem to be the case here. The one I got straight from running the example does not seem to do a good job. But! I will double check everything now that I have updated all the packages. I was having issues with line 235, z = Ψ([S/N,I,D/N]), which fails with the error "MethodError: objects of type SparseIdentificationResult are not callable", so I had to code the function by hand as F([S/N,I,D/N]) = p₁ * u₂ + u₁ * u₂ * p₂ + u₁ ^ 2 * u₂ * p₃. So I could be, and probably am, wrong. Will follow up here. Thanks!
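
For reference, a hand-coded version of the recovered term might look like the sketch below. The coefficients and both expressions are the ones quoted earlier in this thread. The function names `F` and `F_expected` and the state values `S₀`, `I₀`, `D₀`, `N₀` are hypothetical, introduced only so the snippet runs on its own.

```julia
# Coefficients p₁, p₂, p₃ reported for the three-term model in the first comment.
p = (0.021474172006328535, 0.023860980751708793, 0.026513078226048998)

# Hand-written f_1 with u = [u₁, u₂, u₃]; u₃ does not appear in this term.
F(u, p) = p[1] * u[2] + p[2] * u[1] * u[2] + p[3] * u[1]^2 * u[2]

# The linear term the example was expected to return.
F_expected(u) = 0.3065241445838003 * u[2] + 0.0011560597253354426 * u[1]

# Hypothetical state values, for illustration only (not taken from the SEIR example).
S₀, I₀, D₀, N₀ = 0.9, 0.05, 0.01, 1.0
u = [S₀ / N₀, I₀, D₀ / N₀]

println("three-term model: ", F(u, p))
println("linear model:     ", F_expected(u))
```

Evaluating both forms along the actual trajectory from the example, rather than at a single made-up state, is one way to check whether they agree over the range that matters.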

ChrisRackauckas commented 4 years ago

Interesting. I'd say file an issue on DataDrivenDiffEq if something seems up. We should make that have integration tests with these examples.

AlCap23 commented 4 years ago

I'll have a look asap, so possibly within the next week. Definitely will update the example with a PR to match the interface.