SciML / DiffEqBayes.jl

Extension functionality which uses Stan.jl, DynamicHMC.jl, and Turing.jl to estimate the parameters of differential equations and perform Bayesian probabilistic scientific machine learning.
https://docs.sciml.ai/DiffEqBayes/stable/

DynamicHMC.jl backend #16

Closed · ChrisRackauckas closed this 6 years ago

ChrisRackauckas commented 6 years ago

@tpapp

A simple test problem is:

using DiffEqBayes, OrdinaryDiffEq, ParameterizedFunctions, RecursiveArrayTools
using Distributions # for the Normal prior

println("One parameter case")
# `a => 1.5` marks `a` as the parameter to estimate (true value 1.5);
# `=` fixes b, c, d as constants.
f1 = @ode_def_nohes LotkaVolterraTest1 begin
  dx = a*x - b*x*y
  dy = -c*y + d*x*y
end a=>1.5 b=1.0 c=3.0 d=1.0
u0 = [1.0,1.0]
tspan = (0.0,10.0)
prob1 = ODEProblem(f1,u0,tspan)

# Generate data: solve at the true parameters, sample at 10 timepoints,
# and add Gaussian noise with std 0.01
sol = solve(prob1,Tsit5())
t = collect(linspace(1,10,10))
randomized = VectorOfArray([(sol(t[i]) + .01randn(2)) for i in 1:length(t)])
data = convert(Array,randomized)
priors = [Normal(1.5,1)]

It's the Lotka-Volterra equation. The lines at the bottom build data as a matrix generated from the values of the model when a = 1.5. The goal is to recover that parameter. To make a problem with new parameters and solve it, you can use:

tmp_prob = problem_new_parameters(prob1,p)  # prob1 defined above
sol = solve(tmp_prob,Tsit5())

where p is the vector of new parameters. That's all that's needed to make new solutions and check against data. Let's get a quick prototype script together before "packaging" it.
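To make the "check against data" step concrete, a likelihood could look roughly like this (a minimal sketch; the loglik name and the isotropic Gaussian noise model are illustrative assumptions, reusing prob1, t, and data from the script above):

using Distributions

# Sketch: Gaussian log-likelihood of `data` given a candidate parameter
# vector p; assumes iid noise with std 0.01, matching the data generation.
function loglik(p)
    tmp_prob = problem_new_parameters(prob1, p)
    sol = solve(tmp_prob, Tsit5())
    sum(logpdf(MvNormal(sol(t[i]), 0.01), data[:, i]) for i in 1:length(t))
end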

finmod commented 6 years ago

Please, make this exercise meaningful by estimating all four parameters simultaneously instead of this trivial case.

tpapp commented 6 years ago

See this gist: https://gist.github.com/tpapp/019ccbd450c996410a0d0ab1514ab16c

Everything works, but if I start further away from the mode, stepsize adaptation can stop working, because those parameters give wildly different likelihoods. In a real-life setting I would maximize the posterior first for a good starting point (if that makes sense).

ChrisRackauckas commented 6 years ago

Please, make this exercise meaningful by estimating all four parameters simultaneously instead of this trivial case.

No, this is supposed to be a quick test case for CI, not a real benchmark.

tpapp commented 6 years ago

@ChrisRackauckas: I updated the gist a bit, finding the mode with Optim, which I think Stan also does for initial adaptation. Now everything works fine, almost automatically.

Let me know what you think. I think the approach is viable, but it would be interesting to see a more complex model. My intuition is that since, for ODEs, different parameters can lead to wildly different solutions in a nonlinear fashion, the posterior will be highly peaked unless noise is added. These sharp posterior modes, together with equally sharp false modes, pose a challenge for the algorithm. Perhaps for a real-life problem I would initialize the chain with random points. But I am not experienced with ODEs.
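For reference, the mode-finding step is roughly this (a sketch assuming the illustrative loglik and priors from the earlier comments; the initial guess [1.0] is arbitrary):

using Optim

# Maximize the log-posterior by minimizing its negative, then use the
# minimizer as the chain's starting point.
neg_logposterior(p) = -(loglik(p) + logpdf(priors[1], p[1]))
opt = optimize(neg_logposterior, [1.0], NelderMead())
p_mode = Optim.minimizer(opt)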

ChrisRackauckas commented 6 years ago

Cool. I'll find a way to make this generic and package it. For a somewhat harder problem, try it with

# All four parameters now use `=>`, marking each one for estimation
f1 = @ode_def_nohes LotkaVolterraTest1 begin
  dx = a*x - b*x*y
  dy = -c*y + d*x*y
end a=>1.5 b=>1.0 c=>3.0 d=>1.0

i.e. a vector of 4 parameters, so problem_new_parameters should then take in that vector each time.
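For example (a sketch; the values shown are the ones the data was generated with):

p = [1.5, 1.0, 3.0, 1.0]  # a, b, c, d
tmp_prob = problem_new_parameters(prob1, p)
sol = solve(tmp_prob, Tsit5())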

tpapp commented 6 years ago

Are there any restrictions on the four parameters (e.g. nonnegativity), or is the whole ℝ⁴ valid?

Also, looking at the code, it seems that you transform an ODE into a loop with a stepping algorithm, rather than relying on the ODE features of Stan. I am not familiar with ODEs and did not look at @ode_def_nohes in detail, but since DynamicHMC effectively takes a black box that takes a vector and returns a value/gradient as a DiffResult, you have more freedom in organizing your code.
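Roughly, the black-box shape I mean is something like this (a sketch using ForwardDiff and DiffResults; logdensity is a placeholder for e.g. a log-likelihood plus log-prior, and this is not DynamicHMC's literal API):

using ForwardDiff, DiffResults

# Wrap a log-density ℝⁿ → ℝ into a value-and-gradient black box.
function value_and_gradient(logdensity, x::AbstractVector)
    result = DiffResults.GradientResult(x)
    ForwardDiff.gradient!(result, logdensity, x)
    return DiffResults.value(result), DiffResults.gradient(result)
end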

ChrisRackauckas commented 6 years ago

Are there any restrictions on the four parameters (e.g. nonnegativity), or is the whole ℝ⁴ valid?

Oh, I think I forgot to mention that this model only makes sense when these parameters are non-negative. Those kinds of restrictions are model-dependent, of course.

it seems that you transform an ODE into a loop with a stepping algorithm, rather than relying on the ODE features of Stan.

That is correct. Only in the case of Stan do we use their ODE stuff, but that's because we're forced to.

I am not familiar with ODEs and did not look at @ode_def_nohes in detail, but since DynamicHMC effectively takes a black box that takes a vector and returns a value/gradient as a DiffResult, you have more freedom in organizing your code.

Yup, and I'm going to abuse this to my advantage. Your code doesn't even need the macro, for one.

ChrisRackauckas commented 6 years ago

I'll wait until it's all registered to integrate it.

ChrisRackauckas commented 6 years ago

Do you think that making the prior a truncated normal would do better, since a > 0 must hold? That might be part of what's throwing it off.
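i.e. something like this (a sketch; Truncated is the Distributions.jl wrapper for truncating a distribution's support):

using Distributions

# Restrict the prior's support to the nonnegative reals since a > 0
priors = [Truncated(Normal(1.5, 1), 0, Inf)]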

tpapp commented 6 years ago

I did not follow up on the new problem, because adding DiffEqBayes downgrades DataFrames for me, so I would rather not install it now.

I am happy to work on this, but would prefer a setup where we simply have a way of generating values from parameters, without the macrology. My understanding is that this is possible with just the standard tools in the DiffEq ecosystem.

On a related note, I found BlackBoxOptim.jl to be just great for finding a starting point for tuning. Easily worth the extra few seconds it takes.
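Roughly like this (a sketch; the (0, 10) search range per parameter and the four-parameter analogue of the neg_logposterior objective above are illustrative assumptions):

using BlackBoxOptim

# Global derivative-free search for a good starting point for the chain
res = bboptimize(neg_logposterior; SearchRange = (0.0, 10.0), NumDimensions = 4)
p0 = best_candidate(res)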

ChrisRackauckas commented 6 years ago

I am happy to work on this, but would prefer a setup where we simply have a way of generating values from parameters, without the macrology. My understanding is that this is possible with just the standard tools in the DiffEq ecosystem.

Yes, let me give you a test problem without the macros.
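Roughly, it will look like this (a sketch assuming the plain f(du,u,p,t) in-place ODEProblem signature with an explicit parameter vector):

using OrdinaryDiffEq

# Lotka-Volterra without @ode_def: a plain in-place function where p
# carries the parameters directly.
function lotka_volterra!(du, u, p, t)
    a, b, c, d = p
    du[1] =  a*u[1] - b*u[1]*u[2]
    du[2] = -c*u[2] + d*u[1]*u[2]
end

p = [1.5, 1.0, 3.0, 1.0]
prob = ODEProblem(lotka_volterra!, [1.0, 1.0], (0.0, 10.0), p)
sol = solve(prob, Tsit5())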

I did not follow up on the new problem, because adding DiffEqBayes downgrades DataFrames for me, so I would rather not install it now.

Huh. We should look into that then; thanks for alerting me. I don't think DataFrames is in our dependency tree, though, so that's weird. It must be an indirect hit.

On a related note, I found BlackBoxOptim.jl to be just great for finding a starting point for tuning. Easily worth the extra few seconds it takes.

Sweet!

ChrisRackauckas commented 6 years ago

Implemented.