michaelosthege closed 6 years ago
ODEs are difficult to initialize. You might want to set init='advi' to compare against the behavior under pymc3.2.
Well spotted! This indeed solves the problem. Now it's even slightly faster at 8.09 it/s! Thank you very much!
Just a note on that performance difference: the number of samples per second is not a good measure of sampler performance. What we are interested in is effective samples per second, and that might (or might not) be better with the adapted mass matrix than with the one provided by advi.
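To make the distinction concrete, here is a minimal sketch of an effective-samples-per-second estimate in plain Python. The AR(1) chains, the timings and the truncation rule are assumptions made up for illustration. They are not measurements from this issue, and this crude estimator is not the one pymc3 uses.

```python
import random


def autocorr(chain, lag):
    """Sample autocorrelation of a chain at the given lag."""
    n = len(chain)
    mean = sum(chain) / n
    var = sum((x - mean) ** 2 for x in chain) / n
    cov = sum((chain[i] - mean) * (chain[i + lag] - mean)
              for i in range(n - lag)) / n
    return cov / var


def effective_sample_size(chain, max_lag=200):
    """Crude ESS: n / (1 + 2 * sum(rho)), truncated at the first non-positive rho."""
    n = len(chain)
    rho_sum = 0.0
    for lag in range(1, min(max_lag, n - 1)):
        rho = autocorr(chain, lag)
        if rho <= 0.0:
            break
        rho_sum += rho
    return n / (1.0 + 2.0 * rho_sum)


def ar1_chain(n, phi, seed):
    """Hypothetical stand-in for sampler output: AR(1) draws with correlation phi."""
    rng = random.Random(seed)
    x = 0.0
    draws = []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, 1.0)
        draws.append(x)
    return draws


def effective_samples_per_second(chain, wall_seconds):
    return effective_sample_size(chain) / wall_seconds


if __name__ == "__main__":
    # Made-up scenario: sampler A runs at 8 it/s but mixes poorly,
    # sampler B runs at 4 it/s but mixes well.
    chain_a = ar1_chain(2000, phi=0.95, seed=1)
    chain_b = ar1_chain(2000, phi=0.30, seed=2)
    print("A:", effective_samples_per_second(chain_a, 2000 / 8.0))
    print("B:", effective_samples_per_second(chain_b, 2000 / 4.0))
```

In this made-up scenario the sampler with the lower it/s comes out ahead once autocorrelation is taken into account, which is why raw it/s alone does not settle the comparison.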
About your model: I didn't go through it in detail, but it looks like you implemented the gradient of the ODE by moving the Runge-Kutta integrator into Theano. I'm not sure if this is the best way to do this. I've only ever done this in the context of PDEs, but it is probably better to solve the adjoint ODE instead of differentiating through the original one to compute the gradient. That way you can even use any ODE solver you like. Here is a short intro.
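As a rough illustration of the adjoint idea, here is a sketch for a toy scalar ODE, dy/dt = -theta * y, with loss 0.5 * (y(T) - target)^2. The toy problem, all names and the step count are assumptions chosen so that the result can be checked against a closed form. It is not the Lorenz model from the gist. The adjoint lambda(t) = dL/dy(t) satisfies d(lambda)/dt = -lambda * df/dy with lambda(T) = y(T) - target, and the gradient is the integral of lambda * df/dtheta over [0, T].

```python
import math


def rk4(rhs, state, duration, n_steps):
    """Classic RK4 for an autonomous system d(state)/dt = rhs(state)."""
    h = duration / n_steps
    s = list(state)
    for _ in range(n_steps):
        k1 = rhs(s)
        k2 = rhs([x + 0.5 * h * k for x, k in zip(s, k1)])
        k3 = rhs([x + 0.5 * h * k for x, k in zip(s, k2)])
        k4 = rhs([x + h * k for x, k in zip(s, k3)])
        s = [x + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
             for x, a, b, c, d in zip(s, k1, k2, k3, k4)]
    return s


def loss_and_gradient(theta, y0=2.0, target=0.5, T=1.0, n_steps=200):
    # Forward solve of dy/dt = f(y, theta) = -theta * y.
    (y_T,) = rk4(lambda s: [-theta * s[0]], [y0], T, n_steps)
    loss = 0.5 * (y_T - target) ** 2

    # Backward solve in reversed time s = T - t of the augmented system
    # (y, lambda, g), where g accumulates dL/dtheta.
    def reversed_rhs(state):
        y, lam, _ = state
        return [
            theta * y,      # dy/ds      = -f(y, theta)
            -theta * lam,   # dlambda/ds = lambda * df/dy
            -lam * y,       # dg/ds      = lambda * df/dtheta
        ]

    _, _, grad = rk4(reversed_rhs, [y_T, y_T - target, 0.0], T, n_steps)
    return loss, grad


if __name__ == "__main__":
    theta = 1.5
    loss, grad = loss_and_gradient(theta)
    y_T = 2.0 * math.exp(-theta)
    print(grad, (y_T - 0.5) * (-y_T))  # adjoint vs. closed-form gradient
```

The solver only ever sees an ordinary initial value problem, so any integrator could be substituted for the hand-written RK4. Re-integrating y backward, as done here, can be numerically unstable for strongly dissipative or chaotic systems; storing the forward trajectory is the usual alternative.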
You are using uniform priors on some parameters. In many cases this makes life harder for the sampler, as the geometry of the posterior can get strange at the boundaries. Unless you need hard limits, something like a normal is better most of the time.
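One way to see why boundaries hurt, assuming the sampler maps a bounded uniform variable to an unconstrained one with a logit-style interval transform (an assumption about sampler internals, not something stated in this thread): values close to a bound end up very far out on the unconstrained axis. The bounds and test values below are arbitrary.

```python
import math


def to_unconstrained(x, lower, upper):
    """Logit-style interval transform: maps (lower, upper) onto the real line."""
    p = (x - lower) / (upper - lower)
    return math.log(p / (1.0 - p))


def to_constrained(z, lower, upper):
    """Inverse transform: maps the real line back into (lower, upper)."""
    return lower + (upper - lower) / (1.0 + math.exp(-z))


if __name__ == "__main__":
    lower, upper = 0.0, 10.0
    for x in (5.0, 9.0, 9.9, 9.999):
        # The closer x is to the bound, the further z runs off towards infinity.
        print(x, to_unconstrained(x, lower, upper))
```

If the posterior piles up against a bound, the sampler has to explore a long, stretched tail in the unconstrained space, whereas a normal prior needs no such transform.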
Thank you for the suggestions - I will have a look at it.
I noticed the issue because the NUTS-sampling in my sampler-comparison workflow suddenly took ages. (It slowed down by almost 100x)
The workflow uses different samplers on the same model, checks convergence and in the end it will plot the effective sampling rate.
With this particular model (not the Lorenz attractor) I expect NUTS to be the fastest. However, DEMetropolis is remarkably fast (N_effective/h) too.
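The convergence check in such a workflow could look roughly like the following Gelman-Rubin (R-hat) sketch. This is a generic textbook version in plain Python. The function names and the synthetic chains are assumptions for illustration, not the code of the workflow described above.

```python
import random


def sample_variance(xs):
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / (len(xs) - 1)


def gelman_rubin(chains):
    """Basic R-hat for equally long chains of a single scalar parameter."""
    m = len(chains)
    n = len(chains[0])
    chain_means = [sum(c) / n for c in chains]
    within = sum(sample_variance(c) for c in chains) / m   # W
    between = n * sample_variance(chain_means)              # B
    var_hat = (n - 1) / n * within + between / n
    return (var_hat / within) ** 0.5


def normal_chain(n, mean, seed):
    """Hypothetical stand-in for one sampler chain: independent normal draws."""
    rng = random.Random(seed)
    return [rng.gauss(mean, 1.0) for _ in range(n)]


if __name__ == "__main__":
    agreeing = [normal_chain(1000, 0.0, seed) for seed in range(4)]
    disagreeing = [normal_chain(1000, mu, seed)
                   for seed, mu in enumerate((0.0, 0.0, 3.0, 3.0))]
    print(gelman_rubin(agreeing))      # close to 1 when the chains agree
    print(gelman_rubin(disagreeing))   # clearly above 1 when they do not
```

Only runs that pass such a check should enter the N_effective/h comparison, since an effective sample size from non-converged chains is not meaningful.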
Description
I recently noticed that NUTS performance became terrible on my ODE models. Here is an example with the Lorenz attractor, implemented such that NUTS can be used to sample the initial conditions (x, y, z) and parameters (a, b, c) of the model.
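For readers who do not want to open the gist, the forward model is conceptually the following: a Lorenz system integrated with a fixed-step Runge-Kutta scheme. This is a plain-Python sketch, not the Theano implementation from the gist. The mapping of (a, b, c) onto the usual Lorenz parameters (sigma, rho, beta), the step size and the default values are assumptions.

```python
import math


def lorenz_rhs(state, a, b, c):
    """Lorenz system, assuming a = sigma, b = rho, c = beta."""
    x, y, z = state
    return [a * (y - x), x * (b - z) - y, x * y - c * z]


def rk4_step(state, h, a, b, c):
    """One classic fourth-order Runge-Kutta step of size h."""
    k1 = lorenz_rhs(state, a, b, c)
    k2 = lorenz_rhs([s + 0.5 * h * k for s, k in zip(state, k1)], a, b, c)
    k3 = lorenz_rhs([s + 0.5 * h * k for s, k in zip(state, k2)], a, b, c)
    k4 = lorenz_rhs([s + h * k for s, k in zip(state, k3)], a, b, c)
    return [s + h / 6.0 * (p + 2.0 * q + 2.0 * r + w)
            for s, p, q, r, w in zip(state, k1, k2, k3, k4)]


def simulate(initial, params, h=0.01, n_steps=1000):
    """Return the trajectory [(x, y, z), ...] including the initial state."""
    a, b, c = params
    trajectory = [list(initial)]
    for _ in range(n_steps):
        trajectory.append(rk4_step(trajectory[-1], h, a, b, c))
    return trajectory


if __name__ == "__main__":
    traj = simulate((1.0, 1.0, 1.0), (10.0, 28.0, 8.0 / 3.0))
    print(traj[-1])
```

In the Bayesian setting, the initial state and the parameters become random variables and the trajectory is compared with observations in the likelihood, so NUTS needs the gradient of every integration step with respect to all six quantities.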
Example code: https://gist.github.com/michaelosthege/a75b565d3f653721fa235a07eb089912
Traceback
Here I ran
In between I Ctrl+C-ed the first run...
Versions
If you scroll to the right, you can see the performance indicators;
Both of these are in the same freshly installed environment. Python 3.5 on Windows. On my Linux machines I get the same slowdown, but did not bother to test the version specifically. (They also ran fine, just a while ago.)
Analysis
With the Visual Studio performance profiler, I can see, that within _Tree._build_subtree a lot of time is spent in pymc3.model.ValueGradFunction.__call__/.../theano.scan_module.scan_perform.perform. I assume this is not the tt.scan that I use for ODE-solving, because if I manually assign step=pymc3.Slice() it will still use tt.scan in the forward pass, but the performance is stable at 1.64 it/s.
I am aware that an ODE system is quite an exotic model for pymc3, but I get the feeling that the performance drawbacks I see here are merely an amplification of performance issues that others have observed (eg. https://github.com/pymc-devs/pymc3/issues/2723#issuecomment-345982298)