Closed mamacneil closed 9 years ago
Isn't this a general problem with numpy? I don't know of a clean/fast way to do this in numpy either other than writing a custom ufunc, so I'm not sure if pymc will be able to do any better. You might consider writing a function in cython (http://wiki.cython.org/tutorials/numpy using either the cython specific features or the C API). That's what I've done when I need to do time series stuff.
I think what Aaron is referring to (if I'm not mistaken), is that when you build an array of variables (serially dependent or not), each element is updated independently with its own step method, rather than as a vector, which would be the case for a vector-valued stochastic. John, I was thinking that your step methods that deals with multiple stochastics simultaneously would be the ticket in this case (?).
If that's what he's referring to, then yes, the Hamiltonian step method should help, as it deals well with autocorrelated variables. It's still under development, but it's worked well for me on large problems with high autocorrelations. https://github.com/jsalvatier/gradient_samplers/
There are two sources of slowness: 1) inefficient sampling, 2) all the Python required to solve a deterministic model forward. It would be great to see how the Hamiltonian step method solves 1), Aaron if you try it please do post your findings to the list. For 2), John's suggestion is the way forward. You would make the entire trajectory into a single deterministic, and use Cython, Weave, numexpr, f2py or even ctypes to produce a fast implementation of the function mapping r, k and catch to the trajectory.
I think a recursive decorator like you've suggested is too specialized to go in the main PyMC distribution. However, using tools like the above, you could implement a little library to support fitting population models...
Currently the estimation of (e.g.) time dependent states is super slow, requiring a costly loop to get from a state t-1 to state t (such as the surplus production example on the pymc page). If these could somehow be passed into something faster using a keyword or decorator it would be a big deal for many kinds of ecological (and I suspect econometric) models. Could this be done? Something like: