Closed: thommevans closed this issue 11 years ago
Hi Tom,
If you put the variables you wish to block update in a list, PyMC will turn this into a container and block-update them. By default, it will probably use AdaptiveMetropolis, but you can specify any step method you wish (e.g. Metropolis) using MCMC's use_step_method() function. Let me know if this does not work.
cf
Thanks for the reply, but I'm still having some difficulty. I'm fairly sure I'm not quite following your instructions correctly but I haven't had much luck working out what it is I'm missing. To be more specific, I'm using the "model factory" method of defining a model, which is the most convenient for my current purposes. So something like:
```python
def my_model():
    a = pymc.Uniform(...)
    b = pymc.Normal(...)
    c = pymc.Normal(...)
    model_pars = [a, b, c]

    @pymc.stochastic(dtype=float, observed=True)
    def target(value=dat, model_pars=model_pars):
        def logp(value, model_pars=model_pars):
            loglikelihood = ...
            return loglikelihood

    return locals()
```
where I define "model_pars" above as the container in the way I understood you were suggesting. Then I define an MCMC sampler with something like:

```python
M = my_model()
MCMC = pymc.MCMC(M)
```
but run into problems when I try to tell the container that I want its values to be sampled with a Metropolis method:

```python
MCMC.use_step_method(pymc.Metropolis, MCMC.model_pars)
```
The error message I get is:
```
ValueError Traceback (most recent call last)
```
Is there a way you could post a version of this model so that I can debug it?
Here is an example that gives the same error I described above. I'm sure there are more concise ways of defining the data likelihood for this particular example, but my actual model is quite a bit more complicated, with a specialised likelihood function that needs to be custom-defined (i.e. "def logp(...)"), which is why I've done it this way here. I should also point out that I've checked out the latest version of the code.
```python
import numpy as np
import pymc

def my_model():
    a0 = 1.0
    b0 = 0.5
    c0 = 0.1
    n = 300
    x = np.linspace(-1, 1, n)
    f = a0*(x**2.) + b0*x + c0
    err = 0.07
    dat = f + err*np.random.randn(n)
    a = pymc.Normal('a', mu=1.1, tau=1./(0.1**2.))
    b = pymc.Uniform('b', lower=0.3, upper=1.5)
    c = pymc.Normal('c', mu=0.2, tau=1./(0.2**2.))
    model_pars = [a, b, c]

    @pymc.stochastic(dtype=float, observed=True)
    def target(value=dat, model_pars=model_pars):
        def logp(value, model_pars=model_pars):
            a = model_pars[0]
            b = model_pars[1]
            c = model_pars[2]
            model_fit = a*(x**2.) + b*x + c
            resids = value - model_fit
            loglikelihood = pymc.distributions.normal_like(resids, 0.0, 1./(err**2.))
            return loglikelihood

    return locals()

m = pymc.MCMC(my_model())
m.use_step_method(pymc.Metropolis, m.model_pars)
```
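For reference, the `normal_like` call above returns the summed Gaussian log-density under the precision (tau) parameterisation. A minimal plain-numpy equivalent, purely for illustration (the function name `normal_loglike` is mine, not PyMC's), looks like:

```python
import numpy as np

def normal_loglike(x, mu, tau):
    """Summed log-density of N(mu, 1/tau) under the precision parameterisation.
    A plain-numpy sketch of what pymc.distributions.normal_like computes."""
    x = np.asarray(x, dtype=float)
    return np.sum(0.5 * np.log(tau / (2.0 * np.pi)) - 0.5 * tau * (x - mu) ** 2)
```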
OK, so it looks like I was wrong -- Metropolis only block-updates vector-valued stochastics. But, I have a hack that seems to do what you want to do. Try using AdaptiveMetropolis, but set the delay variable to be longer than your MCMC run. So, for a run of 100K:
```python
m.use_step_method(pymc.AdaptiveMetropolis, m.model_pars, delay=100001)
```
This appears to update the variances of the proposal, but not the covariances. Here is the state of `proposal_sd` after 20K iterations:

```
'proposal_sd': array([[ 0.0355,  0.    ,  0.    ],
                      [ 0.    ,  0.1064,  0.    ],
                      [ 0.    ,  0.    ,  0.1201]]),
```
Allowing block updating for arbitrary sets of stochastics is something we should definitely allow in PyMC 3.
Thanks - your solution seems to do what I want.
Also, I like how by default the standard Metropolis step method records a tally of the adaptive_scale_factor for each stochastic variable as it is tuned throughout the sampling process (i.e. the 'Metropolis_c_adaptive_scale_factor' etc. traces). The AdaptiveMetropolis step method doesn't seem to do this, though - is there a reason for that, or is it a feature that could easily be added?
And one other thing - I find the description of the "scales" input in Section 5.7.2 "AdaptiveMetropolis class" of the documentation a bit confusing. I understood it as saying that if the proposal covariance matrix is not explicitly provided through the "cov" argument, it will initially be set to a diagonal matrix whose kth entry is (stochastic_k*scale_k)^2. In fact, it gets set to a diagonal matrix whose diagonal is the scales array itself, i.e. the scales input contains the variances of the initial covariance matrix, with zero off-diagonal terms. If I'm right, it might be good to edit the description in the documentation...?
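The two readings can be contrasted in plain numpy (the variable names and example values here are mine, for illustration only, not PyMC's):

```python
import numpy as np

# Hypothetical current values and scales for three scalar stochastics.
values = np.array([1.0, 0.5, 0.1])
scales = np.array([0.04, 0.10, 0.12])

# Reading suggested by the documentation: kth diagonal entry is (stochastic_k * scale_k)**2.
cov_from_docs = np.diag((values * scales) ** 2)

# Behaviour described above: the diagonal entries are the scales themselves,
# i.e. the scales are treated directly as variances.
cov_observed = np.diag(scales)
```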
Just to follow up on this - I actually think the above workaround isn't particularly good, because it doesn't seem to tune the step sizes sensibly. For instance, with the above code, I seem to be getting acceptance rates of <1%. I'm going to look into defining my own StepMethod class instead I think...
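For what it's worth, the core of such a step method (a joint proposal from independent normals, a single accept/reject for the whole block, and proposal scales tuned from the acceptance rate) can be sketched outside PyMC in plain numpy. This is only an illustration of the idea under my own simplifying assumptions, not PyMC's StepMethod interface, and it tunes all the proposal sds by a common factor since only the joint acceptance rate is observable:

```python
import numpy as np

def blocked_metropolis(logp, x0, n_steps=20000, tune_interval=500, seed=0):
    """Block-update all components of x each step with independent normal
    proposals (zero off-diagonal terms); rescale the proposal sds toward
    a roughly 20-50% acceptance rate. A plain-numpy sketch only."""
    rng = np.random.RandomState(seed)
    x = np.asarray(x0, dtype=float)
    sd = np.ones_like(x)                       # per-dimension proposal sds
    lp = logp(x)
    samples, accepted = [], 0
    for i in range(1, n_steps + 1):
        x_prop = x + sd * rng.randn(x.size)    # joint, uncorrelated proposal
        lp_prop = logp(x_prop)
        if np.log(rng.rand()) < lp_prop - lp:  # accept/reject the whole block
            x, lp = x_prop, lp_prop
            accepted += 1
        samples.append(x.copy())
        if i % tune_interval == 0:             # crude acceptance-rate tuning
            rate = accepted / tune_interval
            if rate < 0.2:
                sd *= 0.8
            elif rate > 0.5:
                sd *= 1.2
            accepted = 0
    return np.array(samples), sd

# Toy target: independent zero-mean Gaussians with different scales.
target_sd = np.array([1.0, 0.1, 10.0])
logp = lambda x: -0.5 * np.sum((x / target_sd) ** 2)
samples, tuned_sd = blocked_metropolis(logp, np.zeros(3))
```

One design point this makes visible: with a shared tuning factor, the compromise proposal scale mixes poorly for dimensions whose posterior scales differ by orders of magnitude, which is part of why per-dimension adaptive_scale_factor traces (as in the scalar Metropolis step method) are useful.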
I'd also like to see the scale factors traced in AdaptiveMetropolis, so :+1: to that feature request.
@tomevans - did you ever end up making your own StepMethod class that does this nicely?
Suppose I have a model with 5x Normal and 5x Gamma stochastic variables. Is it possible to block-update all 10x variables at each MCMC step using a multivariate normal proposal distribution, with zero off-diagonal terms? I'd like the diagonal terms of the resulting proposal distribution to be tuned during the sampling, as can be done when you use the Metropolis step method to update variables individually.
I realise that the AdaptiveMetropolis step method does something like this, but it also calculates the covariance between variables based on previous steps and then uses this to create a multivariate normal proposal distribution with non-zero off diagonal terms.
I also realise that I could bundle all of the 5x Normal stochastics into a single array-valued Normal stochastic and use the Metropolis step method, but I'm pretty keen to block update the Gamma stochastics along with the Normal stochastics at each step as well.
I guess the reason I'm after this is that (1) I want to save time: block updating ten stochastics at once means roughly ten times fewer steps overall; and (2) I'm a little hesitant to take correlated steps with AdaptiveMetropolis in high-ish dimensional parameter spaces that I don't have a good feel for at the outset. Unless there's a particularly cautious approach for doing this?
Thanks for any advice