pymc-devs / pymc

Bayesian Modeling and Probabilistic Programming in Python
https://docs.pymc.io/

Stochastic variable is not being sampled properly in pymc3 #981

Closed: junpenglao closed this issue 8 years ago

junpenglao commented 8 years ago

Hi there, I am fairly new to Bayesian computation and pymc3, so this is probably a very naive question. I am trying to adapt pymc2 code to pymc3. It is the latent-mixture model from Lee and Wagenmakers' Bayesian Cognitive Modeling - A Practical Course, page 93 (Chapter 6, example 7). While the pymc2 code runs without any problem, there is a problem with the pymc3 model: one of the stochastic variables is not being sampled properly...

pymc2 code:

import numpy as np

k = np.array([26,23,39,34,28,23,36,32,29,31,40,23,40,39,25,40,34,29,38,37,33,30,37,31,40
,34,38,23,39,19,35,26,38,18,31,37,40,34,33,40,36,35,21,37,23,40,39,37,36,34
,31,32,34,30,40,24,33,40,40,23,39,22,23,32,37,40,25,34,40,27,35,32,40,36,40
,29,30,35,39,28,27,40,32,40,25,30,26,37,17,38,30,33,36,33,27,38,34,40,20,40
,25,38,20,37,32,40,21,40,34,40,31,37,27,34,36,36,21,26])
t1 = np.array([1,0,1,0,0,0,1,1,0,0,1,0,1,1,0,1,0,0,1,1,0,0,
               1,0,1,0,1,0,1,0,1,0,1,0,0,1,1,0,0,1,1,1,0,0,
               0,1,0,1,1,0,0,1,0,1,1,0,0,1,1,0,1,0,1,0,0,1,
               0,0,1,0,1,0,1,0,1,0,1,0,1,0,0,1,0,1,0,1,0,1,
               0,1,1,0,1,0,0,1,0,1,0,1,0,1,0,1,0,1,0,1,0,1,
               0,1,0,1,0,1,0,1])

p = len(k) # number of people
n = 40     # number of questions

import pymc

mub = pymc.Beta("mub",alpha=1,beta=1)
mud = pymc.HalfNormal("mud",tau=1/.5**2)
lambdab = pymc.Uniform("lambdab",lower=5,upper=50)
lambdac = pymc.Uniform("lambdac",lower=5,upper=50)
psi = pymc.Beta("psi",alpha=5,beta=5)

zi = pymc.Bernoulli("zi",p=psi,size=p)

@pymc.deterministic
def muc(mub=mub,mud=mud):
    return pymc.invlogit(pymc.logit(mub)+mud)

theta1 = pymc.Beta("theta1",alpha=mub*lambdab,beta=(1-mub)*lambdab)
theta2 = pymc.Beta("theta2",alpha=muc*lambdac,beta=(1-muc)*lambdac)
# theta = theta1*(zi==0)+theta2*(zi==1)
def thetafunc(theta1=theta1,theta2=theta2,zi=zi):
    return theta1*(zi==0)+theta2*(zi==1)
theta = pymc.Deterministic(eval=thetafunc,
                           name='theta',
                           parents={'theta1': theta1,
                                    'theta2': theta2,
                                    'zi': zi},
                           doc='Theta',
                           trace=False,
                           verbose=0,
                           dtype=float,
                           plot=False)

kij = pymc.Binomial("kij",p=theta, n=n, value=k, observed = True)

testmodel = pymc.Model([theta1,theta2,psi,zi,kij])
mcmc = pymc.MCMC(testmodel)
mcmc.sample(iter = 20000, burn = 50, thin = 2)
pymc.Matplot.plot(mcmc)

pymc3 code:

import numpy as np
import theano.tensor as T
import matplotlib.pyplot as plt
import pymc3 as pm

with pm.Model() as model7:
    # prior
    mub = pm.Beta("mub",alpha=1,beta=1)
    mud = pm.HalfNormal("mud",tau=1/.5**2)
    lambdab = pm.Uniform("lambdab",lower=5,upper=50)
    lambdac = pm.Uniform("lambdac",lower=5,upper=50)
    psi = pm.Beta("psi",alpha=5,beta=5)
    # psi = pm.Uniform("psi",lower=0,upper=1)
    zi = pm.Bernoulli("zi",p=psi,shape=p)

    # muc = invlogit(logit(mub) + mud), written out with theano ops
    muc = pm.Deterministic("muc",1/(1+T.exp(T.log(1/mub-1)-mud)))
    theta1 = pm.Beta("theta1",alpha=mub*lambdab,beta=(1-mub)*lambdab)
    theta2 = pm.Beta("theta2",alpha=muc*lambdac,beta=(1-muc)*lambdac)

    theta = pm.Deterministic("theta",theta1*(T.eq(zi,0))+theta2*(T.eq(zi,1)))
    # theta = theta1*(T.eq(zi,0))+theta2*(T.eq(zi,1))
    # observed
    kij = pm.Binomial("kij",p=theta, n=n, observed = k)

    start = pm.find_MAP()
    start['zi']=np.random.binomial(1,.5,p)

#     step1 = pm.Metropolis(zi)
#     step2 = pm.NUTS()
#     trace7=pm.sample(1e4, start = start, step = [step1,step2], model=model7)

    step = pm.Metropolis()
    trace7=pm.sample(1e4, start = start, step = step, model=model7)

    # step1=pm.NUTS(psi)
    # step2=pm.Metropolis([theta1,theta2])
    # trace7=pm.sample(1e4, step=[step1,step2], model=model7)

burnin=0
pm.traceplot(trace7[burnin:],varnames=['psi','theta1','theta2']);
plt.show()

(I tried different sampling methods but the result is the same.)

The problem seems to be that the grouping variable vector "zi" is not updated during sampling: it stays stuck at its initial value and never moves.
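
A quick check of the symptom (a sketch, assuming the trace7 from the run above): count how often any element of zi changes between successive draws.

zi_samples = trace7['zi']                        # draws of the binary vector, shape (n_samples, p)
flips = (zi_samples[1:] != zi_samples[:-1]).any(axis=1)
print(flips.mean())                              # stays at 0.0 if zi never moves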

You can have a look at the output from the notebook here

twiecki commented 8 years ago

The issue is probably the lack of Gibbs updating of vector-valued variables. Thus, a jump is only accepted if all binary values together produce a good logp. This PR might be helpful: https://github.com/pymc-devs/pymc3/pull/799

So you can try: pip install git+https://github.com/pymc-devs/pymc3@gibbs and then do Metropolis(gibbs='random').
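
A sketch of how that would slot into the model above, assuming the branch's Metropolis accepts the gibbs keyword as described:

with model7:
    step = pm.Metropolis(gibbs='random')   # per-variable updates from the linked gibbs branch
    trace7 = pm.sample(1e4, start=start, step=step)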

junpenglao commented 8 years ago

Thank you for your reply @twiecki, however the problem still remains... there are still no jumps in the binary vector.

fonnesbeck commented 8 years ago

You should not be assigning Metropolis to Bernoulli random variables. It's best just to let PyMC3 select step methods for you, unless you have reason to do otherwise. All you need to call is:

trace7 = pm.sample(1e4, start=start)

junpenglao commented 8 years ago

Thanks @fonnesbeck, now the Bernoulli variable does jump. However, it gets stuck in local minima very easily and the model is not converging to the optimal estimate. I am not sure if it's related, but pm.find_MAP() returns zi as all ones. That's why I manually set it to a random start: start['zi'] = np.random.binomial(1, .5, p)

twiecki commented 8 years ago

@junpenglao I just pushed an update to the gibbs branch that adds Gibbs sampling to the BinaryMetropolis sampler. Can you update to that branch and try again using automatic step method assignment?
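
To pick up the branch update, reinstalling from the branch should work, e.g. pip install --upgrade --force-reinstall git+https://github.com/pymc-devs/pymc3@gibbs (the exact pip flags may vary with your setup).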

twiecki commented 8 years ago

Just pushed another update. If you test the gibbs branch now, there should be a new step method assigned ("Assigned BinaryGibbsMetropolis to zi") and you should get better convergence and mixing of the binary assignments.
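
For reference, the explicit equivalent of that automatic assignment would be something along these lines (a sketch, assuming the branch exposes BinaryGibbsMetropolis at the top level and that pm.sample auto-assigns steps to the remaining variables):

with model7:
    step_zi = pm.BinaryGibbsMetropolis([zi])            # elementwise Gibbs updates for the binary vector
    trace7 = pm.sample(1e4, start=start, step=step_zi)  # continuous variables get auto-assigned steps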

junpenglao commented 8 years ago

Yes that works perfectly, thanks so much @twiecki

twiecki commented 8 years ago

@junpenglao have you ported more examples from the Wagenmakers book?

junpenglao commented 8 years ago

@twiecki I have. I am working through his book and am so far at chapter 8. I was planning to upload them to GitHub when I finish. It would be my honor if you guys would like to include them as examples.

springcoil commented 8 years ago

Plus one for this :)

twiecki commented 8 years ago

@junpenglao Definitely, that book got me started with Bayesian stats.

junpenglao commented 8 years ago

Me too! I only wish I had discovered it sooner.

junpenglao commented 8 years ago

@twiecki @springcoil I have ported most of the book in https://github.com/junpenglao/Bayesian-Cognitive-Modeling-in-Pymc3. Some models still need further optimization, but most of them return the same results as in the book.

fonnesbeck commented 8 years ago

This is really nice, thanks!