pymc-devs / pymc2

THIS IS THE **OLD** PYMC PROJECT (VERSION 2). PLEASE USE PYMC INSTEAD:
http://pymc-devs.github.com/pymc/
Other
879 stars 229 forks source link

instantiating stochastic nodes directly to build HMMs in PyMC #72

Open yarden opened 8 years ago

yarden commented 8 years ago

I'm writing a wrapper around pymc for dynamic models (such as HMMs, but also more complex hierarchical dynamic models.) One simple strategy is to unroll dynamic models over a time interval into a single graphical model. I tried to unroll the simple discrete 'umbrella-rain' HMM in pymc as follows:

import pymc

def umbrella_logp(value, rain):
    """
    Umbrella node.

    P(umbrella=True | rain=True)  = 0.9
    P(umbrella=True | rain=False) = 0.2
    """
    p_umb_given_rain = 0.9
    p_umb_given_no_rain = 0.2
    if rain:
        logp = pymc.bernoulli_like(rain, p_umb_given_rain)
    else:
        logp = pymc.bernoulli_like(rain, p_umb_given_no_rain)
    return logp

def umbrella_random(rain):
    return (np.random.rand() <= np.exp(umbrella_logp(True, rain)))

def rain_logp(value, rain):
    """
    P(rain=True @ t | rain=True @ t-1) = 0.8
    p(rain=True @ t | rain=False @ t-1) = 0.35
    """
    if rain:
        logp = pymc.bernoulli_like(rain, 0.8)
    else:
        logp = pymc.bernoulli_like(rain, 0.35)
    return logp

def rain_random(rain):
    return (np.random.rand() <= np.exp(rain_logp(True, rain)))

num_steps = 10
# prior probability of rain
p_rain = 0.5
# make all the time steps
from collections import OrderedDict
variables = OrderedDict()
for n in range(num_steps):
    if n == 0:
        # Rain node at time t = 0
        rain = pymc.Bernoulli("rain_%d" %(n), p_rain)
    else:
        rain = pymc.Stochastic(logp=rain_logp,
                               name="rain_%d" %(n),
                               doc="rain_%d" %(n),
                               parents={"rain":
                                        variables["rain_%d" %(n-1)]},
                               random=rain_random)
    # For now, I have to declare that umbrella is observed, as well as its observed value, 
    # here -- as part of model declaration (https://github.com/pymc-devs/pymc/issues/10)
    umbrella = pymc.Stochastic(logp=umbrella_logp,
                               name="umbrella_%d" %(n),
                               doc="umbrella %d" %(n),
                               parents={"rain": rain},
                               random=umbrella_random,
                               observed=True,
                               value=True)
    variables["rain_%d" %(n)] = rain
    variables["umbrella_%d" %(n)] = umbrella

# run inference
all_vars = variables.values()
model = pymc.Model(all_vars)
m = pymc.MCMC(model)
m.sample(iter=1000)
print "Mean value of rain @ t=8"
print np.mean(m.trace("rain_8")[:])

It's giving strange answers (non-probabilities). Is this because of the way Stochastic nodes are directly instantiated? Also, is there a way to condense make this more compact, either in pymc2 or perhaps by upgrading to pymc3? (I use Python 2.7 so would prefer not to upgrade, unless there's a significant win for these types of operations.) Thanks very much.