Closed falakmasir closed 9 years ago
Yes, just use the Multinomial class from pybasicbayes when you create the list of obs_distns. Let me know if that gives you trouble and I can write out an example.
Hi Matt,
I am having difficulty initializing the parameters. I will appreciate it if you would provide an example with either Categorical or Multinomial observations.
Best,
M.
This should run, but it needs the version of pybasicbayes I just pushed (27193e2) for the synthetic data generation to work (the rest should work with older pybasicbayes):
from __future__ import division
import numpy.random as npr
npr.seed(0)
import pyhsmm
from pyhsmm.basic.distributions import Multinomial
from pyhsmm.util.text import progprint_xrange
#############################
# generate synthetic data #
#############################
# replace this section with loading real data
num_states = 5
num_sequences = 3
T = 1000
# hypers set the number of possible outcomes (K) and a Dirichlet prior via a
# number of prior pseudo-observations (alpha_0) which represent a prior dataset
# consisting of alpha_0 observations of each outcome (see Wikipedia on Dirichlet
# distributions)
obs_hyp = dict(
K=10, # 10 possible observations
alpha_0=0.1, # 0.1 'pseudocounts' for each observation
)
truemodel = pyhsmm.models.HMM(
alpha=3., init_state_distn='uniform',
obs_distns=[Multinomial(**obs_hyp) for _ in range(num_states)])
data_sequences = [truemodel.rvs(T) for _ in range(num_sequences)]
##################
# set up model #
##################
Nmax = 20
# use the same hypers as above
obs_hyp = dict(
K=10, # 10 possible observations
alpha_0=0.1, # 0.1 'pseudocounts' for each observation
)
# instantiate a Sticky-HDP-HMM
obs_distns = [Multinomial(**obs_hyp) for state in xrange(Nmax)]
model = pyhsmm.models.WeakLimitStickyHDPHMM(
kappa=50.,alpha=6.,gamma=6.,init_state_concentration=1.,
obs_distns=obs_distns)
# add the data sequences
for data in data_sequences:
model.add_data(data)
################
# inference! #
################
for itr in progprint_xrange(100):
model.resample_model()
By the way, multinomials can be tricky unless you have enough data: if you are in the small N setting (i.e. a small total of the counts per observed count vector) then it's hard to estimate the multinomial parameters well and it's easy to over-segment. That was one of the original motivating examples for the Sticky HDP-HMM (see EB Fox's thesis).
Thanks!
Hi Matt,
Is it possible to run pyhsmm on Multinomial observations?
Best,
M.