jmschrei / pomegranate

Fast, flexible and easy to use probabilistic modelling in Python.
http://pomegranate.readthedocs.org/en/latest/
MIT License

[QUESTION] Best practice for initializing HMM language model? #1099

Open ivnle opened 1 month ago

ivnle commented 1 month ago

Thanks for putting together this great package! I'm trying to wrap my head around Pomegranate by training an HMM language model. Based on what I've read in the documentation (https://pomegranate.readthedocs.io/en/latest/tutorials/B_Model_Tutorial_4_Hidden_Markov_Models.html and https://pomegranate.readthedocs.io/en/latest/api.html), I've put together the following implementation: https://colab.research.google.com/drive/1wbG2mvQEpqaMm9poxclIq15_xg5e_wrq?usp=sharing

Can you confirm if this is the correct way to initialize an HMM where all parameters (emission, transition, initial state) are updated?

import torch
from pomegranate.distributions import Categorical
from pomegranate.hmm import DenseHMM

# n_hidden_states, vocab_size and std are defined earlier in the notebook

# 1. each hidden state is a categorical distribution over the vocabulary
ds = []  # list of n_hidden_states Categorical distributions
for _ in range(n_hidden_states):
    d = torch.ones(1, vocab_size)
    noise = torch.randn_like(d) * std
    d = (d + noise).clamp(min=1e-6)  # keep probabilities positive despite the noise
    d = d / d.sum()  # [1, v], sums to 1
    ds.append(Categorical(d))

# 2. define the transition matrix
edges = torch.ones(n_hidden_states, n_hidden_states)
noise = torch.randn_like(edges) * std
edges = (edges + noise).clamp(min=1e-6)
edges = edges / edges.sum(dim=1, keepdim=True)  # [h, h], each row sums to 1

# 3. initial state probabilities
starts = torch.ones(n_hidden_states)
noise = torch.randn_like(starts) * std
starts = (starts + noise).clamp(min=1e-6)
starts = starts / starts.sum()  # [h]

model = DenseHMM(distributions=ds, edges=edges, starts=starts,
                 verbose=True, tol=0.00001, max_iter=10)
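
For reference, here is a minimal sketch of how I then fit the model, assuming `X` is a batch of integer token-id sequences (the batch size and sequence length below are placeholders):

# DenseHMM.fit expects shape (n_sequences, sequence_length, n_dimensions)
X = torch.randint(0, vocab_size, (32, 10, 1))
model.fit(X)

# per-sequence log-likelihood under the trained model
logp = model.log_probability(X)  # shape [32]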
jmschrei commented 1 month ago

That looks correct at first glance. Are you encountering problems?