AmpersandTV / pymc3-hmm

Hidden Markov models in PyMC3

Add a functional test for time-varying transition matrices #67

Closed xjing76 closed 3 years ago

xjing76 commented 3 years ago

Model performance tests for models with time-varying transition matrices.

Transition matrix with regression on seasonality

  1. Zero state
  2. Constant positive s
xjing76 commented 3 years ago

I removed all the additional tests and limited the test to only two constant states (one zero, one positive), with no regressions within each state. Right after generating the simulated data, we split it into training and testing sets for out-of-sample metrics.
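A minimal sketch of that kind of setup, using NumPy only and not the test code from this repo. The series length, the Poisson rate `mu`, the seasonal design matrix `X`, the coefficients `xi_0` and `xi_1`, and the 70/30 split fraction are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

n_obs = 1000   # hypothetical series length
mu = 50.0      # hypothetical Poisson rate of the positive state
t = np.arange(n_obs)

# Hypothetical seasonal design matrix: intercept plus one sine/cosine pair.
X = np.column_stack(
    [np.ones(n_obs), np.sin(2 * np.pi * t / 24), np.cos(2 * np.pi * t / 24)]
)

# Hypothetical regression coefficients, one vector per "from" state.
xi_0 = np.array([-1.0, 1.5, 0.0])  # logit of P(0 -> 1)
xi_1 = np.array([1.0, 1.5, 0.0])   # logit of P(1 -> 1)


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


p_01 = sigmoid(X @ xi_0)
p_11 = sigmoid(X @ xi_1)

# Gammas[t, i, j] = P(S_t = j | S_{t-1} = i); each row sums to 1.
Gammas = np.stack(
    [
        np.stack([1.0 - p_01, p_01], axis=-1),
        np.stack([1.0 - p_11, p_11], axis=-1),
    ],
    axis=1,
)

# Simulate the hidden state sequence, starting from state 0.
states = np.empty(n_obs, dtype=int)
prev = 0
for i in range(n_obs):
    prev = rng.choice(2, p=Gammas[i, prev])
    states[i] = prev

# State 0 always emits zero; state 1 emits Poisson counts with constant rate.
y = np.where(states == 1, rng.poisson(mu, size=n_obs), 0)

# Chronological train/test split for out-of-sample metrics.
split = int(0.7 * n_obs)
y_train, y_test = y[:split], y[split:]
X_train, X_test = X[:split], X[split:]
```

The split is chronological rather than random so that the test period is a genuine forecast horizon.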

I added manual setting of the out-of-sample shape on the DiscreteMarkovChain (V_t), which does not seem to be working.

ValueError: operands could not be broadcast together with remapped shapes [original->remapped]: (4738,2,2) and requested shape (2031,2,2)

Looking in the debugger, the Gammas attribute of DiscreteMarkovChain is not getting resampled, even though we updated the shared variables.

Additionally, the shape of the entire distribution is modified, yet the Gammas themselves are not updated.

ipdb> self
<pymc3_hmm.distributions.DiscreteMarkovChain object at 0x7fa115199ac0>
ipdb> self.shape
(2031,)
ipdb> self.Gammas.shape.tag.test_value
array([4738,    2,    2])
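The error text is the message NumPy produces when an array is broadcast to an incompatible requested shape. The sketch below reproduces it with plain NumPy, using the two lengths from the traceback. It does not claim to show where inside pymc3-hmm the call happens, and the zero-filled arrays are placeholders.

```python
import numpy as np

n_train, n_test = 4738, 2031

# Placeholder for transition matrices still sized for the training period.
Gammas_train = np.zeros((n_train, 2, 2))

try:
    # The leading dimension 4738 is neither 1 nor 2031, so this cannot broadcast.
    np.broadcast_to(Gammas_train, (n_test, 2, 2))
    message = None
except ValueError as err:
    message = str(err)

print(message)

# With matrices built for the test period, the same request succeeds.
Gammas_test = np.zeros((n_test, 2, 2))
ok = np.broadcast_to(Gammas_test, (n_test, 2, 2))
```

This is consistent with the debugger output above: the distribution's shape is already `(2031,)` while the test value of `Gammas` still has a leading dimension of 4738.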
xjing76 commented 3 years ago

It seems to be working pretty well at estimating the transition matrix regression part with the current setup. However, the rate of actually predicting the correct state in the posterior setting is quite low.

I think the transition matrices are often close to symmetric along the diagonal, as in the first example below. But since there are quite a number of points in the series that we are predicting ahead of time, the transition matrix becomes kind of stateless, as the chances of being in either state are pretty much equal.

Otherwise I think it would make more sense to just move to stateless transitions.

ipdb> adds_pois_ppc['V_t'][:, 30].mean()
0.46
ipdb> adds_pois_ppc['Gamma'][:, 30][0]
array([[0.9008662 , 0.0991338 ],
       [0.12714147, 0.87285853]])

ipdb> adds_pois_ppc['Gamma'][:, 150][0]
array([[0.01935663, 0.98064337],
       [0.06198774, 0.93801226]])
ipdb> adds_pois_ppc['V_t'][:, 150].mean()
0.96
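One way to read these numbers: the sampled mean of `V_t` is close to the stationary probability of state 1 implied by the matrix at that time point (about 0.44 and 0.94 for the two matrices above). The sketch below computes that with NumPy. Treating each matrix as if it were constant over time is an assumption made for illustration, since the model's matrices vary with time and each matrix shown is a single draw.

```python
import numpy as np


def stationary_distribution(Gamma):
    """Left eigenvector of Gamma for eigenvalue 1, normalized to sum to 1."""
    eigvals, eigvecs = np.linalg.eig(Gamma.T)
    idx = np.argmin(np.abs(eigvals - 1.0))
    pi = np.real(eigvecs[:, idx])
    return pi / pi.sum()


# The two matrices printed in the debugger session above.
Gamma_30 = np.array([[0.9008662, 0.0991338],
                     [0.12714147, 0.87285853]])
Gamma_150 = np.array([[0.01935663, 0.98064337],
                      [0.06198774, 0.93801226]])

pi_30 = stationary_distribution(Gamma_30)
pi_150 = stationary_distribution(Gamma_150)

# Second entry is the long-run probability of being in state 1.
print(pi_30[1], pi_150[1])
```

When the stationary probability is near 0.5, as for the first matrix, a long-horizon forecast carries little information about which state the chain is in.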
xjing76 commented 3 years ago

I also updated gamma_0 to the very last record from the training set. But I think that actually did not do much, other than giving us a good estimate of the first record in the test series.
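A rough illustration of why the initial state only helps the first few test points: the influence of the starting state on the k-step-ahead state probability shrinks geometrically. The sketch assumes a constant transition matrix (the first one printed above), which is a simplification of the time-varying model.

```python
import numpy as np

# Assumed constant transition matrix, taken from the debugger output above.
Gamma = np.array([[0.9008662, 0.0991338],
                  [0.12714147, 0.87285853]])

start_0 = np.array([1.0, 0.0])  # chain known to start in state 0
start_1 = np.array([0.0, 1.0])  # chain known to start in state 1

# Gap in P(state 1 after k steps) between the two starting points.
gaps = []
for k in range(1, 31):
    Gamma_k = np.linalg.matrix_power(Gamma, k)
    gaps.append(abs((start_1 @ Gamma_k)[1] - (start_0 @ Gamma_k)[1]))

print(gaps[0], gaps[9], gaps[29])
```

For a two-state chain the gap after k steps equals the second eigenvalue raised to the power k, so knowing the last training state mostly affects the earliest test records.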

xjing76 commented 3 years ago

I switched to a stateless transition matrix for this model test. The chance of predicting the correct state is much higher, which is kind of expected as the series themselves are less complicated. I set the success metric for a good prediction to an error rate of less than 20%.
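A sketch of how such an error-rate check could look. The helper `state_error_rate`, the majority-vote rule across posterior predictive draws, and the synthetic inputs are all hypothetical and are not taken from the test in this repo.

```python
import numpy as np


def state_error_rate(ppc_states, true_states):
    """Fraction of time points where the majority-vote state is wrong.

    ppc_states: (n_draws, n_obs) array of sampled 0/1 states.
    true_states: (n_obs,) array of simulated 0/1 states.
    """
    predicted = (np.asarray(ppc_states).mean(axis=0) > 0.5).astype(int)
    return float(np.mean(predicted != np.asarray(true_states)))


rng = np.random.default_rng(1)

# Synthetic stand-ins: true states, and draws that disagree 10% of the time.
true_states = rng.integers(0, 2, size=200)
flip = rng.random((100, 200)) < 0.1
ppc_states = np.where(flip, 1 - true_states, true_states)

err = state_error_rate(ppc_states, true_states)

# The success criterion described above: fewer than 20% of states mispredicted.
assert err < 0.2
```

Majority vote is only one possible way to turn draws into a point prediction; comparing each draw to the truth and averaging would give a stricter number.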