Closed brandonwillard closed 4 years ago
All right, the basics appear to be in working order. Here are some of the stand-out high-level changes:
- `HMMStateSeq` no longer takes an `N` (i.e. sequence length) argument; instead, one must specify the `shape` keyword argument, like most other multidimensional `Distribution`s in PyMC3. In nearly all of our use-cases, `shape` will be equal to the old values we used for the `N` argument (except in a tuple).
- `HMMStateSeq` and the other functions that deal with transition matrices now expect arrays with shapes like `(N, M, M)`, where `M` is the number of distinct states, as usual, and `N` is either the sequence length (i.e. a transition matrix for every "time" point) or `N = 1`, indicating that there is only one transition matrix that is to be broadcast across all "time". In most cases, one simply needs to add an extra leading broadcast dimension to their existing transition matrices (e.g. `tt.shape_padleft(P_tt)`).

While running the current tests, I did notice that FFBS sampling was a little slow. I've been thinking about finally using the Cython code I wrote for this a while back (after making the adjustments introduced here), but I'll save that for the next PR, if necessary.
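
For anyone updating existing code, here is a rough NumPy-only sketch of the `(N, M, M)` transition-matrix convention described above. It is not the code in this PR: `pad_transition_matrix` and `simulate_states` are hypothetical helpers written for illustration, `P[np.newaxis, ...]` stands in for `tt.shape_padleft`, and the choice to draw the first state from `pi_0 @ Gammas[0]` is an assumption of the sketch.

```python
import numpy as np


def pad_transition_matrix(P):
    """Hypothetical helper: add a leading broadcast dimension, (M, M) -> (1, M, M)."""
    P = np.asarray(P, dtype=float)
    return P[np.newaxis, ...] if P.ndim == 2 else P


def simulate_states(Gammas, pi_0, N, rng):
    """Hypothetical helper: draw N states from (1, M, M) or (N, M, M) matrices."""
    Gammas = pad_transition_matrix(Gammas)
    M = Gammas.shape[-1]
    # A leading dimension of 1 is broadcast across all N "time" points.
    Gammas = np.broadcast_to(Gammas, (N, M, M))
    states = np.empty(N, dtype=int)
    # Assumption of this sketch: the first state is drawn from pi_0 @ Gammas[0].
    probs = np.asarray(pi_0, dtype=float) @ Gammas[0]
    for t in range(N):
        states[t] = rng.choice(M, p=probs)
        if t + 1 < N:
            # Row of the next time point's matrix, selected by the current state.
            probs = Gammas[t + 1][states[t]]
    return states


rng = np.random.default_rng(0)
P = np.array([[0.9, 0.1], [0.2, 0.8]])

# One transition matrix, broadcast across all time points.
P_padded = pad_transition_matrix(P)  # shape (1, 2, 2)
states_const = simulate_states(P_padded, [0.5, 0.5], 10, rng)

# A different transition matrix for every time point.
P_varying = np.stack([P, P[::-1]] * 5)  # shape (10, 2, 2)
states_varying = simulate_states(P_varying, [0.5, 0.5], 10, rng)
```

In both calls the sequence length is passed explicitly, which loosely mirrors how `shape` now carries the information that the old `N` argument did.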
Otherwise, I'm open to more scrutiny/testing, since there were a lot of important and sometimes confusing changes introduced here.
FYI: I'm writing tests to determine whether or not the log-likelihoods are being computed correctly.
This PR introduces updates that allow for time-varying transition probability matrices.