mattjj / pyhsmm

MIT License

HMM VLB #45

Closed nfoti closed 9 years ago

nfoti commented 9 years ago

It looks like the normalizer term of the variational lower bound for HMMs is computed with the line `normalizer = np.logaddexp.reduce(alphal[0] + betal[0])`, but this doesn't seem right to me. I think it should be something like `np.sum(np.logaddexp.reduce(alphal, axis=1))`, which computes the normalization of the forward messages at each time and then combines them. This corresponds to Eq. (3.74) in Ch. 3 of Beal's thesis. Maybe I'm missing some cleverness, though.

mattjj commented 9 years ago

I believe it's correct as it is: Eq. (3.60) in that thesis shows that Z is just the normalizer of the HMM MRF, which we can compute by combining the forward and backward messages at any single index (the code uses the first index). The \zeta's defined in Eq. (3.22) are an alternative way of writing the log normalizer in terms of filtering predictive probabilities (which turns out to be the most convenient way to compute the log normalizer for a Gaussian LDS, as far as I can tell!), but they aren't the local potential normalizers from the two-way messages.
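To illustrate, here's a minimal standalone sketch (not the pyhsmm implementation itself; the array names `alphal`, `betal`, `aBl` and the toy dimensions are just placeholders mirroring the discussion) showing that combining the log forward and log backward messages at any single index recovers the same log normalizer:

```python
# Sketch: with log forward messages alphal[t, k] = log p(x_{1:t}, z_t = k) and
# log backward messages betal[t, k] = log p(x_{t+1:T} | z_t = k), the quantity
# logsumexp(alphal[t] + betal[t]) equals log Z = log p(x_{1:T}) at every t,
# so evaluating it at the first index (as the code does) is enough.
import numpy as np
from scipy.special import logsumexp

T, K = 10, 3
rng = np.random.default_rng(0)

pi0 = rng.dirichlet(np.ones(K))            # initial state distribution
A = rng.dirichlet(np.ones(K), size=K)      # transition matrix, rows sum to 1
aBl = rng.normal(size=(T, K))              # log likelihood potentials log p(x_t | z_t = k)

# forward pass in log space
alphal = np.empty((T, K))
alphal[0] = np.log(pi0) + aBl[0]
for t in range(1, T):
    alphal[t] = logsumexp(alphal[t-1][:, None] + np.log(A), axis=0) + aBl[t]

# backward pass in log space (betal[T-1] = 0)
betal = np.zeros((T, K))
for t in range(T-2, -1, -1):
    betal[t] = logsumexp(np.log(A) + aBl[t+1] + betal[t+1], axis=1)

# the combined messages give the same log normalizer at every index
logZ_per_t = logsumexp(alphal + betal, axis=1)
assert np.allclose(logZ_per_t, logZ_per_t[0])
```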

Actually, something I realized (which isn't clear in Beal's thesis) is that this same thing happens for any conjugate mean-field update to an exponential family factor, and this update to the HMM factor is just an instance of that phenomenon: right after the factor gets updated, its VLB term is just its log normalizer evaluated at its new variational natural parameters. Here are some symbols for an analogous (but not fully general) setup:

[image: latex-image-1] https://cloud.githubusercontent.com/assets/1458824/7668778/ab7e1a3e-fc18-11e4-8ce8-2873eef24f24.png

(Apologies for any errors in that, which I typeset quickly, but I believe the general idea should be right!)
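Concretely, in the HMM case the identity can be sketched as follows (the notation below is only illustrative and isn't meant to match the image exactly). Writing the expected log potentials under the other variational factors as tilde'd potentials, the optimal label-sequence factor is exactly the variational HMM normalized by its own partition function, so the z-dependent VLB terms collapse to that log partition function:

```latex
% expected log potentials:
%   \log\tilde\pi_k   = \mathbb{E}_q[\log \pi_k]
%   \log\tilde A_{ij} = \mathbb{E}_q[\log A_{ij}]
%   \log\tilde L_{tk} = \mathbb{E}_q[\log p(x_t \mid \theta_k)]
\[
q^*(z_{1:T}) \;=\; \frac{1}{\tilde Z}\,
  \tilde\pi_{z_1}\,\tilde L_{1 z_1}
  \prod_{t=2}^{T} \tilde A_{z_{t-1} z_t}\,\tilde L_{t z_t},
\qquad
\tilde Z \;=\; \sum_{z_{1:T}} \tilde\pi_{z_1}\,\tilde L_{1 z_1}
  \prod_{t=2}^{T} \tilde A_{z_{t-1} z_t}\,\tilde L_{t z_t},
\]
\[
\mathbb{E}_{q}\!\left[\log p(z_{1:T}\mid \pi, A) + \log p(x_{1:T}\mid z_{1:T}, \theta)\right]
 - \mathbb{E}_{q}\!\left[\log q^*(z_{1:T})\right]
 \;=\; \mathbb{E}_{q}\!\left[\log \tilde p(z_{1:T})\right]
 - \mathbb{E}_{q}\!\left[\log \tilde p(z_{1:T}) - \log \tilde Z\right]
 \;=\; \log \tilde Z,
\]
% where \tilde p(z_{1:T}) is the unnormalized product of expected potentials.
% Right after the update, the factor's VLB term is exactly \log\tilde Z, which
% message passing yields as logsumexp(alphal[0] + betal[0]).
```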

nfoti commented 9 years ago

Hmm, I'm getting different results from the current code vs. the proposed computation in my version of the code (which is very close to yours). I'll try to get an example working with pyhsmm to try it out (it's very possible I have a bug in mine somewhere).

Thanks.


mattjj commented 9 years ago

I admit I haven't tested mine thoroughly. It would be nice to write a VLB test with a comparison against an exhaustive enumeration, like for the likelihood code.
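A rough sketch of what such a test could look like (standalone, not using pyhsmm's actual classes; all names and sizes here are illustrative): enumerate every state sequence on a tiny chain and check the summed potentials against the message-passing normalizer.

```python
# Sketch of a brute-force check: compute the HMM log normalizer by enumerating
# all K**T state sequences and compare it to the forward-message value.
import itertools
import numpy as np
from scipy.special import logsumexp

T, K = 6, 3                                    # small enough to enumerate K**T paths
rng = np.random.default_rng(1)
logpi = np.log(rng.dirichlet(np.ones(K)))      # log initial distribution
logA = np.log(rng.dirichlet(np.ones(K), K))    # log transition matrix
aBl = rng.normal(size=(T, K))                  # log likelihood potentials

# message-passing log normalizer via the forward recursion
alphal = np.empty((T, K))
alphal[0] = logpi + aBl[0]
for t in range(1, T):
    alphal[t] = logsumexp(alphal[t-1][:, None] + logA, axis=0) + aBl[t]
logZ_messages = logsumexp(alphal[-1])

# brute-force log normalizer by exhaustive enumeration
scores = []
for z in itertools.product(range(K), repeat=T):
    score = logpi[z[0]] + aBl[0, z[0]]
    for t in range(1, T):
        score += logA[z[t-1], z[t]] + aBl[t, z[t]]
    scores.append(score)
logZ_bruteforce = logsumexp(scores)

assert np.isclose(logZ_messages, logZ_bruteforce)
```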