mattjj / pyhsmm

MIT License

Discrete distribution for observables? #57

Closed: allista closed this issue 8 years ago

allista commented 8 years ago

I understand this question is based on my incompetence, but is it possible to use your HSMM models with fixed discrete distributions of observables and fixed semi-Markov chains, so that only the duration distributions' parameters and the transition matrix are inferred?

What I'm after is to build a model of a short pattern of hidden states with durations from a set of sequences of observables and then to search that pattern within a much longer sequence of observables.

mattjj commented 8 years ago

Sorry for the delay; I forgot to respond to this one when it popped up in my email.

Yes, it is possible to fix the observation distributions' parameters. You could write an observation distribution class whose update step (like resample or meanfieldupdate, depending on the algorithm you're using) is a no-op. You could also extend an existing observation distribution class or use the _FixedParamsMixin to do the same. Finally, you could extend a model class so that it only resamples (or otherwise updates) the transitions and durations. I'd probably handle it in the observation class.
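As a rough illustration of the no-op idea, here is a minimal standalone sketch in plain Python. It is not the pybasicbayes implementation: the class name FixedCategorical is hypothetical, and the log_likelihood and rvs method names and the argument lists of resample and meanfieldupdate are assumptions about the interface a model would call. A real observation class would work on NumPy arrays and likely needs more methods.

```python
import math
import random


class FixedCategorical:
    """Hypothetical categorical observation distribution with frozen weights."""

    def __init__(self, weights):
        # Normalize once; the weights are never changed afterwards.
        total = float(sum(weights))
        self.weights = [w / total for w in weights]

    def log_likelihood(self, x):
        # x is a single symbol index in this sketch (assumed interface).
        w = self.weights[x]
        return math.log(w) if w > 0 else float("-inf")

    def rvs(self, size=1, rng=random):
        # Draw `size` symbol indices according to the fixed weights.
        return rng.choices(range(len(self.weights)), weights=self.weights, k=size)

    def resample(self, data=None):
        # Deliberate no-op: ignore the assigned data, keep the parameters.
        return self

    def meanfieldupdate(self, data=None, weights=None):
        # Same idea for a mean-field algorithm: nothing is updated.
        return self


obs_distn = FixedCategorical([1, 3])
obs_distn.resample([0, 0, 0, 0])  # parameters are unchanged by this call
```

With every observation distribution frozen like this, a sampler that loops over all model components would still update the durations and transitions, which is the behavior asked about.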

There are a few discrete distributions in pybasicbayes, but if you want a distribution with finite support, you're probably interested in Categorical (or maybe Multinomial).

Searching for a pattern in a much longer sequence (like a regex) is interesting, and it's not something I've thought about. One word of warning is that HSMMs (as opposed to HMMs) tend to have inference that scales badly with sequence (or window) length: while HMM inference is linear in the sequence length, generally HSMM inference is quadratic. For some classes of duration distributions HSMM inference can be done in linear time; the most general class I know about is duration distributions with rational probability generating functions, as I wrote about in Ch. 4 of my PhD thesis. Finite-support duration distributions are a special case and might be what you're interested in.
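To make the scaling point concrete, here is a small sketch of a forward pass for an explicit-duration HSMM whose durations have finite support 1..D. It is a textbook-style recursion written for illustration, not pyhsmm's code; the function name and argument layout are hypothetical, it works in plain probabilities (so it is only suitable for short sequences), and it assumes the sequence starts and ends on segment boundaries.

```python
def hsmm_forward_likelihood(obs, init, trans, dur, emit):
    """Likelihood of `obs` under an explicit-duration HSMM (illustrative sketch).

    obs   : list of symbol indices
    init  : init[j] = P(first segment is in state j)
    trans : trans[i][j] = P(next state j | state i), self-transitions are skipped
    dur   : dur[j][d - 1] = P(duration = d | state j) for d = 1..D
    emit  : emit[j][symbol] = P(symbol | state j)
    """
    T, N, D = len(obs), len(init), len(dur[0])
    # alpha[t][j]: probability of obs[:t] with a segment of state j ending at t.
    alpha = [[0.0] * N for _ in range(T + 1)]
    for t in range(1, T + 1):
        for j in range(N):
            seg_lik = 1.0  # running product of emissions over the candidate segment
            total = 0.0
            # Only durations up to D are considered; this bound is what keeps
            # the cost linear in T for finite-support durations.
            for d in range(1, min(D, t) + 1):
                seg_lik *= emit[j][obs[t - d]]
                start = t - d
                if start == 0:
                    enter = init[j]
                else:
                    enter = sum(alpha[start][i] * trans[i][j]
                                for i in range(N) if i != j)
                total += enter * dur[j][d - 1] * seg_lik
            alpha[t][j] = total
    return sum(alpha[T])


likelihood = hsmm_forward_likelihood(
    obs=[0, 1, 0],
    init=[0.5, 0.5],
    trans=[[0.0, 1.0], [1.0, 0.0]],
    dur=[[0.5, 0.5], [0.5, 0.5]],
    emit=[[0.9, 0.1], [0.2, 0.8]],
)
```

In this naive form the loops do on the order of T * D * N^2 basic operations. With a fixed maximum duration D that is linear in the sequence length T, while letting D grow with T gives the quadratic behavior mentioned above.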

Let me know if you have any other questions (or if I missed the mark with this one). Good luck!