mattjj / pyhsmm


Doubt: Observational distribution #67

Closed: jmgo closed this issue 8 years ago

jmgo commented 8 years ago

Hi!

I want to use a distribution with an arbitrary shape for the observation distributions of an HMM (by "arbitrary shape" I mean a distribution that is not described by a particular function, but rather by an array of probabilities for all possible values of the feature).

To do that, can I use the Categorical distribution? Or is there some other way that is more appropriate?

Regards, jmgoxx

slinderman commented 8 years ago

If your observations are discrete random variables, like

$x_t \in \{1, 2, \ldots, V\}$

then yes, you could model them with a categorical distribution with state-dependent probabilities. For example,

$x_t \mid z_t = k \sim \mathrm{Cat}(\pi_k)$

(this is a slight abuse of notation since we are representing the observations as integers rather than one-hot vectors, but the idea is the same.)

This is the standard way to handle discrete observations with an HMM, and the Categorical distribution would serve as the observation distribution. Does that answer your question?
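
For concreteness, a rough sketch of that setup in pyhsmm could look like this (untested; the Categorical hyperparameters and the HMM constructor arguments here are assumptions based on the pyhsmm/pybasicbayes APIs, so check them against the code):

import numpy as np
import pyhsmm

V = 10  # number of discrete symbols the observations can take
N = 4   # number of hidden states

# one Categorical observation distribution per state, each with a
# symmetric Dirichlet prior over the V symbol probabilities
obs_distns = [pyhsmm.distributions.Categorical(K=V, alpha_0=1.0)
              for _ in range(N)]

model = pyhsmm.models.HMM(alpha=6., init_state_concentration=1.,
                          obs_distns=obs_distns)
model.add_data(np.random.randint(V, size=100))  # toy integer observations
model.resample_model()                          # one Gibbs sweep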

jmgo commented 8 years ago

Hi!

Thanks very much for the answer.

My idea is to use a classifier to calculate the state-dependent probabilities. Do you think creating a "classifier-based distribution" would work (see the example below)? What I mean is to create a distribution that, whenever it is called, uses the classifier instead of some parametric function to return the probability. (In this case I wouldn't run any optimization of the HMM; I would just set the probabilities of the HMM and use the Viterbi algorithm.)

Would something like this example work? (The example was not tested, so there may be some mistakes.)

import numpy as np

class ClassifierDist(object):
    def __init__(self, clf):
        self.clf = clf

    def log_likelihood(self, x):
        prob = self.clf.predict_proba(x)  # class probabilities from the classifier
        log_prob = np.log(prob)           # convert to log-likelihood
        return log_prob

Or would using the Categorical distribution be easier? Each category would be a possible combination of features (a lot of categories would be created); flattening the combinations into a single index could look like the sketch below.
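
A rough, untested illustration of that flattening with numpy (the feature cardinalities and values here are made up):

import numpy as np

sizes = (4, 3, 5)                             # cardinality of each discrete feature
feats = np.array([[2, 0, 4], [1, 2, 3]])      # two observations, three features each
codes = np.ravel_multi_index(feats.T, sizes)  # one flat category index per observation
# 4 * 3 * 5 = 60 categories in total, which grows quickly with more features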

Regards, jmgoxx

mattjj commented 8 years ago

Yes, that makes sense; in fact, I'd say it'd be a CRF! I don't think we've worked with CRFs in pyhsmm, but it's a good idea to explore. You should be able to reuse the message passing routines in the States classes, and the strategy you're outlining should accomplish that. A lot of other things should work, though you may want to implement a max_likelihood or resample method to update the parameters of those classifiers (if indeed you want to fit them jointly with the rest of the model instead of just leaving them fixed).
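
To sketch that last idea (hypothetical; the function name and signature below are illustrative, not pyhsmm's actual interface), a max_likelihood-style update could amount to refitting the classifier on the current state assignments:

import numpy as np

# hypothetical M-step-style update: refit the shared classifier on the
# current state assignments (signature illustrative, not pyhsmm's own)
def refit_classifier(clf, datas, stateseqs):
    X = np.vstack(datas)           # stack all observation sequences
    y = np.concatenate(stateseqs)  # per-timestep state labels
    clf.fit(X, y)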

Does that make sense? If it doesn't, and if you really want to pursue this using pyhsmm, I can say a bit more in detail.

jmgo commented 8 years ago

Thanks for the answer!

Meanwhile, I implemented something similar to the example and it worked. I'll leave the code here in case someone has the same issue (it uses a classifier from sklearn):

import numpy as np

class ClassifierDist(object):

    def __init__(self, clf, state):
        self.clf = clf      # fitted sklearn classifier exposing predict_proba
        self.state = state  # index of the HMM state this distribution scores

    def log_likelihood(self, x):
        # P(state | x) for each row of x, taken from the classifier
        prob = self.clf.predict_proba(x)[:, self.state]
        prob_log = np.log(prob)
        return prob_log
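
For anyone who wants to see it end to end, here is a rough decoding sketch around that class (untested; the toy classifier, pi0, and A below are placeholders for whatever parameters you set for your HMM):

import numpy as np
from sklearn.linear_model import LogisticRegression

def viterbi(log_pi0, log_A, log_liks):
    # log_liks: (T, N) per-timestep, per-state observation log-likelihoods
    T, N = log_liks.shape
    scores = np.zeros((T, N))
    back = np.zeros((T, N), dtype=int)
    scores[0] = log_pi0 + log_liks[0]
    for t in range(1, T):
        cand = scores[t - 1][:, None] + log_A  # cand[i, j]: come from i, go to j
        back[t] = np.argmax(cand, axis=0)
        scores[t] = cand[back[t], np.arange(N)] + log_liks[t]
    path = np.empty(T, dtype=int)
    path[-1] = np.argmax(scores[-1])
    for t in range(T - 2, -1, -1):
        path[t] = back[t + 1, path[t + 1]]
    return path

# toy setup: 3 states, 2-d features, uniform initial/transition probabilities
N_states = 3
rng = np.random.RandomState(0)
clf = LogisticRegression().fit(rng.randn(300, 2), rng.randint(N_states, size=300))
pi0 = np.full(N_states, 1.0 / N_states)
A = np.full((N_states, N_states), 1.0 / N_states)

X = rng.randn(50, 2)  # observation sequence to decode
dists = [ClassifierDist(clf, s) for s in range(N_states)]
log_liks = np.column_stack([d.log_likelihood(X) for d in dists])
stateseq = viterbi(np.log(pi0), np.log(A), log_liks)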

I think I will only need to optimize the parameters of the classifier first and then use those fixed parameters. If I need to optimize them jointly, I'll ask again.

Best Regards, jmgoxx

mattjj commented 8 years ago

Glad to hear it, and thanks for the code. I'll close the issue for now; feel free to reopen or start a new one if you want to discuss further!