Closed 0x10cxR1 closed 7 years ago
If I know the emission distribution is Gaussian, how will the model fitting work? Will it fit that specific distribution or not?
Hi @Zhijing1128, model fitting will stay in the selected class of emission distributions. That is, if your emission PDF is a Gaussian, for example, then the Baum-Welch procedure will adjust the parameters of the emission PDF (mean and standard deviation), but it won't change the kind of PDF.
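To make the "parameters change, family stays fixed" point concrete, here is a minimal sketch of what a re-estimation step for Gaussian emissions can look like. It is not hsmmlearn's actual code: the function name `gaussian_m_step` and the `posteriors` array (per-state posterior probabilities, as a forward-backward pass would produce) are assumptions made for illustration.

```python
import numpy as np


def gaussian_m_step(obs, posteriors):
    """Re-estimate per-state Gaussian parameters (illustrative helper only).

    obs:        shape (T,), the observations
    posteriors: shape (n_states, T); posteriors[j, t] is the posterior
                probability of being in state j at time t
    """
    obs = np.asarray(obs, dtype=float)
    weights = posteriors.sum(axis=1)  # total posterior weight per state
    # Posterior-weighted mean of the observations for each state.
    means = posteriors @ obs / weights
    # Posterior-weighted variance around each state's new mean.
    variances = (posteriors * (obs - means[:, None]) ** 2).sum(axis=1) / weights
    return means, np.sqrt(variances)


# Toy data: the first two points belong to state 0, the last two to state 1.
obs = np.array([0.0, 1.0, 10.0, 11.0])
posteriors = np.array([[1.0, 1.0, 0.0, 0.0],
                       [0.0, 0.0, 1.0, 1.0]])
means, scales = gaussian_m_step(obs, posteriors)
```

Only the means and standard deviations are updated here; the emission PDF remains Gaussian no matter what the data look like.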
Hi @jvkersch, what if there's no assumption about the emission distribution? How can I use the code in that case?
I know nothing about R. I'm confused about the Baum-Welch implementation. I saw you used the forward-backward function in R, and then do the M step using Python. All of this is in the fit function. Am I right? For the _fb_impl(), the variables updated here, for example n, l, ... none of them are global variables. What the _fb_impl() returns is just err. I don't understand this part. How do you use the variables updated in the fb function?
I would appreciate it if you could explain more about it.
@Zhijing1128 Right now, only Gaussian and multinomial distributions are supported. If your emission distribution is nonparametric, you can use the latter. E.g. here I'm starting from a totally non-informative HSMM, I generate 100 random observations (each one of 10 possible symbols), and then fit the model to them. There isn't much signal in this noise, but it should illustrate the API:
In [1]: from hsmmlearn.hsmm import MultinomialHSMM
...: import numpy as np
...:
In [3]: durations = np.full((3, 4), 0.25)
...: tmat = np.ones((3, 3), dtype=float)
...: tmat /= tmat.sum(axis=1, keepdims=True)
...: emission_probs = np.ones((3, 10), dtype=float)
...: emission_probs /= emission_probs.sum(axis=1, keepdims=True)
...: hsmm = MultinomialHSMM(emission_probs, durations, tmat)
...:
In [4]: obs = np.random.randint(0, 10, 100)
In [5]: hsmm.fit(obs)
Out[5]: (True, -665.92278076721868)
In [6]: hsmm.probabilities
Out[6]:
array([[ 7.10118622e-04, 8.39231099e-04, 5.16449907e-04,
9.94060826e-01, 5.16449907e-04, 6.45562384e-04,
1.29112477e-04, 1.03289981e-03, 6.45562384e-04,
9.03787337e-04],
[ 7.10118622e-04, 8.39231099e-04, 5.16449907e-04,
9.94060826e-01, 5.16449907e-04, 6.45562384e-04,
1.29112477e-04, 1.03289981e-03, 6.45562384e-04,
9.03787337e-04],
[ 7.10118622e-04, 8.39231099e-04, 5.16449907e-04,
9.94060826e-01, 5.16449907e-04, 6.45562384e-04,
1.29112477e-04, 1.03289981e-03, 6.45562384e-04,
9.03787337e-04]])
> I know nothing about R. I'm confused about the Baum-Welch implementation. I saw you used the forward-backward function in R, and then do the M step using Python.
Not quite; there is no R involved. The low-level code is in C++ and computes the forward and backward probabilities (via the _fb_impl function in Cython). The update step is done in Python, since it needs information about the particular form of the model. However, there should be no need to use this level of the API directly; normally everything should be done via the .fit() function on the HSMM.
> For the _fb_impl(), the variables updated here, for example n, l, ... none of them are global variables. What the _fb_impl() returns is just err. I don't understand this part. How do you use the variables updated in the fb function?
_fb_impl updates the array parameters that get fed in (d_para, p_para, etc.) in place.
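The same pattern can be shown in plain NumPy. This is only a sketch of the calling convention, not the real _fb_impl: the function `normalize_rows_in_place` and its arguments are hypothetical. The caller allocates the arrays, the routine writes into them, and the return value is just an error code.

```python
import numpy as np


def normalize_rows_in_place(probs, row_sums_out):
    """Hypothetical stand-in for a low-level routine (not hsmmlearn's _fb_impl).

    Writes results into the arrays passed in and returns only an error code.
    """
    if probs.shape[0] != row_sums_out.shape[0]:
        return 1  # nonzero signals an error; the arrays are left untouched
    row_sums_out[:] = probs.sum(axis=1)  # fill the caller's output buffer
    probs /= row_sums_out[:, None]       # in-place division: no new array is made
    return 0


probs = np.array([[1.0, 3.0], [2.0, 2.0]])
row_sums = np.empty(2)
err = normalize_rows_in_place(probs, row_sums)
# After the call, `probs` and `row_sums` hold the results even though
# the function returned only `err`.
```

No global variables are needed: NumPy arrays are passed by reference, so changes made inside the function are visible to the caller afterwards.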
@Zhijing1128 I'll close this as solved, but feel free to reopen or add other issues if anything else comes up.
Hi,
While reading your code, I wondered whether there is a requirement on the emission distribution in the model fitting. If I know the emission distribution is Gaussian, how will the model fitting work? Will it fit that specific distribution or not? It seems to me that there's no such requirement.
Just want to make sure.