Closed alexbw closed 10 years ago
Here are some preliminary thoughts.
Don't think of aDl as indexed by time; think of it as indexed by duration. The (i,j)th entry of aDl is log p( duration = i+1 | state = j). So those -infs aren't only relevant to the start of the state sequence.
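To make that indexing concrete, here is a minimal sketch of a duration log-pmf table laid out this way. It is not pyhsmm's code: the helper shifted_negbin_logpmf and the (r, p) values are illustrative assumptions, standing in for any duration distribution whose support starts at some r > 1.

```python
import numpy as np
from math import lgamma, log

def shifted_negbin_logpmf(d, r, p):
    """log P(duration = d) for r + NegBin(r, p); support is d >= r.

    Hypothetical helper for illustration, not a pyhsmm function.
    """
    k = d - r  # steps beyond the minimum duration r
    if k < 0:
        return -np.inf  # durations shorter than r have zero probability
    # NegBin(k; r, p) = C(k + r - 1, k) * (1 - p)**r * p**k
    return (lgamma(k + r) - lgamma(k + 1) - lgamma(r)
            + r * log(1 - p) + k * log(p))

T = 12                          # longest duration tabulated
params = [(3, 0.5), (8, 0.5)]   # assumed (r, p) for two states

# Row i holds duration i + 1; column j holds state j.
aDl = np.array([[shifted_negbin_logpmf(i + 1, r, p) for (r, p) in params]
                for i in range(T)])

# Count the zero-probability durations in each state's column.
print(np.isneginf(aDl).sum(axis=0))  # prints [2 7]
```

The -inf entries sit at the top of each column because short durations are impossible for that state wherever in the sequence the segment starts, not because of anything special about the first frames.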
I think you're on the right track: it's an instability in the possiblechangepoints code when confronted with -inf probabilities in the durations. Maybe that code can be made stable. I'll try loading that state and take a look.
Those -infs happen because you're using the IntNegBinVariant duration distribution. That duration distribution starts at r, meaning it has -inf log pmf values until r. I think that's bad and we shouldn't use it, and getting rid of the -infs would probably fix this issue (but cause others?). Options:
Here's what I see with the error:
/home/alexbw/Code/pyhsmm/internals/states.pyc in messages_backwards(self)
1209 np.logaddexp.reduce(betal[tblock:tblock+truncblock]
1210 + self.block_cumulative_likelihoods(tblock,tblock+truncblock,possible_durations)
-> 1211 + aDl[possible_durations-1] - normalizer,axis=0,out=betastarl[tblock])
1212 # TODO TODO put censoring here, must implement likelihood_block
1213 np.logaddexp.reduce(betastarl[tblock] + Al, axis=1, out=betal[tblock-1])
FloatingPointError: invalid value encountered in subtract
In [2]: debug
> /home/alexbw/Code/pyhsmm/internals/states.py(1211)messages_backwards()
1210 + self.block_cumulative_likelihoods(tblock,tblock+truncblock,possible_durations)
-> 1211 + aDl[possible_durations-1] - normalizer,axis=0,out=betastarl[tblock])
1212 # TODO TODO put censoring here, must implement likelihood_block
ipdb> print normalizer
[-inf -inf -inf -inf -inf -inf -inf -inf -inf -inf -inf -inf -inf -inf -inf
-inf -inf -inf -inf -inf -inf -inf -inf -inf -inf -inf -inf -inf -inf -inf
-inf -inf -inf -inf -inf -inf -inf -inf -inf -inf]
It's because the only possible duration is 1 (possible_durations is array([1])) and all the durations put zero probability on that.
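The failure can be reproduced in isolation with plain NumPy. This is a sketch under assumptions: the array shapes, the placeholder log-probabilities, and the way normalizer is computed here are illustrative rather than copied from pyhsmm.

```python
import numpy as np

# Assumed layout: rows are durations 1..10, columns are 40 states.
aDl = np.full((10, 40), -np.inf)
aDl[7:] = np.log(1.0 / 3)            # placeholder mass on durations 8, 9, 10 only

possible_durations = np.array([1])   # the only duration allowed here

potentials = aDl[possible_durations - 1]               # shape (1, 40), all -inf
normalizer = np.logaddexp.reduce(potentials, axis=0)   # all -inf as well

message = None
with np.errstate(invalid='raise'):
    try:
        potentials - normalizer      # -inf - (-inf) is undefined
    except FloatingPointError as e:
        message = str(e)

print(message)
```

With the error state set to ignore or warn instead of raise, the same subtraction silently yields NaN, which then propagates through the backward messages.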
The fix is to add in right_censoring, which I've never added to the PossibleChangepoints models. I'll figure out how to do that now.
Using the non-variant IntNegBin duration, pyhsmm.distributions.NegativeBinomialIntegerRDuration, has solved the problem. What do you think is a good way to communicate this type of run-time incompatibility between model and duration distribution to the user?
Also, looking at the code for the variant and non-variant negative binomial integer durations, it seems like in a possiblechangepoint model, they shouldn't behave radically differently, so it hopefully will be an easy drop-in replacement. Anything else I should know about there, in terms of expected behavior differences in HSMMPossibleChangepoint models?
> What do you think is a good way to communicate this type of run-time incompatibility between model and duration distribution to the user?
I think those assertions did their job in this case.
The issue with the variant durations is that they put -infs in their duration log pmfs. That interacted badly with the fact that I never implemented right-censoring for possiblechangepoints models. When right-censoring is implemented this particular blowup won't happen anymore.
As for other potential weirdness, the best thing to do is to read through the code and look out for TODO lines or TODO TODO lines (which I use to mean I really should do the thing in the comment).
Sorry, I take that back: this error should have resulted in a "zero probability under the model in this iteration" warning and there should have been some hack to forge ahead with the sampling process anyway (probably replacing the duration potentials with uniform potentials). Maybe some more info about what caused the zero probability issue (duration support intersected with possible changepoints was empty) would be better for the user, too.
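A rough sketch of what such a fallback could look like follows. The function safe_duration_potentials is hypothetical and is not part of pyhsmm; it only illustrates the idea of warning with the cause and substituting uniform potentials for the affected states.

```python
import warnings
import numpy as np

def safe_duration_potentials(aDl, possible_durations):
    """Hypothetical helper: normalized duration potentials with a uniform fallback."""
    potentials = aDl[possible_durations - 1].copy()
    # A state is "dead" when every allowed duration has zero probability.
    dead = np.isneginf(potentials).all(axis=0)
    if dead.any():
        warnings.warn(
            "zero probability under the model in this iteration: duration "
            "support does not intersect the possible changepoints for "
            "states %s; using uniform duration potentials for them"
            % np.flatnonzero(dead).tolist())
        potentials[:, dead] = -np.log(len(possible_durations))
    normalizer = np.logaddexp.reduce(potentials, axis=0)
    return potentials - normalizer   # finite normalizer, so no NaNs

# Usage: every state forbids duration 1, yet the result stays finite.
aDl = np.full((10, 4), -np.inf)
aDl[7:] = np.log(1.0 / 3)
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    out = safe_duration_potentials(aDl, np.array([1]))
```

The uniform replacement changes the distribution being sampled for that iteration, so this is a way to keep the sampler running, not a correct treatment of the model.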
That fix should take care of it, but I didn't test it with an intnegbinvariant duration to make sure the blowup goes away. If it's feasible to test this change on your end, can you reopen the issue if there's still an issue with the intnegbinvariants and possiblechangepoint models?
Sorry, not fixed yet.
Fixed and verified in 14bbd3c
I've saved a HSMMStatesPossibleChangepoints state s which, when calling s.messages_backwards(), first produces a warning and then a traceback.
The state is saved and available on Jefferson.
Observations
betal and betastarl are almost entirely NaN; this is obviously the symptom we're working backwards from. Starting with the "invalid value encountered in subtract" warning, the arrays being subtracted are aDl[possible_durations-1] and normalizer. The values in normalizer are all very small. The values in aDl are fine, except for the first seven entries, which are negative infinity.
I suppose this means that having a changepoint at the very beginning of the data sequence is impossible under the model, which would make sense for the first frame, but it is puzzling that it continues into further frames.
Interestingly, spacing changepoints every 9, 10 or 100 frames works. However, spacing by 8 or fewer frames does not work, and results in the same first 7 frames containing -inf. Offsetting the first changepoint by providing changepoints as zip(range(10,len(data),1), range(11,len(data)-1,1)) results in the same error.
@mattjj, if I want to compare between possiblechangepoint and non-possiblechangepoint models, I need to be able to provide all frames as changepoints to the model. Any ideas as to why this might be happening?
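For reference, expanding the offset changepoints expression on a small stand-in sequence shows that every segment it produces is still one frame long, so the offset changes where the segments begin but not their lengths. The data array here is a placeholder, not the real observations.

```python
data = list(range(20))   # placeholder for the real observation sequence

# Same expression as above; list() is needed on Python 3, where zip is lazy.
changepoints = list(zip(range(10, len(data), 1), range(11, len(data) - 1, 1)))

# Length of each (start, stop) segment in frames.
lengths = [stop - start for start, stop in changepoints]

print(changepoints[:3])  # prints [(10, 11), (11, 12), (12, 13)]
print(set(lengths))      # prints {1}
```

Note that zip stops at the shorter of the two ranges, so with this expression the last segment ends before the end of the data.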