bhmm / bhmm

Bayesian hidden Markov models toolkit
GNU Lesser General Public License v3.0
46 stars 14 forks source link

error in chi_square call #20

Open marscher opened 8 years ago

marscher commented 8 years ago
File "C:\projects\bhmm\bhmm\estimators\bayesian_sampling.py", line 161, in bhmm.estimators.bayesian_sampling.BayesianHMMSampler.sample
Failed example:
    samples = sampled_model.sample(nsamples, nburn=nburn, nthin=nthin)
Exception raised:
    Traceback (most recent call last):
      File "C:\Python27_32\lib\doctest.py", line 1315, in __run
        compileflags, 1) in test.globs
      File "<doctest bhmm.estimators.bayesian_sampling.BayesianHMMSampler.sample[5]>", line 1, in <module>
        samples = sampled_model.sample(nsamples, nburn=nburn, nthin=nthin)
      File "C:\projects\bhmm\bhmm\estimators\bayesian_sampling.py", line 168, in sample
        self._update()
      File "C:\projects\bhmm\bhmm\estimators\bayesian_sampling.py", line 196, in _update
        self._updateEmissionProbabilities()
      File "C:\projects\bhmm\bhmm\estimators\bayesian_sampling.py", line 258, in _updateEmissionProbabilities
        self.model.output_model._sample_output_model(observations_by_state)
      File "C:\projects\bhmm\bhmm\output_models\gaussian.py", line 417, in _sample_output_model
        chisquared = np.random.chisquare(nsamples_in_state-1)
      File "mtrand.pyx", line 2167, in mtrand.RandomState.chisquare (numpy\random\mtrand\mtrand.c:16888)
    ValueError: df <= 0
marscher commented 8 years ago

this happened on win32/py2 only.

jchodera commented 8 years ago

This is presumably happening because nsamples_in_state is zero. We should just raise an informative exception in this case.

jchodera commented 8 years ago

Actually, we should somehow prevent any state from becoming empty in the first place. Gibbs sampling updates should presumably be rejected if any state ends up empty.

jchodera commented 8 years ago

(This problem would not be unique to windows.)

marscher commented 8 years ago

Am 10.08.2015 um 19:51 schrieb John Chodera:

This is presumably happening because |nsamples_in_state| is zero. We should just raise an informative exception in this case.

— Reply to this email directly or view it on GitHub https://github.com/bhmm/bhmm/issues/20#issuecomment-129547923.

The question is, why it only occurs on 32bit systems.

jchodera commented 8 years ago

Is this in a nosetest, or is this in actual usage?

marscher commented 8 years ago

It is a doctest executed by nosetest.

franknoe commented 8 years ago

I can look into this when I'm back in the US (now in Singapur). Indeed states must be prevented from becoming empty, but the requirement is even a bit stronger: we need to have a trajectory that connects all states in the Gibbs sampling update. Once we loose connectivity, we cannot recover from it. I have solved this problem by using priors that ensure connectivity, but it may be that in this test the corresponding prior is not used.

It would be weird that it only occurs under Win32. Perhaps this was just by chance since it would depend on the random trajectory generated. I'll take care of it.

Am 10/08/15 um 19:52 schrieb John Chodera:

Actually, we should somehow prevent any state from becoming empty in the first place. Gibbs sampling updates should presumably be rejected if any state ends up empty.

— Reply to this email directly or view it on GitHub https://github.com/bhmm/bhmm/issues/20#issuecomment-129548199.


Prof. Dr. Frank Noe Head of Computational Molecular Biology group Freie Universitaet Berlin

Phone: (+49) (0)30 838 75354 Web: research.franknoe.de

Mail: Arnimallee 6, 14195 Berlin, Germany

jchodera commented 8 years ago

We can simply reject proposed trajectory sets that do not sample all states. As long as we don't start with an initial trajectory set that doesn't sample all states, this should be sufficient.

franknoe commented 8 years ago

Sampling all states is insufficient, we need connectivity if we want to use the reversible option.

I would also say that reject proposed trajectory sets that don't have the required structural properties is the most attractive option, which is why I have tried this first (ensuring connectivity rather than just visiting every state). Unfortunately that didn't work at all. I think the problem is that you can run into situations where the current estimate for the parameters becomes more and more unlikely to generate a connected trajectory, resulting in the Gibbs sampler getting stuck. I can have a look at this again when I have some time and try to understand it in detail, but for now using a prior that ensures connectivity at least works.

Am 22/08/15 um 16:36 schrieb John Chodera:

We can simply reject proposed trajectory sets that do not sample all states. As long as we don't start with an initial trajectory set that doesn't sample all states, this should be sufficient.

— Reply to this email directly or view it on GitHub https://github.com/bhmm/bhmm/issues/20#issuecomment-133709650.


Prof. Dr. Frank Noe Head of Computational Molecular Biology group Freie Universitaet Berlin

Phone: (+49) (0)30 838 75354 Web: research.franknoe.de

Mail: Arnimallee 6, 14195 Berlin, Germany