AmpersandTV / pymc3-hmm

Hidden Markov models in PyMC3
Other
94 stars 13 forks source link

Incorrect conditioning in `SwitchingProcess.random` #91

Closed jeffreyenos closed 2 years ago

jeffreyenos commented 2 years ago

In SwitchingProcess.random, there are two places where point is used to condition variables: distribution_subset_args and the component distribution's random method. distribution_subset_args substitutes values by matching the component distribution's formal parameter names to the keys of point, while the component distribution's random method tries to match on user-defined distribution and variable names. This can lead to confusion and / or incorrect conditioning when the formal parameter names and user-defined names are the same.

There is already a test where this issue can be demonstrated:

https://github.com/AmpersandTV/pymc3-hmm/blob/26f7009ab3e87570f0a56948a23b1f8d00bb22ed/tests/test_distributions.py#L428-L439

In the above, the Constant distribution used for the Poisson mean is given the user-defined name "c". However, the Constant distribution's formal parameter name is also c:

https://github.com/AmpersandTV/pymc3-hmm/blob/26f7009ab3e87570f0a56948a23b1f8d00bb22ed/pymc3_hmm/distributions.py#L441-L450

So in the above test, conditioning on the point {"c": 100.0} causes the Poisson mean and the value for the Constant component distribution (i.e. the zero state of the mixture) to be set to 100. That is, the below assertion:

assert np.all(test_sample[..., test_states == 0] == 0)

will fail.