AlexanderFabisch / gmr

Gaussian Mixture Regression
https://alexanderfabisch.github.io/gmr/
BSD 3-Clause "New" or "Revised" License

`is_in_confidence_region` always False for single-feature GMM #39

Closed: nardi closed this issue 2 years ago

nardi commented 2 years ago

I seem to have run into a bug: for a GMM with a single feature, `is_in_confidence_region` always returns False, so `sample_confidence_region` never terminates. I first noticed this with a conditioned GMM, but one constructed by hand shows the same behavior. Example:

import gmr
gmm = gmr.GMM(n_components=1, priors=[1], means=[[0]], covariances=[[[1]]])

# Works fine: gmm.sample(1)
# Never returns: gmm.sample_confidence_region(1, 1.0)

gmm.is_in_confidence_region([0], 1.0)  # False

AlexanderFabisch commented 2 years ago

Thanks for reporting. It's certainly a bug either in the code or in the documentation. I'll have a look.

But may I ask in which case you might sample with alpha=1? In this case you can just use the normal sampling function.

AlexanderFabisch commented 2 years ago

The function `is_in_confidence_region` relies on a chi-square distribution with (number of features) - 1 degrees of freedom. With one feature, the degrees of freedom are 0, which results in an invalid chi-square distribution. I have to find an alternative way to compute whether a sample is in the alpha-confidence region in this case. It shouldn't be too difficult for a 1D normal distribution, but I don't know how this works at the moment.
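
For the 1D case, one possible approach is to test whether the value lies inside the central alpha-interval of the normal distribution. The following is only a minimal sketch under that assumption, not necessarily what gmr does; the helper name `in_confidence_interval_1d` is hypothetical, and it uses only the Python standard library:

```python
from math import sqrt
from statistics import NormalDist


def in_confidence_interval_1d(x, mean, variance, alpha):
    """Check whether x lies in the central alpha-interval of N(mean, variance).

    Hypothetical helper for illustration; it is not part of gmr.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    if alpha == 1.0:
        # The interval covers the whole real line.
        return True
    dist = NormalDist(mu=mean, sigma=sqrt(variance))
    # Split the excluded probability mass equally between both tails.
    tail = (1.0 - alpha) / 2.0
    lower = dist.inv_cdf(tail)
    upper = dist.inv_cdf(1.0 - tail)
    return lower <= x <= upper
```

With a standard normal and alpha=0.95, this accepts values within roughly 1.96 standard deviations of the mean. With alpha=1 it accepts every value, which matches plain sampling.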

AlexanderFabisch commented 2 years ago

Should be fixed with 1.6.2, which I just released on PyPI: https://pypi.org/project/gmr/1.6.2/