davidavdav / GaussianMixtures.jl

Large scale Gaussian Mixture Models

Confusion about the log likelihood #89

Closed · yuehhua closed 3 years ago

yuehhua commented 3 years ago

Thank you for making this great package. I am using it in my project and want to calculate likelihoods for my own purposes.

I followed your documentation and calculated the likelihood with llpg(gmm::GMM, x::Matrix), but I got all positive values, which confuses me. As the README says, it returns ll_ij = log p(x_i | gauss_j), the log likelihood per Gaussian j given data point x_i. In theory, log likelihoods should all be negative, not positive, while negative log likelihoods are all positive. Could you explain what is going on?
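For reference, the call I am making looks roughly like this (a minimal sketch with toy data rather than my real features; the component count is arbitrary):

```julia
using GaussianMixtures

x = randn(1000, 2)   # toy data: 1000 two-dimensional observations, one per row
gmm = GMM(8, x)      # fit an 8-component GMM (diagonal covariances by default)
ll = llpg(gmm, x)    # 1000×8 matrix with ll[i, j] = log p(x_i | gauss_j)
```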

davidavdav commented 3 years ago

Hello,

Likelihoods are, in a way, unscaled probabilities, and here we have continuous densities over the features. So p(x_i | gauss_j) is really a probability density (because x_i is continuous), and a density is not bounded by 1. If a component has a small variance, this will happen. With many Gaussians you tend to get small variances (if the feature space is somewhat normalized), because some Gaussians end up covering just a few data points that lie close together.

Hope this explains things a little.
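As a concrete illustration (a quick sketch in plain Julia that just evaluates the standard Gaussian density formula, nothing specific to this package): a narrow univariate Gaussian already has a density above 1 at its mean, so its log density is positive there.

```julia
# Log density of a univariate Gaussian N(μ, σ²) evaluated at x
logpdf_normal(x, μ, σ) = -0.5 * log(2π * σ^2) - (x - μ)^2 / (2σ^2)

# Narrow Gaussian (σ = 0.01) at its own mean: the density is
# 1 / (σ * sqrt(2π)) ≈ 39.9, so the log density is about +3.69.
logpdf_normal(0.0, 0.0, 0.01)

# Wide Gaussian (σ = 10) at its mean: the density is ≈ 0.04,
# so the log density is about -3.22, i.e. negative as one might expect.
logpdf_normal(0.0, 0.0, 10.0)
```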

yuehhua commented 3 years ago

Thank you for your explanation. I will check this on my data.