pmelchior / pygmmis

Gaussian mixture model for incomplete (missing or truncated) and noisy data
MIT License

Empty "U" / responsibilities from "fit" #16

jpfeuffer closed this issue 4 years ago

jpfeuffer commented 4 years ago

Hi!

When I run pygmmis.fit on my data, it fits the data nicely, but I do not get any responsibilities for the components. The call

logL, U = pygmmis.fit(gmm, coords)

results in U = [None, None].

Do you have any idea what could cause this, how to debug it, or how to work around it?

macOS 10.15, Python 3.7, pygmmis 1.1.4

pmelchior commented 4 years ago

U is only set if cutoff is not None. This cutoff is a distance in standard deviations from the center of any component, so if it isn't set to something like cutoff=5, the neighborhood cannot be defined.

Just to clarify, the purpose of this feature is not to find the cluster assignment, rather to deal with outliers and to make the fitting with lots of components more efficient.

I hope this helps.
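To illustrate what a cutoff "in standard deviations from the center" means, here is a minimal sketch of a neighborhood test. It is not the pygmmis implementation: the function name `neighborhood`, the diagonal covariance, and the return value (a list of sample indices) are assumptions made for illustration only.

```python
def neighborhood(points, mean, var, cutoff):
    """Indices of points within `cutoff` standard deviations of `mean`.

    Hypothetical helper, not part of pygmmis. Assumes a diagonal covariance
    with per-axis variances `var`, so the squared Mahalanobis distance of a
    point x is sum((x_i - mean_i)**2 / var_i).
    """
    if cutoff is None:
        # Without a cutoff there is no distance threshold to compare against.
        return None
    selected = []
    for i, x in enumerate(points):
        chi2 = sum((xi - mi) ** 2 / vi for xi, mi, vi in zip(x, mean, var))
        if chi2 < cutoff ** 2:
            selected.append(i)
    return selected


points = [(0.0, 0.0), (1.0, 1.0), (10.0, 0.0)]
inside = neighborhood(points, mean=(0.0, 0.0), var=(1.0, 1.0), cutoff=5)
no_cutoff = neighborhood(points, mean=(0.0, 0.0), var=(1.0, 1.0), cutoff=None)
```

With cutoff=5 the third point, ten standard deviations out, is excluded. With cutoff=None the helper returns None, which mirrors the U = [None, None] reported above for a two-component model.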

jpfeuffer commented 4 years ago

Ah, thanks. Sure, that explains it. Stupid question then: how can I get the posterior probabilities of the cluster assignments for the converged set of parameters without a cutoff? Is U the result that I want? Should I maybe set an arbitrarily high cutoff?

pmelchior commented 4 years ago

The GMM class has the methods logL for the summed posterior log probability log p(x) and logL_k for the individual component posterior log p(x|k).
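For readers who want the posterior p(k|x) itself, the per-component log densities can be normalized with a log-sum-exp. The following is a minimal sketch in plain Python, not pygmmis code: the helper name `responsibilities` and its `log_amp` argument are hypothetical. Whether logL_k already includes the log mixing weight of component k is an assumption to check against the pygmmis source; if it does not, pass the log weights as `log_amp`.

```python
import math


def responsibilities(log_p_k, log_amp=None):
    """Return p(k | x_n) as a K x N nested list.

    Hypothetical helper, not part of pygmmis.
    log_p_k : K x N nested sequence; row k holds the log density of every
              sample under component k.
    log_amp : optional length-K sequence of log mixing weights, added to each
              row when the rows do not already include them.
    """
    K = len(log_p_k)
    N = len(log_p_k[0])
    if log_amp is None:
        log_amp = [0.0] * K
    out = [[0.0] * N for _ in range(K)]
    for n in range(N):
        col = [log_p_k[k][n] + log_amp[k] for k in range(K)]
        m = max(col)  # subtract the maximum for a numerically stable log-sum-exp
        log_norm = m + math.log(sum(math.exp(c - m) for c in col))
        for k in range(K):
            out[k][n] = math.exp(col[k] - log_norm)
    return out


# Two components, two samples, weights already folded into the rows.
rows = [
    [math.log(0.2), math.log(0.5)],
    [math.log(0.6), math.log(0.5)],
]
resp = responsibilities(rows)
```

In this toy input the first sample is assigned to component 0 with probability 0.2 / (0.2 + 0.6) and the second is split evenly. With a fitted model the rows would presumably come from something like gmm.logL_k(k, coords) for each component k (call shape assumed, not verified here), and the normalizer computed per sample corresponds to what logL returns.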