pmelchior / pygmmis

Gaussian mixture model for incomplete (missing or truncated) and noisy data
MIT License

fix component #1

Closed jhmarcus closed 7 years ago

jhmarcus commented 7 years ago

Hello,

Great software and paper! I was wondering if you have implemented the ability to fix some of the parameters of one of the components (mean and covariance) while estimating the parameters of the others via EM. This would be very useful for one of my research projects.

see https://github.com/jobovy/extreme-deconvolution/issues/9

Thank you,

Joe

pmelchior commented 7 years ago

Yeah, it's doable and implemented, but I haven't exposed the functionality to the fit method yet. I'll have a go at it.

jhmarcus commented 7 years ago

Thank you!

pmelchior commented 7 years ago

You can now set an iterable of component indices with the frozen keyword. Make sure you initialize them properly before ...

jhmarcus commented 7 years ago

That's great! Just to be clear, I was hoping to fix the mean and covariance matrix of a single component (not the mixture proportion). Is that doable with the iterable you set up?

pmelchior commented 7 years ago

That's exactly what will happen. The mixture amplitudes will be adjusted but not the means or covariances of the frozen components
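
To illustrate the behaviour described above, here is a minimal NumPy sketch of one M-step for a plain Gaussian mixture (no missing data, no noise) in which selected components keep their means and covariances while all amplitudes are re-estimated. It is not pygmmis's actual code: the function `m_step` and its `frozen_mean` / `frozen_covar` arguments are hypothetical names chosen for this example, and the responsibilities are random stand-ins rather than the result of a real E-step.

```python
import numpy as np

def m_step(data, resp, mean, covar, frozen_mean=(), frozen_covar=()):
    """One M-step of a plain GMM (hypothetical helper, not pygmmis code).

    data: (N, D) samples; resp: (N, K) responsibilities.
    Components listed in frozen_mean / frozen_covar keep those parameters.
    """
    N, K = resp.shape
    Nk = resp.sum(axis=0)        # effective number of samples per component
    new_amp = Nk / N             # amplitudes are always re-estimated here
    new_mean = mean.copy()
    new_covar = covar.copy()
    for k in range(K):
        if k not in frozen_mean:
            new_mean[k] = resp[:, k] @ data / Nk[k]
        if k not in frozen_covar:
            d = data - new_mean[k]
            new_covar[k] = (resp[:, k, None] * d).T @ d / Nk[k]
    return new_amp, new_mean, new_covar

rng = np.random.default_rng(0)
K, D = 3, 2
data = rng.normal(size=(200, D))
mean = np.zeros((K, D))
covar = np.stack([s * np.eye(D) for s in (0.5, 1.0, 2.0)])
resp = rng.dirichlet(np.ones(K), size=200)   # stand-in responsibilities

# Freeze mean and covariance of component 0 only.
new_amp, new_mean, new_covar = m_step(
    data, resp, mean, covar, frozen_mean=[0], frozen_covar=[0]
)
```

In this sketch component 0 keeps its mean and covariance exactly, while its amplitude and all parameters of components 1 and 2 are updated from the responsibilities.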

jhmarcus commented 7 years ago

great. Thanks @pmelchior.

jhmarcus commented 7 years ago

@pmelchior last question. For some components I also need to fix just the means and estimate the covariance matrices and mixture proportions. Would that be possible with the frozen argument? I can also dig into the code if this is too specific an application. I appreciate the help.

pmelchior commented 7 years ago

It's not possible yet, but it at least doesn't break the correctness of the EM. It shouldn't be hard. Hang on.

pmelchior commented 7 years ago

frozen can now be a dictionary: frozen = {'mean': [1,2,3], 'covar': []} should do what you want.

jhmarcus commented 7 years ago

perfect thank you so much!

jhmarcus commented 7 years ago

hmm I don't think the mixture proportions are being updated ... see the run em section here:

https://github.com/jhmarcus/pashpy/blob/master/notebooks/pyggmis_example.ipynb

pmelchior commented 7 years ago

That's because of your setup. initFromDataAtRandomFixed yields three identical components, with one exception: the covariance of component 0. This renders components 1 and 2 indistinguishable. You shouldn't do that because they will receive identical updates.

Because you freeze the means of all, and the covariance of 0, the amplitude of component 0 won't be changed under the current scheme (I'm about to change that). Because components 1 and 2 are identical, they can only move in unison, but their sum is fixed. Hence nothing changes.
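
The argument above can be checked numerically. The following is a small NumPy sketch under stated assumptions, not pygmmis code: `gauss_pdf` and `e_step` are hypothetical helpers for a plain mixture, the covariance scale 0.1 for component 0 is an illustrative value, and holding the amplitude of component 0 fixed is modelled by renormalizing the other two amplitudes so that they share the remaining weight.

```python
import numpy as np

def gauss_pdf(x, mean, covar):
    """Multivariate normal density evaluated at each row of x."""
    D = x.shape[1]
    d = x - mean
    sol = np.linalg.solve(covar, d.T).T
    expo = -0.5 * np.sum(d * sol, axis=1)
    norm = np.sqrt((2 * np.pi) ** D * np.linalg.det(covar))
    return np.exp(expo) / norm

def e_step(x, amp, mean, covar):
    """Responsibilities of a plain GMM, shape (N, K)."""
    p = np.stack(
        [a * gauss_pdf(x, m, c) for a, m, c in zip(amp, mean, covar)], axis=1
    )
    return p / p.sum(axis=1, keepdims=True)

rng = np.random.default_rng(1)
K, D = 3, 2
x = rng.normal(size=(500, D))
amp = np.full(K, 1.0 / K)
mean = np.zeros((K, D))

# Components 1 and 2 identical: only component 0 has a different covariance.
covar_same = np.stack([0.1 * np.eye(D), np.eye(D), np.eye(D)])
r_same = e_step(x, amp, mean, covar_same)

# Assumption: amplitude 0 is held fixed, the rest share the remaining weight.
rest = r_same[:, 1:].sum(axis=0)
amp_fixed0 = amp.copy()
amp_fixed0[1:] = (1 - amp[0]) * rest / rest.sum()

# Distinct covariances for components 1 and 2 break the symmetry.
covar_diff = np.stack([0.1 * np.eye(D), np.eye(D), 2 * np.eye(D)])
r_diff = e_step(x, amp, mean, covar_diff)
amp_diff = r_diff.mean(axis=0)
```

With identical components 1 and 2, their responsibilities coincide for every sample, so `amp_fixed0` equals the initial amplitudes. With distinct covariances, the two amplitudes in `amp_diff` separate.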

I'll have a new version where you can freeze the amplitudes as well, as desired. That takes away my assumption about which ones are to be updated in special cases like yours. In the meantime, modify your init function to do something like:

import numpy as np

def initFromDataAtRandomFixed(gmm, data, covar=None, rng=np.random):
    gmm.amp[:] = 1./gmm.K
    gmm.mean[:,:] = 0
    # give each component a different covariance so they are distinguishable
    gmm.covar[0,:,:] = 1e-6 * np.eye(gmm.D)
    gmm.covar[1,:,:] = 1 * np.eye(gmm.D)
    gmm.covar[2,:,:] = 2 * np.eye(gmm.D)

Also, in your case, you seem to do only one run, so you can avoid all the lists of gmms and likelihoods and simply do

frozen={'mean':[0,1,2], 'covar':[0]}
fitted = pygmmis.GMM(K=K, D=D)
cutoff = None
l, u = pygmmis.fit(fitted, data, init_callback=initFromDataAtRandomFixed, w=w, cutoff=cutoff, rng=rng, frozen=frozen)

pmelchior commented 7 years ago

Try frozen={'mean':[0,1,2], 'covar':[0], 'amp': []} now.