I have found that, for some truncated models, the log-likelihood decreases immediately from the starting point, and the EM steps taken by `GMM.fit` can make things worse. If I remove the convergence conditions for these models and just run until `maxiter` is reached, the EM steps always result in a lower `logL`. I was under the impression that EM is supposed to guarantee that the likelihood never decreases.
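For reference, the monotonicity guarantee is easy to demonstrate on complete (untruncated) data. This is a minimal numpy sketch of EM for a 1-D two-component mixture, not the library's actual implementation; it asserts at every step that the log-likelihood never decreases:

```python
import numpy as np

rng = np.random.default_rng(0)
# Two well-separated 1-D Gaussian components, fully observed (no truncation).
x = np.concatenate([rng.normal(-2, 1, 200), rng.normal(3, 1, 200)])

# Crude starting point.
w = np.array([0.5, 0.5])    # mixture weights
mu = np.array([-1.0, 1.0])  # means
var = np.array([1.0, 1.0])  # variances

def log_likelihood(x, w, mu, var):
    # log p(x) = log sum_k w_k N(x | mu_k, var_k), summed over samples
    comp = -0.5 * ((x[:, None] - mu) ** 2 / var + np.log(2 * np.pi * var))
    return np.logaddexp.reduce(np.log(w) + comp, axis=1).sum()

prev = -np.inf
for step in range(50):
    # E-step: posterior responsibilities r[n, k]
    comp = -0.5 * ((x[:, None] - mu) ** 2 / var + np.log(2 * np.pi * var))
    logr = np.log(w) + comp
    logr -= np.logaddexp.reduce(logr, axis=1, keepdims=True)
    r = np.exp(logr)
    # M-step: weighted maximum-likelihood updates
    nk = r.sum(axis=0)
    w = nk / len(x)
    mu = (r * x[:, None]).sum(axis=0) / nk
    var = (r * (x[:, None] - mu) ** 2) .sum(axis=0) / nk
    cur = log_likelihood(x, w, mu, var)
    assert cur >= prev - 1e-9, "EM decreased the log-likelihood!"
    prev = cur
```

On complete data the assertion never fires; the guarantee only breaks down once the E-step has to account for the truncated (unobserved) part of the sample.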
For the model I use here (where green is the ground truth):
- If I keep the convergence checks, the model doesn't move away from the k-means result (i.e. no steps are taken, since the `logL` decreases). This is shown in black.
- If I remove the convergence checks but set `maxiter=300`, then I get something that appears reasonable (shown in red).
- But this is arbitrary: if I allow it to continue, it degrades into rubbish (the result after 700 more steps is shown in blue).
So EM appears not to work for this model: the likelihood always decreases, even though I initialised it with the k-means estimate based on all of the data, not just the observed sample.
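To be concrete about the starting point, this is roughly the kind of k-means seeding I mean (a hypothetical sketch with a tiny Lloyd's iteration in numpy, run on the complete sample; the actual `GMM.fit` initialisation may differ):

```python
import numpy as np

rng = np.random.default_rng(1)
# The complete (untruncated) sample the seeding is based on.
x = np.concatenate([rng.normal(-2, 1, 200), rng.normal(3, 1, 200)])

# Tiny Lloyd's k-means with k=2, initialised at the data extremes.
centers = np.array([x.min(), x.max()])
for _ in range(20):
    labels = np.abs(x[:, None] - centers).argmin(axis=1)
    centers = np.array([x[labels == k].mean() for k in range(2)])

# Seed the GMM from the clustering: per-cluster weight, mean, and variance.
w0 = np.bincount(labels, minlength=2) / len(x)
mu0 = centers
var0 = np.array([x[labels == k].var() for k in range(2)])
```

Since this seed is computed from all of the data, it should already be close to the ground truth; the surprise is that EM then walks away from it.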
This doesn't happen for all models. If I use a less aggressive truncation, then the model stabilises as it should:
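By "aggressive" I mean a cut that removes a large fraction of the probability mass. A small sketch (illustrative numbers, not my actual model) of why that matters — under a mild cut the surviving sample still looks like the component, while an aggressive cut badly biases the observed sample:

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(0.0, 1.0, 10000)  # one standard normal component

# Mild truncation: keep |x| < 3 — almost all the probability mass survives.
mild = x[np.abs(x) < 3]

# Aggressive truncation: keep only x > 0.5 — most of the mass is cut away,
# so the surviving sample's mean and variance no longer match the component.
aggressive = x[x > 0.5]

print(len(mild) / len(x))  # close to 1
print(aggressive.mean())   # biased well above the true mean of 0
```

With an aggressive cut the observed-data statistics disagree strongly with the complete-data k-means seed, which is exactly the regime where the fit misbehaves.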
================
I discovered this effect after implementing my own convergence detection technique to deal with problems such as the one I've seen in #11.
The log-likelihood decreases but then climbs back above its previous value, so the original convergence checks would terminate early.
My plan was to fix this with a gradient test in my `feature/convergence` branch (#12), which also includes tools to visualise what's happening and a backend to store the EMSteps. This method has fixed that problem, but it can't solve this one.
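The idea behind the gradient test can be sketched like this (a hypothetical stand-alone version, not the code on the branch): instead of stopping at the first decrease in `logL`, fit a straight line to the last few values and stop only once the fitted slope is negligible, so a temporary dip followed by recovery does not trigger termination.

```python
import numpy as np

def converged(logL_history, window=10, tol=1e-4):
    """Slope-based convergence test (illustrative sketch).

    Fits a line to the last `window` log-likelihood values and reports
    convergence only when the fitted slope has dropped below `tol`.
    A transient dip that recovers still yields a positive slope over
    the window, so it does not cause early termination."""
    if len(logL_history) < window:
        return False
    recent = np.asarray(logL_history[-window:], dtype=float)
    slope = np.polyfit(np.arange(window), recent, 1)[0]
    return slope < tol
```

This handles the dip-and-recover case from #11, but it can't help here: for the aggressively truncated model the slope is genuinely negative, so the test (correctly) reports that continuing doesn't improve anything.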
My model is here and produces the plots above.
Comments and suggestions would be really helpful!
Thanks