dm13450 / dirichletprocess

Build dirichletprocess objects for data analysis
https://dm13450.github.io/dirichletprocess/

constrain clusters to have common parameters? #17

Open tdhock opened 4 years ago

tdhock commented 4 years ago

Hi @dm13450, first of all thanks for the great JSS article / vignette about dirichletprocess, which is super helpful. I am using it to teach a CS class about unsupervised learning algorithms this semester. I especially like how the vignette explains how to implement your own mixture models (Poisson example). However, it was not clear whether it is possible to constrain a parameter to have a common value across clusters. For example, I would like to implement something similar to mclust::Mclust(modelNames="E"), which enforces equal variance in univariate Gaussian mixture models. Is that possible? I see that Likelihood.normal is defined as dnorm(x, theta[[1]], theta[[2]]), and I would like to instead use dnorm(x, theta[[1]], common_variance_param), where common_variance_param is shared by all clusters and is also inferred from the data.
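To make the request concrete, here is a minimal base R sketch of the equal-variance likelihood described above. The names likelihood_common_sd and mixture_density are hypothetical and are not part of dirichletprocess. Note that the third argument of dnorm is a standard deviation, so the shared parameter is written as common_sd here.

```r
# Hypothetical helper (not from dirichletprocess): the per-cluster likelihood
# with one standard deviation shared by every cluster.
# theta[[1]] holds the cluster mean(s); common_sd is a single value.
likelihood_common_sd <- function(x, theta, common_sd) {
  dnorm(x, mean = theta[[1]], sd = common_sd)
}

# Hypothetical helper: density of a finite Gaussian mixture whose components
# differ only in their means.
mixture_density <- function(x, weights, means, common_sd) {
  vapply(x, function(xi) sum(weights * dnorm(xi, means, common_sd)), numeric(1))
}

x <- c(-1, 0, 2)
dens <- mixture_density(x, weights = c(0.4, 0.6), means = c(0, 3), common_sd = 1)
```

Only the means vary between components, which is the same constraint that modelNames="E" imposes in mclust.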

dm13450 commented 4 years ago

Hi Toby, thanks for using the package and using it to teach!

Yes, it is possible with a few functions. I've just pushed the first half of the implementation to the master branch: the mixing distribution functions for a 1D Gaussian with fixed variance. To infer the variance parameter as well, you'll have to extend the fit function. You'll need an UpdateSigma function that can be called from within the Fit function.

There's a notebook here: http://dm13450.github.io/assets/DirichletProcessCommonVariance.Rmd that hopefully shows you where the UpdateSigma function should be called.
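For readers without the notebook to hand, the following is a rough base R sketch of what one such update could look like. It is not the notebook's UpdateSigma. It assumes an Inverse-Gamma(a0, b0) prior on the common variance, and the function name update_sigma_sketch and its arguments (y, labels, means, a0, b0) are hypothetical. Given the current cluster labels and cluster means, the conditional posterior of the common variance is again inverse-gamma, so a single Gibbs draw is enough.

```r
# Hypothetical helper, not part of dirichletprocess.
# Assumed prior: sigma^2 ~ Inverse-Gamma(a0, b0).
# Conditional posterior given labels and means:
#   sigma^2 ~ Inverse-Gamma(a0 + n / 2, b0 + sum((y - mean_of_cluster)^2) / 2)
update_sigma_sketch <- function(y, labels, means, a0 = 1, b0 = 1) {
  resid <- y - means[labels]           # residual of each point from its cluster mean
  shape <- a0 + length(y) / 2
  rate  <- b0 + sum(resid^2) / 2
  # If 1 / sigma^2 ~ Gamma(shape, rate) then sigma^2 ~ Inverse-Gamma(shape, rate)
  sqrt(1 / rgamma(1, shape = shape, rate = rate))  # return a standard deviation
}

# Simulated check: two clusters that share a true standard deviation of 0.5
set.seed(1)
labels <- sample(1:2, 2000, replace = TRUE)
means <- c(-2, 3)
y <- rnorm(2000, mean = means[labels], sd = 0.5)
sigma_draw <- update_sigma_sketch(y, labels, means)
```

In a full sampler this draw would run once per iteration, after the cluster labels and cluster means have been updated, and the new value would be used in the likelihood for the next iteration.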

Hope this helps, and please reach out if you need more help.