Open shouldsee opened 6 years ago
@guofei9987 Do you fancy some GMM in your scikit-opt?
@shouldsee How about taking a look here: http://scikit-learn.org/stable/modules/generated/sklearn.mixture.GaussianMixture.html
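@guofei9987 I'm already using sklearn. What I'm mainly thinking about now is the redundancy between mixture components. I'd like to handle it with hierarchical clustering, but I haven't found a good metric to use.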
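import numpy as np
from scipy.spatial import distance

def KLD_cov(u_cov, v_cov):
    # Covariance part of KL(u || v) for diagonal Gaussians, without the -d/2 constant.
    # The log-determinant of a diagonal covariance is the sum of the log variances.
    out = (- np.sum(np.log(u_cov)) + np.sum(np.log(v_cov)) + np.sum(u_cov / v_cov)) / 2
    return out

def KLD_mu(diff_mu, v_cov):
    # Mean part of KL(u || v): half the squared Mahalanobis distance under v.
    out = np.sum(diff_mu ** 2 / v_cov) / 2
    return out

def KLD_both(u, v):
    # u, v: 1-D vectors laid out as [diagonal covariance, mean].
    L = len(u)
    d = L // 2
    u_cov = u[:d]
    v_cov = v[:d]
    u_mu = u[d:]
    v_mu = v[d:]
    diff_mu = v_mu - u_mu
    # Symmetrised KL: KL(u || v) + KL(v || u); it is 0 when u == v.
    out = 0
    out += KLD_cov(u_cov, v_cov) - d / 2
    out += KLD_cov(v_cov, u_cov) - d / 2
    out += KLD_mu(diff_mu, v_cov)
    out += KLD_mu(-diff_mu, u_cov)
    return out

if 1:
    # dpgmm is assumed to be an already fitted sklearn mixture with diagonal
    # covariances, so that covariances_ has shape (n_components, n_features).
    sort_idx = np.argsort(dpgmm.means_[:, 0])
    COV = dpgmm.covariances_[sort_idx, :]
    MEAN = dpgmm.means_[sort_idx, :]
    obs = np.hstack([COV, MEAN])
    D = distance.pdist(obs, KLD_both)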
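For the hierarchical clustering step, here is a rough sketch of how a condensed distance vector like `D` could be fed to SciPy. The component parameters are made up for illustration, `sym_kl_diag` is a hypothetical helper name, and both the `average` linkage and the cut threshold `t=1.0` are arbitrary choices, not recommendations.

```python
import numpy as np
from scipy.cluster import hierarchy
from scipy.spatial import distance


def sym_kl_diag(u, v):
    """Symmetrised KL between two diagonal Gaussians packed as [variances, means]."""
    d = len(u) // 2
    u_var, u_mu = u[:d], u[d:]
    v_var, v_mu = v[:d], v[d:]
    diff2 = (u_mu - v_mu) ** 2
    # KL(u || v) + KL(v || u); the log-determinant terms cancel in the sum.
    return 0.5 * np.sum(u_var / v_var + v_var / u_var - 2.0
                        + diff2 / v_var + diff2 / u_var)


# Hypothetical fitted parameters: four 2-D components, the first two nearly identical.
variances = np.array([[1.0, 1.0], [1.1, 0.9], [1.0, 1.0], [0.5, 2.0]])
means = np.array([[0.0, 0.0], [0.1, -0.1], [5.0, 5.0], [-6.0, 4.0]])
obs = np.hstack([variances, means])

D = distance.pdist(obs, sym_kl_diag)        # condensed pairwise distances
Z = hierarchy.linkage(D, method="average")  # agglomerative clustering on D
labels = hierarchy.fcluster(Z, t=1.0, criterion="distance")
print(labels)  # components sharing a label are candidates for merging
```

`hierarchy.dendrogram(Z)` would draw the corresponding tree if matplotlib is available, which may be the simplest way to visualise the redundancy.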
APOLOGY: I don't have Chinese input on this computer.
Motivation
Given the popularity of Gaussian mixture models (GMMs), it would be handy to have some tools to visualise the redundancy between the fitted Gaussians.
Objective
Implement some common f-divergence measures between multivariate Gaussians, each given as N(\mu, \Sigma).
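As one concrete instance, the KL divergence between two multivariate Gaussians has a closed form: KL(N_0 || N_1) = 1/2 [ tr(\Sigma_1^{-1} \Sigma_0) + (\mu_1 - \mu_0)^T \Sigma_1^{-1} (\mu_1 - \mu_0) - d + ln(det \Sigma_1 / det \Sigma_0) ]. A minimal sketch with full covariance matrices might look like the following. The function names `kl_mvn` and `jeffreys_mvn` are hypothetical and not part of any existing library.

```python
import numpy as np


def kl_mvn(mu0, cov0, mu1, cov1):
    """KL( N(mu0, cov0) || N(mu1, cov1) ) for full covariance matrices."""
    d = mu0.shape[0]
    diff = mu1 - mu0
    # Solve linear systems instead of forming an explicit inverse of cov1.
    trace_term = np.trace(np.linalg.solve(cov1, cov0))
    maha_term = diff @ np.linalg.solve(cov1, diff)
    # slogdet returns (sign, log|det|); covariances are assumed positive definite.
    _, logdet0 = np.linalg.slogdet(cov0)
    _, logdet1 = np.linalg.slogdet(cov1)
    return 0.5 * (trace_term + maha_term - d + logdet1 - logdet0)


def jeffreys_mvn(mu0, cov0, mu1, cov1):
    """Symmetrised KL, usable as a dissimilarity between components."""
    return kl_mvn(mu0, cov0, mu1, cov1) + kl_mvn(mu1, cov1, mu0, cov0)
```

KL itself is asymmetric, so a symmetrised variant such as the sum above is usually the more convenient input for clustering. Note that it is still not a true metric, since the triangle inequality does not hold in general.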
Simplification
Quite often the covariance matrix \Sigma is restricted to its diagonal, so it would be preferable to have a fast version for these "diagonal Gaussians" that avoids expanding the full covariance matrix.
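For the diagonal case, the trace, determinant and quadratic terms all reduce to elementwise sums over the variances, so all pairs can be computed at once with broadcasting. The following is only a sketch under that assumption. `pairwise_sym_kl_diag` is a hypothetical name, and the inputs are assumed to be `(K, d)` arrays of means and strictly positive variances.

```python
import numpy as np


def pairwise_sym_kl_diag(means, variances):
    """Symmetrised KL between all pairs of K diagonal Gaussians.

    means, variances: arrays of shape (K, d). Returns a (K, K) matrix.
    """
    mu_i, mu_j = means[:, None, :], means[None, :, :]
    var_i, var_j = variances[:, None, :], variances[None, :, :]
    diff2 = (mu_i - mu_j) ** 2
    # Trace terms of KL(i || j) and KL(j || i), each minus 1 per dimension;
    # the log-determinant terms cancel in the symmetrised sum.
    ratio = var_i / var_j + var_j / var_i - 2.0
    # Squared mean differences scaled by each component's variances.
    maha = diff2 / var_i + diff2 / var_j
    return 0.5 * np.sum(ratio + maha, axis=-1)


# Two unit-variance 1-D Gaussians one unit apart.
M = pairwise_sym_kl_diag(np.array([[0.0], [1.0]]), np.array([[1.0], [1.0]]))
```

This allocates an intermediate of shape `(K, K, d)`, which should be fine for the modest number of components in a typical GMM but may need chunking for very large `K`.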
Comment
Please let me know if there is an existing implementation.