Loki asked me how to do vector quantization with hierarchical clustering, and
requested a code example. Below is my reply to his request. I have changed the
category from Type-Defect to Type-Enhancement.
Damian
------------------------------------------
I don't have much time right now, so this note will be quick. The
example code below generates 3 random Gaussian clusters of 100 points each,
randomly centered in a 10 by 10 unit region. The points are clustered with
centroid linkage, and the hierarchy is then cut into flat clusters with fcluster.
The members of each cluster are then used to compute the cluster centroids.
If you are trying to do vector quantization, you may find k-means
easier to work with. If you don't know the number of codes in the code
book (i.e. the number of clusters) a priori, you might try QT clustering
or mean shift with kernel density estimation. k-means is already in scipy,
and QT clustering and mean shift will be integrated into scipy-cluster this
summer.
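For the k-means route, a rough sketch using scipy.cluster.vq (assuming the
number of codes, 3 here, is known up front, and using made-up random data)
might look like this:

import numpy as np
from scipy.cluster.vq import whiten, kmeans, vq

X = np.random.rand(300, 2) * 10.0       # placeholder data
W = whiten(X)                           # scale each feature to unit variance
codebook, distortion = kmeans(W, 3)     # build a code book with 3 codes
codes, dists = vq(W, codebook)          # assign each point to its nearest code
print(codes[:10])

whiten is optional if the features are already on comparable scales.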
You might try using kd-trees to get better performance out of
membership lookups. The ANN scikit has a good implementation.
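For example (just a sketch, using scipy.spatial's cKDTree here rather than
the ANN scikit, and made-up data), mapping each point to its nearest centroid
could look roughly like this:

import numpy as np
from scipy.spatial import cKDTree

centroids = np.random.rand(3, 2) * 10.0   # placeholder code book
X = np.random.rand(1000, 2) * 10.0        # placeholder data
tree = cKDTree(centroids)                 # build a kd-tree over the code book
dist, code = tree.query(X, k=1)           # nearest-centroid index for each point
print(code[:10])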
I hope this helps.
Cheers,
Damian
import numpy as np
import matplotlib.pylab as mpl
import hcluster

nc = 3     # number of clusters
ppc = 100  # points per cluster

# generate nc gaussians of ppc points each, each shifted to a random
# center in a 10 by 10 unit region
X = np.random.randn(nc*ppc, 2) * 0.5
for i in xrange(0, nc):
    shift = np.random.rand(2) * 10.0
    print shift
    X[i*ppc:(i*ppc+ppc), :] += shift
# plot the gaussians
mpl.plot(X[:,0], X[:, 1], 'bo')
mpl.show()
# perform centroid linkage
Z = hcluster.linkage(X, 'centroid')
# flatten the hierarchy into flat clusters.
labels = hcluster.fcluster(Z, nc, 'maxclust') - 1
# print the labels returned.
print labels
centroids = np.zeros((nc, 2))
# for each cluster, compute its centroid based on the labels vector.
for i in xrange(0, nc):
    centroids[i, :] = X[labels == i].mean(axis=0)

# plot the gaussians again, with their centroids overlaid in red
mpl.plot(X[:,0], X[:, 1], 'bo')
mpl.plot(centroids[:, 0], centroids[:, 1], 'ro')
mpl.show()
Original comment by damian.e...@gmail.com on 23 May 2008 at 1:11
Original issue reported on code.google.com by loki.dav...@gmail.com on 21 May 2008 at 7:08