qinhanmin2014 opened 5 years ago
In notebook 03-unsupervised-learning
```python
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.datasets import make_moons

X, y = make_moons(n_samples=200, noise=0.05, random_state=0)
kmeans = KMeans(n_clusters=10, random_state=0)
kmeans.fit(X)
y_pred = kmeans.predict(X)
plt.scatter(X[:, 0], X[:, 1], c=y_pred, s=60, cmap='Paired')
plt.scatter(kmeans.cluster_centers_[:, 0], kmeans.cluster_centers_[:, 1], s=60,
            marker='^', c=range(kmeans.n_clusters), linewidth=2, cmap='Paired')
plt.xlabel("Feature 0")
plt.ylabel("Feature 1")
print("Cluster memberships:\n{}".format(y_pred))
```
The book only provides the transformed features and claims that we can now separate the two half-moons with a linear model:
```python
distance_features = kmeans.transform(X)
print("Distance feature shape: {}".format(distance_features.shape))
print("Distance features:\n{}".format(distance_features))
```
Maybe it's better to actually demonstrate how the features derived from k-means separate the two half-moons, e.g.:
```python
import numpy as np
from sklearn.linear_model import LogisticRegression

clf = LogisticRegression().fit(distance_features, y)
xx = np.linspace(X[:, 0].min() - 0.5, X[:, 0].max() + 0.5, 100)
yy = np.linspace(X[:, 1].min() - 0.5, X[:, 1].max() + 0.5, 100)
XX, YY = np.meshgrid(xx, yy)
X_grid = np.c_[XX.ravel(), YY.ravel()]
X_grid_kmeans = kmeans.transform(X_grid)
decision_values = clf.decision_function(X_grid_kmeans)
plt.figure()
plt.scatter(X[:, 0], X[:, 1], c=y, cmap='Paired')
plt.contour(XX, YY, decision_values.reshape(XX.shape), levels=[0])
plt.show()
```
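If it helps, a self-contained sketch of how one might quantify the comparison, scoring the same linear model on the raw coordinates and on the k-means distance features (the exact accuracy numbers will vary with the random seeds, so this is just an illustration):

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_moons
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_moons(n_samples=200, noise=0.05, random_state=0)
kmeans = KMeans(n_clusters=10, random_state=0).fit(X)
# One feature per cluster center: the distance from each point to that center.
distance_features = kmeans.transform(X)  # shape (200, 10)

# Linear model on the raw 2-D coordinates: the moons are not linearly separable.
acc_raw = cross_val_score(LogisticRegression(), X, y, cv=5).mean()
# Same linear model on the 10 distance features: close to perfect separation.
acc_kmeans = cross_val_score(LogisticRegression(), distance_features, y, cv=5).mean()
print("raw features: {:.2f}, kmeans distance features: {:.2f}".format(acc_raw, acc_kmeans))
```

Printing the two cross-validated accuracies side by side would make the book's claim concrete rather than leaving it to the reader to take on faith.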
Thanks, that might indeed be a useful addition for the next print.