Lchuang closed this 1 year ago
Thanks for raising this bug! When I run the same notebook, I get the result shown below, which points in the direction of our interpretation. I know that k-means works with a random initialization, which is why we specified random_state = 4 in the fourth cell of the notebook when creating the k-means instance from scikit-learn. I am not sure whether the problem comes from there.
@ReneSteinmann, do you mind running the notebook as a check? I created a branch for working on this bug.
After running the series of notebooks a few times, I think it may actually be due to random_state not being set for FastICA in notebook 3_reduction.ipynb:
# Extract independent features
model = FastICA(n_components=10, whiten="unit-variance")
Once I set the random_state to a fixed value, the results became consistent.
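For anyone who wants to check this outside the tutorial, here is a minimal sketch of the fix on synthetic data. The data, the `reduce` helper, and the seed value 42 are assumptions made for illustration, not what the notebook uses; only the FastICA arguments `n_components=10` and `whiten="unit-variance"` come from the line quoted above.

```python
import numpy as np
from sklearn.decomposition import FastICA

# Synthetic stand-in for the notebook's features (assumption):
# 10 non-Gaussian sources mixed linearly into 20 observed features.
rng = np.random.default_rng(0)
sources = rng.laplace(size=(500, 10))
mixing = rng.normal(size=(10, 20))
X = sources @ mixing


def reduce(seed):
    """Hypothetical helper: fit FastICA with a fixed seed and return the features."""
    model = FastICA(n_components=10, whiten="unit-variance", random_state=seed)
    return model.fit_transform(X)


# Two independent fits with the same seed give the same independent components.
first = reduce(42)
second = reduce(42)
print(np.allclose(first, second))
```

Without `random_state`, the unmixing matrix is initialized differently on each run, so the components can come out in a different order or with flipped signs, which then changes what each downstream cluster looks like.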
Thank you for the hint about random_state for k-means! :D
Thanks for pointing out this bug! We did not notice that random_state was missing from the dimensionality reduction notebook. This should indeed work better now. Note that I changed how the contribution of each feature to each cluster is displayed: it now shows the absolute value of the centroid, which better exhibits the "absolute" contribution.
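A rough sketch of what taking the absolute value of the centroid amounts to, assuming the clusters come from scikit-learn's KMeans. The random features, the number of clusters, and the variable names are placeholders, not the notebook's actual values; only random_state = 4 is taken from the discussion above.

```python
import numpy as np
from sklearn.cluster import KMeans

# Placeholder for the independent features from the reduction step (assumption).
rng = np.random.default_rng(0)
features = rng.normal(size=(300, 10))

# random_state=4 as mentioned in this thread; n_clusters=7 is a guess.
model = KMeans(n_clusters=7, n_init=10, random_state=4)
model.fit(features)

# Each row is one cluster, each column one feature. The absolute value drops
# the sign, so large negative and large positive centroid coordinates both
# count as strong contributions.
contribution = np.abs(model.cluster_centers_)

# Index of the feature with the largest absolute contribution for each cluster.
dominant_feature = contribution.argmax(axis=1)
print(dominant_feature)
```

Reading the plot this way, a cluster is "constrained" by the features whose absolute centroid coordinate is largest. Note that the indices printed here are zero-based, which may differ from the numbering used in the notebook text.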
In tutorial notebook 5_clustering
However, based on the plot, it seems like cluster 4 is mostly constrained by features 1 or 3, and feature 4 seems to mostly constrain cluster 6. Was the order in the text reversed? Thank you. :)