kassambara / factoextra

Extract and Visualize the Results of Multivariate Data Analyses
http://www.sthda.com/english/rpkgs/factoextra

"fviz_dend" can not show all the samples in the cluster dendrogram #56

Open zhuxqdoctor opened 6 years ago

zhuxqdoctor commented 6 years ago

Hi, I am using the example data "USArrests" for hierarchical k-means clustering before visualizing my own data. However, "fviz_dend" does not seem to work correctly: it does not show the right number of sample names in some clusters of the tree. For example, with the "USArrests" example data, res.hk$size reports a cluster with 8 samples, but only 7 samples are shown in that cluster by fviz_dend. Can anyone figure this out? Thanks!

kassambara commented 6 years ago

Hi,

The algorithm of the hkmeans method is as follows:

  1. Compute hierarchical clustering
  2. Cut the tree into k clusters
  3. Compute the center (i.e., the mean) of each cluster
  4. Run k-means using the set of cluster centers (defined in step 3) as the initial cluster centers. This optimizes the clustering

This means that the final optimized partitioning obtained at step 4 might be different from the initial partitioning obtained at step 2.
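The four steps above can be sketched with base R alone. This is an illustrative sketch, not the factoextra source: the scaling of USArrests, k = 4, Euclidean distance, and Ward linkage ("ward.D2") are assumptions chosen for the example.

```r
# Sketch of the four steps using base R (stats) only.
# Assumptions: scaled USArrests, k = 4, Euclidean distance, Ward linkage.
df <- scale(USArrests)
k  <- 4

# Step 1: hierarchical clustering
hc <- hclust(dist(df, method = "euclidean"), method = "ward.D2")

# Step 2: cut the tree into k clusters (the initial partitioning)
initial <- cutree(hc, k = k)

# Step 3: center (mean of each variable) of every initial cluster
centers <- aggregate(df, by = list(cluster = initial), FUN = mean)[, -1]

# Step 4: k-means started from those centers (the final partitioning)
km    <- kmeans(df, centers = as.matrix(centers))
final <- km$cluster

# Rows: initial clusters, columns: final clusters.
# Any off-diagonal count is an observation that k-means moved.
table(initial = initial, final = final)
```

Because row i of `centers` seeds k-means cluster i, the labels of the two partitionings line up, so the cross-table can be read directly: if it is not diagonal, the cluster sizes from step 2 and step 4 differ.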

The fviz_dend() function shows the initial partitioning result (step 2). To see the final partitioning, consider mainly the plot created by the function fviz_cluster(). This has now been clarified in the documentation.
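As a rough check of this point, the sizes of the step 2 partitioning can be counted directly from the tree and the tree drawn with base R graphics. The sketch below uses only hclust(), cutree(), plot() and rect.hclust(); the scaled USArrests data, k = 4, and Ward linkage are assumptions for illustration, and it does not call fviz_dend() itself.

```r
# Base R sketch (stats/graphics only); k = 4 is an illustrative choice.
df <- scale(USArrests)
hc <- hclust(dist(df), method = "ward.D2")
k  <- 4

# Sizes of the step 2 partitioning: the counts a dendrogram cut at k reflects
initial_sizes <- table(cutree(hc, k = k))
initial_sizes

# Every observation is a leaf of the tree, regardless of the later k-means step
length(hc$labels) == nrow(df)

# Draw the tree and box the k initial clusters
plot(hc, cex = 0.6, hang = -1)
rect.hclust(hc, k = k, border = 2:5)
```

Comparing `initial_sizes` with the `size` element of the k-means result shows whether a mismatch comes from observations being reassigned in step 4 rather than from labels missing in the plot.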

zhuxqdoctor commented 6 years ago

Thanks for answering my question! Now I want to visualize all the samples in the cluster dendrogram obtained at the initial step 2. Do you have any suggestions for how I can visualize the clusters in a dendrogram instead of using fviz_cluster?