hyunsooseol / snowCluster

This module allows users to analyze k-means & hierarchical clustering, and visualize results of Principal Component, Correspondence Analysis, Discriminant analysis, Decision tree, Multidimensional scaling, Multiple Factor Analysis, Machine learning, and Prophet analysis.
http://www.sthda.com/english/wiki/factoextra-r-package-easy-multivariate-data-analyses-and-elegant-visualization

more cluster centers than distinct data points with Jamovi and Snow Cluster K-means #16

Closed by swahull 8 months ago

swahull commented 1 year ago

This is a request for advice on how to overcome an issue I am having with Snow Cluster K-means clustering. I am working with a dataset of 5800+ stores, on which I previously ran a PCA that distilled the data down to 4 components, representing Population, Income, Store Size, and State/Climate. I am also including Sales Volume as a further variable in the clustering mix. The component values were saved to the store data.

There are 2 issues. First, the optimal number of clusters always seems to be 1, which makes no sense given the diversity of stores. The main issue, however, is that I get the error "more cluster centers than distinct data points" whenever the number of clusters selected exceeds the number of variables entered (5): it works fine with 2, 3, 4, or 5 clusters, but if I enter 6 for the number of clusters, I immediately get this error. Please advise on what might be causing this problem so I can fix it. Source data is available on request.
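The error text matches the message R's `kmeans` gives when the requested number of centers exceeds the number of distinct rows it receives. A minimal sketch of that condition in Python follows, as a way to check exported data outside jamovi. The function names and the toy `stores` rows are hypothetical, not part of snowCluster, and the sketch assumes each row is one store with the 4 component scores plus Sales Volume.

```python
from typing import Iterable, Sequence


def count_distinct_rows(rows: Iterable[Sequence[float]]) -> int:
    """Return the number of distinct observations (rows) in the data."""
    # Rows are converted to tuples so they can be stored in a set.
    return len({tuple(row) for row in rows})


def can_fit_k_clusters(rows: Iterable[Sequence[float]], k: int) -> bool:
    """True if k does not exceed the number of distinct rows."""
    return k <= count_distinct_rows(rows)


# Hypothetical toy data: 4 rows, of which only 3 are distinct.
stores = [
    (0.1, 1.2, -0.3, 0.5, 100.0),
    (0.1, 1.2, -0.3, 0.5, 100.0),  # exact duplicate of the first row
    (1.0, 0.2, 0.3, -0.5, 250.0),
    (-0.7, 0.9, 1.3, 0.0, 80.0),
]

distinct = count_distinct_rows(stores)
ok_for_3 = can_fit_k_clusters(stores, 3)
ok_for_4 = can_fit_k_clusters(stores, 4)
```

If a check like this reports thousands of distinct rows for the exported store data, the limit of 5 may come from how the data reaches the clustering step (for example, rows and columns being swapped) rather than from the data itself. That is only a guess based on the threshold matching the number of variables.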