mlampros / ClusterR

Gaussian mixture models, k-means, mini-batch-kmeans and k-medoids clustering
https://mlampros.github.io/ClusterR/

Optimal_Clusters_GMM warning number of columns #45

Closed FMKerckhof closed 1 year ago

FMKerckhof commented 1 year ago

Not sure if this is a bug or intended behavior, but when using ClusterR::Optimal_Clusters_GMM I get the warning "the number of columns of the data should be larger than 'max_clusters'", triggered by: https://github.com/mlampros/ClusterR/blob/d9ff519625c62ee8d44910f52dfa9023af3d6b1a/R/clustering_functions.R#L248

However, from the examples I would assume we are trying to cluster observations rather than parameters? Hence shouldn't the number of rows be used to trigger this warning?

mlampros commented 1 year ago

@FMKerckhof yes, that's true. According to the Armadillo documentation:

"The k parameter indicates the number of centroids; the number of samples in the data matrix should be much larger than k"

Give me a few days and I'll fix both the code and the documentation of the package.

mlampros commented 1 year ago

I just updated the code; it should now show the warning if the number of clusters is bigger than the number of observations. I'll close the issue for now; feel free to re-open it if the code does not work as expected.
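
For readers who want to see what a row-based check looks like, here is a minimal sketch in base R. It is not the actual ClusterR source: the helper name `check_max_clusters` and the exact warning text are assumptions made for illustration. It also assumes the data layout discussed above, with observations in rows and variables in columns.

```r
# Hypothetical helper, NOT the actual ClusterR code: warn when the requested
# maximum number of clusters exceeds the number of observations (rows).
check_max_clusters <- function(data, max_clusters) {
  if (!is.matrix(data) && !is.data.frame(data)) {
    stop("'data' should be a matrix or a data frame")
  }
  n_obs <- nrow(data)  # observations are assumed to be stored in rows
  if (max_clusters > n_obs) {
    # Assumed wording, modelled on the warning quoted in this issue
    warning("the number of rows of the data should be larger than 'max_clusters'")
    return(FALSE)
  }
  TRUE
}

set.seed(1)
dat <- matrix(rnorm(50 * 3), nrow = 50, ncol = 3)  # 50 observations, 3 variables

# 10 clusters is more than the 3 columns but fewer than the 50 rows,
# so a row-based check stays silent here.
check_max_clusters(dat, max_clusters = 10)
```

With a column-based check, the same call would warn, because 10 is greater than the 3 columns, even though 50 observations are available for clustering.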