joshlk / k-means-constrained

K-Means clustering - constrained with minimum and maximum cluster size. Documentation: https://joshlk.github.io/k-means-constrained
https://github.com/joshlk/k-means-constrained
BSD 3-Clause "New" or "Revised" License

Fitting the k-means-constrained on training samples and predicting on test samples raises error #19

Closed bweill555 closed 2 years ago

bweill555 commented 2 years ago

Hi, I'm trying to fit k-means-constrained on training samples and then use it to predict on test samples. I get the following error message:

~\anaconda3\lib\site-packages\k_means_constrained\k_means_constrained_.py in predict(self, X, size_min, size_max)
    708             raise ValueError("size_max must be larger than size_min")
    709         if size_min * n_clusters > n_samples:
--> 710             raise ValueError("The product of size_min and n_clusters cannot exceed the number of samples (X)")
    711
    712         labels, inertia = \

ValueError: The product of size_min and n_clusters cannot exceed the number of samples (X)
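For readers following along, the condition that triggers this error can be checked by hand. The sketch below only mirrors the comparison visible in the traceback (`size_min * n_clusters > n_samples`); the function name and the numbers are hypothetical and are not taken from the library or from this issue.

```python
def violates_size_min(n_samples, n_clusters, size_min):
    """Mirror of the comparison shown in the traceback (hypothetical helper)."""
    return size_min * n_clusters > n_samples


# Hypothetical numbers: a large training set and a small test set.
print(violates_size_min(n_samples=1000, n_clusters=10, size_min=50))  # False
print(violates_size_min(n_samples=100, n_clusters=10, size_min=50))   # True
```

With these illustrative values the training set passes the check (500 is not greater than 1000), while the smaller test set fails it (500 is greater than 100), which matches the behaviour described here.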

It seems there is not enough data in the test sample to meet the cluster size constraints (here size_min). Is there a way to apply the cluster size constraints only during fitting and not during prediction?
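One possible workaround, not confirmed anywhere in this thread, is to skip predict for the test data and assign each test sample to its nearest fitted center. The sketch below assumes the centers come from the fitted model's `cluster_centers_` attribute; the helper name `assign_to_nearest_center` and the sample values are hypothetical, and the hard-coded centers only stand in for a real fitted model. The traceback shows that predict also accepts size_min and size_max arguments, but whether overriding them there avoids the check may depend on the library version, so this sketch avoids predict entirely.

```python
import math


def assign_to_nearest_center(samples, centers):
    """Return, for each sample, the index of the closest center (Euclidean distance)."""
    labels = []
    for sample in samples:
        distances = [math.dist(sample, center) for center in centers]
        labels.append(distances.index(min(distances)))
    return labels


# Hypothetical stand-in for the centers of a fitted model
# (assumption: in practice these would be model.cluster_centers_).
centers = [[0.0, 0.0], [10.0, 10.0]]
test_samples = [[1.0, 2.0], [9.0, 8.0], [0.5, 0.5]]

labels = assign_to_nearest_center(test_samples, centers)
print(labels)  # [0, 1, 0]
```

This applies no size constraints at all to the test assignments, so the resulting clusters on the test set can be arbitrarily unbalanced.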

joshlk commented 2 years ago

Hey, thanks for using k-means-constrained! Can you show me the full code example of what you are doing and the shapes of the input data?

joshlk commented 2 years ago

I’m going to close the issue due to inactivity. @bweill555 feel free to reopen if you are still having issues.