joshlk / k-means-constrained

K-Means clustering - constrained with minimum and maximum cluster size. Documentation: https://joshlk.github.io/k-means-constrained
https://github.com/joshlk/k-means-constrained
BSD 3-Clause "New" or "Revised" License

Is it possible to implement MiniBatchKmeansConstrained? #54

Open pengzhenghao opened 7 months ago

pengzhenghao commented 7 months ago

My dataset is so large that the algorithm takes a very long time to converge. Do you have any suggestions for my use case? Thanks!

joshlk commented 3 weeks ago

I would suggest pre-segmenting the data. For example, if each row is a person, segment on gender or location. Alternatively, you can run normal K-Means first to get a few big clusters and then run KMeansConstrained within each big cluster to get smaller ones.
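
A rough sketch of the second suggestion, in case it helps. This is not part of the library: `two_stage_cluster` and `constrained_fine` are hypothetical helper names, and using scikit-learn's `MiniBatchKMeans` for the coarse pass is an assumption (any fast clusterer would do). The rule for picking the number of fine clusters per segment is also just one possible choice.

```python
import math

import numpy as np
from sklearn.cluster import MiniBatchKMeans


def constrained_fine(X_sub, size_min, size_max, random_state=0):
    """Split one coarse segment into size-constrained clusters (hypothetical helper)."""
    # Imported lazily so two_stage_cluster can also be used with another clusterer.
    from k_means_constrained import KMeansConstrained

    n = len(X_sub)
    # Assumption: use the fewest clusters that can hold n points within size_max.
    k = math.ceil(n / size_max)
    if k * size_min > n:
        raise ValueError(f"segment of {n} points cannot satisfy the size bounds")
    model = KMeansConstrained(
        n_clusters=k, size_min=size_min, size_max=size_max, random_state=random_state
    )
    return model.fit_predict(X_sub)


def two_stage_cluster(X, n_coarse, fit_fine, random_state=0):
    """Coarse pass with MiniBatchKMeans, then fit_fine on each coarse segment."""
    X = np.asarray(X)
    coarse = MiniBatchKMeans(
        n_clusters=n_coarse, n_init=3, random_state=random_state
    ).fit_predict(X)

    labels = np.empty(len(X), dtype=int)
    offset = 0  # keeps fine labels unique across segments
    for c in np.unique(coarse):
        idx = np.flatnonzero(coarse == c)
        fine = np.asarray(fit_fine(X[idx]))
        labels[idx] = fine + offset
        offset += int(fine.max()) + 1
    return labels
```

It could be called as `two_stage_cluster(X, n_coarse=20, fit_fine=lambda Xs: constrained_fine(Xs, size_min=50, size_max=100))`, where the numbers are placeholders. Note that the size bounds then hold only within each coarse segment, and the coarse pass itself is unconstrained, so a segment that is too small for the bounds raises an error in this sketch. The result will generally differ from running KMeansConstrained on the full dataset.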