Create equal-sized clusters within kmodes

nicodv / kmodes

Python implementations of the k-modes and k-prototypes clustering algorithms, for clustering categorical data

MIT License

1.23k stars 416 forks

When using kmodes in Python, the algorithm chooses the best distribution of the data for a given number of clusters, regardless of the size of each cluster. For example, some clusters can have around 10,000 data points while others have around 300.

For my project, I'd like to find an equal-sized clustering of all my data, where each cluster is constrained to roughly the same number of data points.

I've found an algorithm for constraining group size with a k-means model, but since I'm working with categorical data, I need a solution for k-modes (or k-prototypes).

Is there any solution or workaround for getting equal-sized clusters?
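To make the question concrete, here is a minimal sketch of the kind of workaround in question: a post-processing step that takes already-fitted cluster modes and reassigns the points so that no cluster exceeds ceil(n / k) members. This is not part of kmodes. The helper names `hamming` and `balanced_assign` are hypothetical, the dissimilarity is assumed to be simple matching (Hamming) distance, and the modes are assumed to come from a fitted model (for example, the `cluster_centroids_` attribute of `KModes`). The greedy strategy is only a heuristic and does not guarantee an optimal assignment.

```python
def hamming(a, b):
    """Number of positions where two categorical records differ."""
    return sum(x != y for x, y in zip(a, b))


def balanced_assign(rows, modes):
    """Greedily assign each row to a close mode that still has capacity.

    Hypothetical helper, not a kmodes API. Every cluster is capped at
    ceil(n / k) members, so sizes are equal when k divides n.
    """
    k = len(modes)
    capacity = -(-len(rows) // k)  # ceil(n / k) using integer arithmetic

    # Every (distance, row index, cluster index) pair, cheapest first.
    candidates = sorted(
        (hamming(row, mode), i, c)
        for i, row in enumerate(rows)
        for c, mode in enumerate(modes)
    )

    labels = [None] * len(rows)
    sizes = [0] * k
    for _, i, c in candidates:
        # Take the pair only if the row is still free and the cluster has room.
        if labels[i] is None and sizes[c] < capacity:
            labels[i] = c
            sizes[c] += 1
    return labels


# Toy data; in practice `modes` would be taken from a fitted k-modes model.
rows = [("a", "x"), ("a", "x"), ("a", "y"), ("b", "y")]
modes = [("a", "x"), ("b", "y")]
labels = balanced_assign(rows, modes)
```

Because the candidate list holds n * k entries, this sketch is only practical for moderately sized data. For larger data sets, the same idea can be expressed as an assignment problem instead of a greedy pass.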

Create equal-sized clusters within kmodes #195