Wrong clustering

Hello, I have a set of house points, and I have laid out a line that brings glass fiber to each house. Now I would like to cluster the points so that I can assign a distributor to each cluster. Each cluster should contain at most 20 house points. As an example, I calculated an adjacency matrix based on the glass fiber line for a data set of 61 points. I cluster the points with this library, passing the precomputed adjacency matrix as input. However, I sometimes get a wrong clustering, as can be seen in the picture.

Here is my code, where "am" is the adjacency matrix of distances:

    from k_means_constrained import KMeansConstrained

    db = KMeansConstrained(n_clusters=4, size_max=20, random_state=0)
    result = db.fit_predict(am)
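As a sanity check on the parameters used here, the following is a minimal sketch of two helpers: one computes the smallest number of clusters that can hold all points under a maximum cluster size, and the other verifies that a label array respects that maximum. The names `min_clusters` and `sizes_ok` are hypothetical and not part of k-means-constrained; the sketch only assumes that the labels are an iterable of integers, such as the array returned by `fit_predict`.

```python
import math
from collections import Counter


def min_clusters(n_points: int, size_max: int) -> int:
    """Smallest number of clusters that can hold n_points with at most size_max each."""
    return math.ceil(n_points / size_max)


def sizes_ok(labels, size_max: int) -> bool:
    """Return True if no cluster label occurs more than size_max times."""
    counts = Counter(int(label) for label in labels)
    return all(count <= size_max for count in counts.values())


# 61 points with at most 20 per cluster need at least 4 clusters.
print(min_clusters(61, 20))
```

With 61 points and a maximum of 20 per cluster, `n_clusters = 4` is the smallest feasible value, and `sizes_ok(result, 20)` could be used to confirm that the returned labels respect the limit.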

In the picture, the black line is the glass fiber line on which the calculation is based, and the colored points are my house points as clustered by the algorithm. As you can see, the green and yellow clusters are not grouped well. I sometimes have the same issue with other data sets as well.

[Image: kmeans constrained]

I would appreciate any help in improving the result.

Versions:

Python: 3.9
Operating system: Windows
k-means-constrained: 0.7.2
numpy: 1.23.2
scipy: 1.9.1
ortools: 9.4.1874
joblib: 1.1.0
cython (if installed): not installed

Best regards, Mostafa

joshlk / k-means-constrained

Wrong clustering #34