geoschem / integrated_methane_inversion

Integrated Methane Inversion workflow repository.
https://imi.readthedocs.org
MIT License

Bugfix preventing generation of one massive cluster to fill the grid #178

Closed laestrada closed 8 months ago

laestrada commented 8 months ago

Name and Institution (Required)

Name: Lucas Estrada
Institution: Harvard ACMG

Describe the update

This PR fixes a bug that could occur at the final layer of clustering. Because k-means produces clusters of varying size, the clusters used to fill the grid occasionally ended up with an unexpectedly large number of state vector elements left to assign. The algorithm then assigned all excess elements to a single cluster, potentially creating a massive "monster" cluster.

This fix preempts that situation on the last layer of clustering: if the number of labels passed to k-means would exceed the number of clusters yet to be assigned, the algorithm instead distributes the final elements evenly.
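
As a rough illustration of the even-distribution idea (not the code in this PR), the sketch below splits the leftover state vector elements into nearly equal groups, one per remaining cluster. The function name `assign_final_layer` and its parameters are hypothetical, and splitting by index order ignores spatial proximity, which the actual implementation may well account for.

```python
import numpy as np


def assign_final_layer(element_ids, clusters_remaining, next_label=1):
    """Hypothetical helper: spread leftover elements evenly over the
    clusters that are still unassigned, instead of dumping them into one.

    Returns a dict mapping element id -> cluster label.
    """
    element_ids = np.asarray(element_ids)
    if clusters_remaining < 1:
        raise ValueError("clusters_remaining must be at least 1")

    # Never create more groups than there are elements to place.
    n_groups = min(clusters_remaining, element_ids.size)
    labels = {}
    if n_groups == 0:
        return labels

    # np.array_split yields groups whose sizes differ by at most one.
    for offset, group in enumerate(np.array_split(element_ids, n_groups)):
        for element in group:
            labels[int(element)] = next_label + offset
    return labels


# Example: 10 leftover elements, 3 clusters still to be assigned.
example_labels = assign_final_layer(range(10), clusters_remaining=3)
```

With this approach no single cluster can absorb all of the excess, because group sizes differ by at most one element.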