AdvancedPhotonSource / tike

Repository for ptychography software
http://tike.readthedocs.io

NEW: Implement compact clustering methods for mini-batch selection #180

Closed: carterbox closed this 2 years ago

carterbox commented 2 years ago

Purpose

Related to #145. Implements an algorithm for compact batch selection.

Approach

Uses a modified k-means clustering algorithm that constrains the clusters to approximately equal sizes. It initializes the clusters with k-means++, then cycles through the points, swapping them between clusters whenever a swap reduces the total distance from the points to their cluster centroids. This swapping heuristic does not strictly minimize the k-means objective, but it does a good job of creating clusters without enclaves.
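
For readers who want a concrete picture of the approach, here is a minimal NumPy sketch of the idea: k-means++ seeding, a capacity-limited greedy assignment, then pairwise swaps. It is an illustration only, not the code in this PR. The function names `kmeans_pp_init` and `compact_clusters`, the greedy initial assignment, and the `max_passes` parameter are all assumptions made for the example. It also assumes the points are not all identical and that `k` is no larger than the number of points.

```python
import numpy as np


def kmeans_pp_init(points, k, rng):
    """Pick k initial centroids using k-means++ seeding (hypothetical helper)."""
    centroids = [points[rng.integers(len(points))]]
    for _ in range(1, k):
        # Squared distance from each point to its nearest chosen centroid.
        d2 = np.min(
            [np.sum((points - c) ** 2, axis=1) for c in centroids], axis=0)
        # Sample the next centroid with probability proportional to d2.
        centroids.append(points[rng.choice(len(points), p=d2 / d2.sum())])
    return np.array(centroids)


def compact_clusters(points, k, max_passes=10, seed=0):
    """Label points with k clusters whose sizes differ by at most one."""
    rng = np.random.default_rng(seed)
    n = len(points)
    centroids = kmeans_pp_init(points, k, rng)

    # Remaining capacity of each cluster; the first n % k get one extra point.
    capacity = np.full(k, n // k)
    capacity[: n % k] += 1

    # Greedy start (an assumption of this sketch): visit points from closest
    # to farthest and give each its nearest centroid that still has room.
    dist = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    labels = np.full(n, -1)
    for i in np.argsort(dist.min(axis=1)):
        for c in np.argsort(dist[i]):
            if capacity[c] > 0:
                labels[i] = c
                capacity[c] -= 1
                break

    for _ in range(max_passes):
        # Recompute centroids, then try swapping every pair of points.
        centroids = np.array(
            [points[labels == c].mean(axis=0) for c in range(k)])
        dist = np.linalg.norm(
            points[:, None, :] - centroids[None, :, :], axis=2)
        swapped = False
        for i in range(n):
            for j in range(i + 1, n):
                a, b = labels[i], labels[j]
                # Swap only if it lowers the summed point-to-centroid distance.
                if a != b and dist[i, b] + dist[j, a] < dist[i, a] + dist[j, b]:
                    labels[i], labels[j] = b, a
                    swapped = True
        if not swapped:
            break
    return labels
```

Calling `compact_clusters(points, k)` on an `(n, 2)` array of scan positions returns one integer label per point. Because points only ever trade places, the cluster sizes set by the initial assignment are preserved through every pass. The pairwise loop is quadratic in the number of points, which keeps the sketch short but would likely need a smarter candidate search for large scans.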

Pre-Merge Checklists

Submitter

Reviewer

pep8speaks commented 2 years ago

Hello @carterbox! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! :beers:

Comment last updated at 2021-12-01 17:13:15 UTC

stevehenke commented 2 years ago

This feature introduces a clustering approach to mini-batch selection. The code is clean and tests are passing. Thank you for your contribution!