DistrictDataLabs / cultivar

Multidimensional data explorer and visualization tool.
http://trinket.districtdatalabs.com
Apache License 2.0
52 stars 18 forks

Implement Hierarchical Clustering #29

Closed DataFighter closed 8 years ago

DataFighter commented 8 years ago

We need some implementation of clustering that we can use to build other features on.

Hierarchical clustering is useful because it is well suited to brushing, zooming, and filtering.

rebeccabilbro commented 8 years ago

Use the agglomerative clustering implementation in scikit-learn (`sklearn.cluster.AgglomerativeClustering`).
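
For reference, a minimal sketch of what that could look like. The synthetic data from `make_blobs`, the choice of three clusters, and Ward linkage are all assumptions for illustration; nothing in this thread fixes them.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering
from sklearn.datasets import make_blobs

# Stand-in data (assumption): three well-separated groups of 2-D points.
X, _ = make_blobs(
    n_samples=60,
    centers=[[0, 0], [10, 10], [-10, 10]],
    cluster_std=0.5,
    random_state=0,
)

# Bottom-up (agglomerative) clustering; n_clusters and linkage are illustrative choices.
model = AgglomerativeClustering(n_clusters=3, linkage="ward")
labels = model.fit_predict(X)

# children_ holds the merge tree: one row per merge, n_samples - 1 rows in total.
print(np.bincount(labels))
print(model.children_.shape)
```

The merge tree in `children_` is the part that would likely matter for brushing and zooming, since it allows moving between coarser and finer groupings without refitting.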

rebeccabilbro commented 8 years ago

For the distance function, start with a general distance metric and optimize from there. Calculate the minimum average distance or the highest average distance, then use simulated annealing to find the optimal distance.
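
One possible reading of this, as a rough sketch only: treat the "distance" as the height at which the merge tree is cut, and let simulated annealing search for the cut that scores best. Everything specific here is an assumption rather than something decided in this thread: the silhouette score as the objective (it combines mean intra-cluster and mean nearest-cluster distance), SciPy's `linkage`/`fcluster` in place of scikit-learn so the tree is built only once, the linear cooling schedule, and the step size. The `score` and `anneal` helpers are hypothetical names.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Stand-in data (assumption): three well-separated groups of 2-D points.
X, _ = make_blobs(
    n_samples=60,
    centers=[[0, 0], [10, 10], [-10, 10]],
    cluster_std=0.5,
    random_state=0,
)
Z = linkage(X, method="average")  # merge tree, built once and cut many times


def score(threshold):
    """Silhouette score of the flat clustering obtained by cutting the tree at threshold."""
    labels = fcluster(Z, t=threshold, criterion="distance")
    n_clusters = len(np.unique(labels))
    if n_clusters < 2 or n_clusters >= len(X):
        return -1.0  # silhouette is undefined here, so treat it as the worst case
    return silhouette_score(X, labels)


def anneal(low, high, steps=300, start_temp=1.0, seed=0):
    """Simulated annealing over a single cut distance in [low, high]."""
    rng = np.random.default_rng(seed)
    current = rng.uniform(low, high)
    current_score = score(current)
    best, best_score = current, current_score
    for step in range(steps):
        temp = start_temp * (1 - step / steps) + 1e-9  # linear cooling (assumption)
        proposal = current + rng.normal(scale=0.1 * (high - low))
        candidate = float(np.clip(proposal, low, high))
        candidate_score = score(candidate)
        delta = candidate_score - current_score
        # Always accept improvements; accept worse moves with probability exp(delta / temp).
        if delta >= 0 or rng.random() < np.exp(delta / temp):
            current, current_score = candidate, candidate_score
        if current_score > best_score:
            best, best_score = current, current_score
    return best, best_score


# Search between zero and the height of the final merge in the tree.
best_threshold, best_score = anneal(low=0.0, high=Z[-1, 2])
print(best_threshold, best_score)
```

Because the objective is piecewise constant in the cut distance, a plain scan over the merge heights in `Z[:, 2]` would find the same optimum; annealing only starts to pay off if more parameters (for example, weights inside the distance function) are searched at the same time.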