Open bbengfort opened 6 years ago
Greetings! Can I use it via Anaconda? I can't install version 0.9 of yellowbrick in order to use InterclusterDistance. If I can't, please tell me whether there is another easy way to find the intercluster distance of scikit-learn's k-means. Thanks!
@jaywalkingbackwards we haven’t deployed v0.9 to anaconda yet. It is one of our highest priorities. I am not aware of a different way to find intercluster distance. Have you taken a look at our code? https://github.com/DistrictDataLabs/yellowbrick/blob/develop/yellowbrick/cluster/icdm.py
@bbengfort or @rebeccabilbro any comments?
@jaywalkingbackwards version 0.9 has been released to conda - if you update your Yellowbrick install you should have access to ICDM now!
The `InterclusterDistance` visualizer is our newest cluster visualization, and while it's been implemented completely, there are still a few updates I'd like to make to it:
Notes on colors
Right now the facecolor of the clusters is hard coded to `#2e719344` and the edgecolor of the clusters is hard coded to `#2e719399`. Note the `44` and `99` on the colors respectively; these set the opacity of the color. The edge is more opaque than the face of the cluster in order to allow better visibility of clusters that overlap.
I would like to support the user specifying a color for all clusters or a colormap/colors for each cluster, as well as the ability to specify the face opacity. If the user specifies these things, then we have to compute the relative alpha (opacity) for both the edge and the face to maintain the currently hardcoded behavior.
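One way to think about the "relative alpha" computation is to keep the ratio between the two hardcoded alpha bytes and apply it to whatever face opacity the user supplies. The sketch below is only an illustration of that idea: `hex_to_rgba` and `cluster_colors` are hypothetical helper names, not part of Yellowbrick, and clamping the edge alpha at 1.0 is an assumption about how fully opaque faces should be handled.

```python
# Hardcoded colors quoted in this issue: RGB 2e7193 with alpha bytes 44 and 99.
FACE_HEX = "#2e719344"
EDGE_HEX = "#2e719399"


def hex_to_rgba(value):
    """Convert an #RRGGBBAA string into a tuple of four floats in [0, 1]."""
    digits = value.lstrip("#")
    if len(digits) != 8:
        raise ValueError("expected an #RRGGBBAA color, got %r" % value)
    return tuple(int(digits[i:i + 2], 16) / 255 for i in range(0, 8, 2))


# Ratio of edge opacity to face opacity in the current defaults (0x99 / 0x44).
EDGE_TO_FACE = hex_to_rgba(EDGE_HEX)[3] / hex_to_rgba(FACE_HEX)[3]


def cluster_colors(rgb, face_alpha):
    """Hypothetical helper: return (facecolor, edgecolor) as RGBA tuples.

    The edge keeps the same RGB as the face but is made more opaque by the
    ratio found in the hardcoded defaults, clamped so it never exceeds 1.0.
    """
    if not 0.0 <= face_alpha <= 1.0:
        raise ValueError("face_alpha must be between 0 and 1")
    edge_alpha = min(1.0, face_alpha * EDGE_TO_FACE)
    r, g, b = rgb
    return (r, g, b, face_alpha), (r, g, b, edge_alpha)


# Reproduce the current defaults from the face color alone.
default_face = hex_to_rgba(FACE_HEX)
face, edge = cluster_colors(default_face[:3], default_face[3])
```

With the default face alpha this gives back the current edge alpha, so the existing look would be preserved when the user specifies nothing. A per-cluster colormap could call the same helper once per cluster color.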
Notes on supported algorithms
Right now we use the `cluster_centers_` attribute of the model to embed the centers into 2 dimensional space and the `labels_` attribute to score/size the clusters. Unfortunately, not all clustering algorithms have these attributes, so we need to extend the `cluster_centers_` property on the visualizer to either find a different attribute or to compute the cluster centers somehow. Below is a listing of various clustering algorithms and their attributes.
We would like to ensure support for the following clustering algorithms:
It would be great if we could find support for the following clustering algorithms, but it's not clear whether that's possible because there are no obvious centers or labels:
We already have support for the following clustering algorithms (using the `cluster_centers_` attribute for embedding and the `labels_` attribute for scoring):
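Returning to the estimators that lack a `cluster_centers_` attribute: one possible fallback is to compute a center for each label as the mean of the points assigned to it. The sketch below assumes the fitted estimator exposes `labels_` and that the training data `X` is still available. `find_cluster_centers` is a hypothetical helper, not the visualizer's actual implementation, and a per-label mean may not be a meaningful center for every algorithm.

```python
def find_cluster_centers(estimator, X):
    """Return cluster centers, preferring the estimator's own attribute.

    Hypothetical helper, not part of Yellowbrick: falls back to the
    per-label mean of the rows of X when cluster_centers_ is missing.
    """
    centers = getattr(estimator, "cluster_centers_", None)
    if centers is not None:
        return centers

    sums, counts = {}, {}
    for label, row in zip(estimator.labels_, X):
        label = int(label)
        if label == -1:
            # Assumption: -1 marks noise (as DBSCAN does), so it is skipped.
            continue
        if label not in sums:
            sums[label] = [0.0] * len(row)
            counts[label] = 0
        sums[label] = [s + float(v) for s, v in zip(sums[label], row)]
        counts[label] += 1

    # One center per label, ordered by label value.
    return [[s / counts[k] for s in sums[k]] for k in sorted(sums)]


class FakeEstimator:
    """Stand-in for a fitted clusterer that only exposes labels_."""
    labels_ = [0, 0, 1, -1]


X = [[0.0, 0.0], [2.0, 2.0], [5.0, 5.0], [100.0, 100.0]]
centers = find_cluster_centers(FakeEstimator(), X)
```

The helper accepts plain lists or array-like rows, and it returns the estimator's own centers untouched whenever they exist, so already supported algorithms would behave as before.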