Closed czhang03 closed 9 years ago
@mleblanc321 what do you think?
Show me the written description; that is what I want. I encourage you to use the material I sent, including the ranges of strength of cohesion.
The scipy documentation is not really helpful, but you might get some help from the source code of fcluster, which shows how the clusters are formed: https://github.com/scipy/scipy/blob/v0.15.1/scipy/cluster/hierarchy.py#L1446
I have seen the source code. It is basically just calling other functions, and if you trace those functions, you will find that they are just calling other functions too...
It is not that helpful either.
If you want me to talk about silhouette_score I can do that, but for the flat clusters I am not sure we can explain them clearly. We could copy and paste the scipy documentation there, but that will not help the user understand it at all.
And the main problem is that silhouette_score has no meaning unless you know how the clusters are divided.
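To illustrate that point, here is a minimal sketch, not our actual code. It assumes `silhouette_score` means `sklearn.metrics.silhouette_score` and that the flat clusters come from scipy's `fcluster`; the five 2-D points are made up. The same data gets a different score for each way of dividing it, so the number alone says little:

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from sklearn.metrics import silhouette_score

# Made-up data: two tight pairs and one isolated point.
points = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.3, 5.0], [10.0, 0.0]])

# Build the dendrogram once (average linkage is an arbitrary choice here).
Z = linkage(points, method="average")

scores = {}
for k in (2, 3):
    # Ask fcluster for at most k flat clusters.
    labels = fcluster(Z, t=k, criterion="maxclust")
    scores[k] = silhouette_score(points, labels)
    print(k, labels.tolist(), round(float(scores[k]), 3))
```

Printing the labels next to each score is what makes the score interpretable.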
Austin and I have spent a long time on that. The scipy documentation is terrible: there are variables for which we cannot find any reference. For now we can print from the back-end how the clusters are divided, but we cannot figure out its behaviour. It seems unwilling to put some of the segments into different clusters no matter what settings you give it.
I think for now we can show the user how the clusters are divided, so the user knows what they are looking at.
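Something like the following could do that. It is only a sketch: `describe_clusters`, the segment names, and the sample points are hypothetical, and `criterion="maxclust"` with average linkage is an assumption, not necessarily the setting we use now.

```python
from collections import defaultdict

import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage


def describe_clusters(points, names, t, criterion="maxclust", method="average"):
    """Return {cluster_id: [segment names]} for a flat clustering (hypothetical helper)."""
    Z = linkage(points, method=method)
    labels = fcluster(Z, t=t, criterion=criterion)
    groups = defaultdict(list)
    for name, label in zip(names, labels):
        groups[int(label)].append(name)
    return dict(groups)


# Made-up segments: two tight pairs and one isolated point.
points = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.3, 5.0], [10.0, 0.0]])
names = ["seg1", "seg2", "seg3", "seg4", "seg5"]

groups = describe_clusters(points, names, t=3)
for cluster_id, members in sorted(groups.items()):
    print(cluster_id, members)
```

Showing the membership next to the silhouette score lets the user see which segments ended up together.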
In the future, I think we can rewrite the silhouette score algorithm ourselves and let the user customize the clusters by choosing how deep they want to go into the dendrogram.
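One possible way to expose the depth, as a sketch only: scipy's `cut_tree` cuts a linkage matrix into a requested number of clusters. The data and the list of depths below are made up, and average linkage is an assumption.

```python
import numpy as np
from scipy.cluster.hierarchy import cut_tree, linkage

# Made-up data: two tight pairs and one isolated point.
points = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.3, 5.0], [10.0, 0.0]])
Z = linkage(points, method="average")

# Each requested cluster count becomes one column of the result.
depths = [2, 3, 4]
assignments = cut_tree(Z, n_clusters=depths)

for column, k in enumerate(depths):
    print(k, assignments[:, column].tolist())
```

The user would pick one of the depths, and the matching column of labels could then be scored.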
The silhouette_score algorithm is not hard to write anyway.
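A rough sketch of what our own version could look like, in plain Python with Euclidean distance. The function name `silhouette` is hypothetical. It uses the standard definition: for each point, a is the mean distance to the rest of its own cluster, b is the smallest mean distance to any other cluster, and s = (b - a) / max(a, b), with s = 0 for a point alone in its cluster.

```python
import math


def silhouette(points, labels):
    """Mean silhouette coefficient over all points (hypothetical helper)."""
    clusters = {}
    for index, label in enumerate(labels):
        clusters.setdefault(label, []).append(index)
    if not 2 <= len(clusters) <= len(points) - 1:
        raise ValueError("need between 2 and n - 1 clusters")

    def mean_dist(i, members):
        return sum(math.dist(points[i], points[j]) for j in members) / len(members)

    total = 0.0
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            continue  # a singleton cluster contributes s = 0
        a = mean_dist(i, own)
        b = min(mean_dist(i, m) for other, m in clusters.items() if other != label)
        denom = max(a, b)
        if denom > 0:  # identical points would give 0 / 0; treat as s = 0
            total += (b - a) / denom
    return total / len(points)


points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
score = silhouette(points, [0, 0, 1, 1])
print(round(score, 4))
```

Because the labels are passed in directly, this works with whatever cluster division the user chooses.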