exhuma / python-cluster

Simple clustering library for python.
GNU Lesser General Public License v2.1

[Patch] Topological output and minor changes #9

Closed exhuma closed 11 years ago

exhuma commented 11 years ago

Converted from SourceForge issue 1535137, submitted by ajaksu2

Hi Michel,

You have a pretty useful project here. I haven't read the clustering algorithms yet, but I have used it (with ecological similarity indexes among samples!). Hierarchical clustering makes it a LOT more useful, so thank you very much :)

This patch adds a topological output to Cluster (.topology()) and a corresponding method to BaseClusterMethod (.topo()). It also changes the behavior of .data() to avoid the .data()[0] idiom.
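As a rough illustration of what such a topological output could look like, the sketch below turns a nested cluster into plain nested tuples. The `Cluster` class here is a hypothetical stand-in written for this example (its `level` and `items` attributes are assumptions), not the code from the patch.

```python
class Cluster:
    """Hypothetical stand-in for a hierarchical cluster node."""

    def __init__(self, level, *items):
        self.level = level        # distance at which the items were merged
        self.items = list(items)  # leaves or nested Cluster instances

    def topology(self):
        """Return the cluster structure as nested tuples of leaves."""
        return tuple(
            item.topology() if isinstance(item, Cluster) else item
            for item in self.items
        )


tree = Cluster(
    0.8,
    Cluster(0.2, "a", "b"),
    Cluster(0.5, "c", Cluster(0.1, "d", "e")),
)
print(tree.topology())
```

The merge levels are dropped on purpose, so only the shape of the tree remains.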

Feeding the output of .topo() to http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/139422 gives (e.g.):

[ASCII-art tree of the clusters; the layout did not survive the tracker. Surviving labels: 9, 8, #, w, e]
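For readers who want to see nested tuples drawn as text, here is a minimal sketch. It is not the ActiveState recipe linked above; it uses a simpler indented layout, and the `render` function and its parameters are made up for this example.

```python
def render(node, prefix="", is_last=True, is_root=True):
    """Return a list of text lines drawing nested tuples as a tree."""
    lines = []
    if is_root:
        connector, child_prefix = "", ""
    else:
        connector = "`-- " if is_last else "|-- "
        child_prefix = prefix + ("    " if is_last else "|   ")
    if isinstance(node, tuple):
        # Inner nodes have no label of their own, so mark them with "+".
        lines.append(prefix + connector + "+")
        for index, child in enumerate(node):
            last = index == len(node) - 1
            lines.extend(render(child, child_prefix, last, False))
    else:
        lines.append(prefix + connector + str(node))
    return lines


topo = (("a", "b"), "c")
print("\n".join(render(topo)))
```

Returning a list of lines instead of printing directly keeps the function easy to test and to reuse.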

Regarding the .data() change:

* I think it'd be better to keep two different datasets (.raw and .data?), because both states (flat and clustered) are pretty different beasts.
* Perhaps having .data() "contents" stored as .data would make things easier (in a "why call a method with no parameters, that returns an object" way :)). Making .data a property would keep current behavior and require less ()s :)
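The two suggestions above could be combined roughly as follows. This is only a sketch: `ClusterMethod` is a hypothetical class, not the library's BaseClusterMethod, and the `cluster()` body is a placeholder rather than a real clustering algorithm.

```python
class ClusterMethod:
    """Hypothetical container keeping the flat and clustered states apart."""

    def __init__(self, items):
        self.raw = list(items)  # flat input, never modified
        self._data = None       # clustered result, filled in by cluster()

    def cluster(self):
        # Placeholder for a real algorithm: pair up neighbouring items.
        self._data = [
            tuple(self.raw[i:i + 2]) for i in range(0, len(self.raw), 2)
        ]

    @property
    def data(self):
        """Clustered result if available, otherwise the flat input."""
        return self.raw if self._data is None else self._data


method = ClusterMethod([1, 2, 3, 4])
before = method.data  # attribute access, no parentheses needed
method.cluster()
after = method.data
```

With a property, callers write `method.data` instead of `method.data()`, while the class can still compute or choose the value behind the scenes.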

I'm working on an advanced version of the topological view and on a derivative of that printing recipe for my own needs, but will forward them here when done.

Thanks again, Daniel

exhuma commented 11 years ago

Submitted by ajaksu2

Duh, of course SF mangles the ASCII art. Here's something I'm getting from very ugly, very poor code. It's based on the recipe above, plus handling of tags (levels) and a hack that lets Cluster.topology() know that I want a label instead of each item's repr.
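One way to get labels instead of each item's repr, without a hack, might be to pass a labelling function in. The sketch below assumes the clusters are plain nested lists or tuples; the `topology` function, its `label` parameter, and the sample data are all invented for illustration.

```python
def topology(node, label=repr):
    """Return nested tuples, passing every leaf through `label`."""
    if isinstance(node, (list, tuple)):
        return tuple(topology(child, label) for child in node)
    return label(node)


samples = [
    [{"name": "site A", "index": 0.91}, {"name": "site B", "index": 0.88}],
    {"name": "site C", "index": 0.40},
]
labelled = topology(samples, label=lambda sample: sample["name"])
print(labelled)
```

Defaulting `label` to `repr` keeps the plain behaviour for callers that do not care about labels.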

exhuma commented 11 years ago

Submitted by exhuma

Hi Daniel,

Thanks for the input. It gives me a warm and fuzzy feeling that this project is indeed still in use ;)

Having topological output for the clusters is a brilliant idea. I remember well how, a couple of weeks ago, I was sitting at my desk with several A4 pages taped together, drawing the output of python-cluster for debugging. Hehe... ;)

Having this is very useful. I had a quick look through your code, and it looks fine. I will merge it as soon as I get the time.

Thanks a lot,

Michel

exhuma commented 11 years ago

Submitted by exhuma

Bah, I accidentally updated an old revision with this patch. Need to re-do the work :(