GiulioRossetti / cdlib

Community Discovery Library
http://cdlib.readthedocs.io
BSD 2-Clause "Simplified" License

markov clustering listed as edge clustering rather than node clustering #83

Closed: micans closed this issue 4 years ago

micans commented 4 years ago

Markov clustering (mcl) is listed as an edge clustering algorithm in the documentation. MCL is a node clustering algorithm. While it is true that there is an interpretation step of the sparse limit, this interpretation is most logically a separation into connected components, i.e. node clusters. I don't quite see where the edge clustering comes from. Throughout its 20-year-ish lifetime, MCL has always been compared with other node-clustering algorithms, never with edge-clustering algorithms. Happy to discuss further here. Thanks, Stijn (mcl author).
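For readers following along, the interpretation step described above can be sketched roughly as follows. This is only an illustration, not the MCL reference implementation: it assumes the limit matrix is available as a plain list of rows, treats any entry above a small tolerance as a link between its row and column index, and reads the node clusters off as connected components. The function name `components_from_limit`, the `tol` parameter, and the toy matrix are hypothetical.

```python
def components_from_limit(matrix, tol=1e-9):
    """Group node indices into clusters, linking i and j whenever matrix[i][j] > tol."""
    n = len(matrix)
    parent = list(range(n))  # union-find forest, one tree per component

    def find(x):
        # Follow parent pointers to the root, halving the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i in range(n):
        for j in range(n):
            if matrix[i][j] > tol:
                parent[find(i)] = find(j)  # merge the two components

    clusters = {}
    for node in range(n):
        clusters.setdefault(find(node), []).append(node)
    return sorted(clusters.values())


# Toy column-stochastic "limit" matrix (made up for illustration):
# row 0 attracts nodes 0 and 1; rows 2 and 3 share nodes 2 and 3.
limit = [
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.5, 0.5],
    [0.0, 0.0, 0.5, 0.5],
]
print(components_from_limit(limit))  # [[0, 1], [2, 3]]
```

The output is a partition of the nodes, which is why the result is naturally read as a node clustering.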

GiulioRossetti commented 4 years ago

Hi, first of all let me say that I completely agree with you. This "brutal" classification doesn't work particularly well in general.

The main issue here is that the third-party implementation included in CDlib produces edge clusters rather than node clusters. Indeed, it is trivial to convert between the two but, following a conservative strategy, we preferred to keep the choices made in that implementation to avoid introducing approximation errors.
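As a rough sketch of the conversion mentioned above, assuming each edge cluster is a list of `(u, v)` node pairs: the node cluster is simply the set of endpoints of its edges. The function name `edge_to_node_clusters` and the sample data are hypothetical, and this is not the code used in CDlib.

```python
def edge_to_node_clusters(edge_clusters):
    """Map each cluster of (u, v) edges to the sorted list of nodes it touches."""
    return [
        sorted({node for edge in cluster for node in edge})
        for cluster in edge_clusters
    ]


# Hypothetical edge clustering of a five-node graph.
edge_clusters = [[(0, 1), (1, 2)], [(3, 4)]]
print(edge_to_node_clusters(edge_clusters))  # [[0, 1, 2], [3, 4]]
```

Two caveats apply to this sketch: a node whose edges fall into different edge clusters would appear in more than one node cluster, and a node with no edges in any cluster would not appear at all, so singleton clusters would need to be added separately.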

In some sense, we can say that this "classification" reflects the form of the final result more than the strategy followed to reach that result.

Indeed, if you have an official Python implementation to suggest to us (or if you can check the correctness of the one we included), we'll be very happy to make all the updates you feel are necessary for your algorithm to be properly represented in CDlib.

Best, Giulio.

micans commented 4 years ago

Hello, thanks for responding so quickly! I don't know first-hand of any other Python implementation and currently lack the time to contribute myself. To be honest, I'd prefer a) that you make the trivial conversion from edge clustering to node clustering, as you write. Failing that, I'd prefer b) that mcl be withdrawn from your list, as it is very confusing to present it as an edge-clustering algorithm. To my mind, the harm in presenting it as such is bigger than the harm in making that trivial adjustment. Anyway, it is of course up to you whether you go with a), b), or c) the status quo, but please consider the trade-off I mentioned: by being conservative and not implementing a trivial conversion, a confusing change has occurred that is disruptive rather than conservative. HNY & best wishes, Stijn

GiulioRossetti commented 4 years ago

@micans I'll make the conversion and update the list as soon as possible (hopefully within this week). Thank you for the feedback!

micans commented 4 years ago

Hello @GiulioRossetti that's great news, thank you very much!