jboynyc / textnets

Text analysis with networks.
https://textnets.readthedocs.io/
GNU General Public License v3.0

Alternative Community Algorithm Usage #31

Closed BradKML closed 2 years ago

BradKML commented 2 years ago

Since this repo currently uses only Leiden, it should not be hard to support other community detection algorithms. One library that comes to mind is cdlib: https://cdlib.readthedocs.io/ For bipartite communities, there are other libraries that are being integrated: https://github.com/GiulioRossetti/cdlib/issues/178#issuecomment-937997820

jboynyc commented 2 years ago

Thanks, I wasn't aware of cdlib yet. I will document how to use other community detection algorithms with textnets in a new section of advanced topics I'm adding to the documentation.

I'm not sure what it would mean substantively to use something like Infomap, which makes some different assumptions about the underlying graph. Do you have any thoughts about that?

BradKML commented 2 years ago

@jboynyc many of the other algorithms in cdlib are there mainly for speed, but they all rest on a few core ideas (e.g. random walks vs. modularity vs. node similarity). With Infomap and other random-walk methods, two people with similar information flow should end up in the same community, even if a modularity-based partition would not necessarily cluster them together. Label Propagation is another class of algorithm that, instead of using walk length, uses network depth to spread labels around, much like the infection in a zombie movie.
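To make the label-spreading idea concrete, here is a minimal pure-Python sketch of asynchronous label propagation on a toy graph (two triangles joined by one edge), followed by a modularity score for the resulting partition. It is not how textnets or cdlib implement anything: the function names and the toy graph are made up for illustration, and ties are broken deterministically (largest label wins) so the run is reproducible, whereas real implementations usually break ties at random.

```python
from collections import Counter


def label_propagation(adjacency, max_sweeps=100):
    """Asynchronous label propagation on {node: set_of_neighbors}.

    Assumption for reproducibility: nodes are visited in sorted order and
    ties go to the largest label (real implementations randomize both).
    """
    labels = {node: node for node in adjacency}  # every node starts alone
    for _ in range(max_sweeps):
        changed = False
        for node in sorted(adjacency):
            neighbors = adjacency[node]
            if not neighbors:
                continue
            counts = Counter(labels[n] for n in neighbors)
            best = max(counts.values())
            top = {label for label, c in counts.items() if c == best}
            if labels[node] in top:
                continue  # current label is already a majority label
            labels[node] = max(top)
            changed = True
        if not changed:
            break  # a full sweep without changes: converged
    return labels


def communities(labels):
    """Group nodes by label, ordered by smallest member."""
    groups = {}
    for node, label in labels.items():
        groups.setdefault(label, set()).add(node)
    return sorted(groups.values(), key=min)


def modularity(adjacency, labels):
    """Newman modularity of a partition of an unweighted, undirected graph."""
    two_m = sum(len(nbrs) for nbrs in adjacency.values())
    internal = Counter()  # internal edges, counted once per endpoint
    degree = Counter()    # total degree per community
    for node, nbrs in adjacency.items():
        c = labels[node]
        degree[c] += len(nbrs)
        internal[c] += sum(1 for n in nbrs if labels[n] == c)
    return sum(internal[c] / two_m - (degree[c] / two_m) ** 2 for c in degree)


# Toy graph: triangles {0, 1, 2} and {3, 4, 5} joined by the edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adjacency = {}
for u, v in edges:
    adjacency.setdefault(u, set()).add(v)
    adjacency.setdefault(v, set()).add(u)

labels = label_propagation(adjacency)
print(communities(labels))
print(round(modularity(adjacency, labels), 4))
```

On this toy graph the labels settle into the two triangles. With a different tie-breaking rule a single label can spread across the bridge and swallow the whole graph, which is part of why results from label propagation tend to vary between runs.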

There are other algorithms that are specific to bipartite graphs (similar to topic model usage based on term and author), and directed graphs (similar to citation/reference/inspiration). Those are also worth testing and supporting for experimental purposes.

[Image: A classification of community detection and graph clustering methods according to the ...]

jboynyc commented 2 years ago

The advanced topics section of the documentation now has example code for cdlib and karateclub. Maybe I will add something for scikit-network after I have had a chance to try it out.

Thanks for bringing these projects to my attention!