benedekrozemberczki / EdMot

An implementation of "EdMot: An Edge Enhancement Approach for Motif-aware Community Detection" (KDD 2019)
https://karateclub.readthedocs.io/
GNU General Public License v3.0

More discussion about Edmot #2

Closed Rico-Lee closed 4 years ago

Rico-Lee commented 4 years ago

I am opening this issue because the previous one was closed.

I think the second question discussed here is about the pre-partition procedure in this algorithm.

It's mentioned in the abstract:

Firstly, a motif-based hypergraph is constructed and the top K largest connected components in the hypergraph are partitioned into modules. Afterwards, the connectivity structure within each module is strengthened by constructing an edge set to derive a clique from each module. Based on the new edge set, the original connectivity structure of the input network is enhanced to generate a rewired network, whereby the motif-based higher-order structure is leveraged and the hypergraph fragmentation issue is well addressed...

And this is also reflected in the pseudocode: [image]

As step 3 says, we should apply a partition algorithm S (maybe Louvain or another one) to partition each connected component φ_l ∈ Φ_K into modules.

I think this step should not be omitted from EdMot, because its goal is to detect the higher-order community structure. With it, we can leverage both higher-order and lower-order connectivity patterns by enhancing the graph with higher-order cliques (derived from the modules, not directly from the components).
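To make the enhancement step concrete, here is a minimal sketch of turning each module into a clique and merging those edges into the original edge set. It uses only the standard library, and the names `clique_edges` and `rewire` are hypothetical, not functions from this repository.

```python
from itertools import combinations


def clique_edges(module):
    """Return every unordered node pair in `module`, i.e. the edges of a clique."""
    return set(combinations(sorted(module), 2))


def rewire(edges, modules):
    """Add the clique edges of each module to the original edge set.

    `edges` is an iterable of (u, v) pairs and `modules` an iterable of node
    collections. Both names are illustrative assumptions.
    """
    # Normalise each edge so that (u, v) and (v, u) count as the same edge.
    enhanced = {tuple(sorted(edge)) for edge in edges}
    for module in modules:
        enhanced |= clique_edges(module)
    return enhanced


original = [(1, 2), (2, 3), (3, 4), (4, 5)]
modules = [{1, 2, 3}]
enhanced = rewire(original, modules)
print(sorted(enhanced))
```

Only the pair (1, 3) is new here, because the other two clique edges already exist in the input. Passing whole components instead of modules would add far more edges, which is the difference this comment is about.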

The MATLAB version of the code also follows this step.

I think it would be reasonable to add a function (maybe called extract_modules?) after the component extraction, so that the blocks to be filled are in fact these modules.

    def fit(self):
        """
        Clustering the target graph.
        """
        self._calculate_motifs()
        self._extract_components()
        self._extract_modules()
        self._fill_blocks()
        partition = community.best_partition(self.graph)
        return partition
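A rough sketch of what such a module extraction step might look like, written without dependencies so it can be read on its own. It assumes the motif graph is given as an adjacency dict and that the partition algorithm S is passed in as a callable. `extract_modules`, `induced_subgraph` and `toy_partition` are hypothetical names, and the toy partitioner is only a placeholder for Louvain or a similar method.

```python
from collections import defaultdict


def induced_subgraph(adjacency, nodes):
    """Restrict an adjacency dict {node: set(neighbours)} to `nodes`."""
    nodes = set(nodes)
    return {node: adjacency[node] & nodes for node in nodes}


def extract_modules(adjacency, blocks, partition_fn):
    """Split every block (connected component) into modules.

    `partition_fn` plays the role of the algorithm S: it takes an adjacency
    dict and returns a {node: module_label} mapping.
    """
    modules = []
    for block in blocks:
        labels = partition_fn(induced_subgraph(adjacency, block))
        grouped = defaultdict(list)
        for node, label in labels.items():
            grouped[label].append(node)
        modules.extend(sorted(members) for members in grouped.values())
    return modules


def toy_partition(subgraph):
    """Placeholder for Louvain: label nodes by their tens digit."""
    return {node: node // 10 for node in subgraph}


# One component made of two triangles joined by the edge (2, 10), plus a pair.
adjacency = {
    0: {1, 2}, 1: {0, 2}, 2: {0, 1, 10},
    10: {2, 11, 12}, 11: {10, 12}, 12: {10, 11},
    20: {21}, 21: {20},
}
blocks = [[0, 1, 2, 10, 11, 12], [20, 21]]
modules = extract_modules(adjacency, blocks, toy_partition)
print(sorted(modules))
```

In the class above, the equivalent step could presumably call community.best_partition on self.motif_graph.subgraph(block) for each block and then overwrite self.blocks with the resulting modules before _fill_blocks runs.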

Your project is perfect, I love your code because it's efficient and beautiful.

So I am just offering a little advice based on my understanding of the paper.

Hope it helps :)

Rico-Lee commented 4 years ago

There is also a bug in the function _extract_components:

    def _extract_components(self):
        """
        Extracting connected components from motif graph.
        """
        print("\nExtracting components.\n")
        components = [c for c in nx.connected_components(self.motif_graph)]
        components = [[len(c), c] for c in components]
        components.sort(key=lambda x: x[0], reverse=True)
        important_components = [components[comp][1] for comp in range(self.component_count)]
        self.blocks = [list(graph) for graph in important_components]

With cutoff=50, computing important_components raises an error:

    important_components = [components[comp][1] for comp
IndexError: list index out of range

I guess this is because this code was not updated to match the version in karateclub. :)
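The IndexError follows from indexing with range(self.component_count) when the motif graph has fewer connected components than component_count, which presumably happens when a high cutoff leaves few motif edges. One possible fix, sketched here as a standalone function with the hypothetical name `largest_components` (not necessarily how karateclub handles it), is to slice the sorted list instead of indexing it:

```python
def largest_components(components, component_count):
    """Return up to `component_count` components, largest first."""
    ranked = sorted(components, key=len, reverse=True)
    # Slicing never raises IndexError: it simply returns fewer items when
    # there are fewer components than requested.
    return [list(component) for component in ranked[:component_count]]


components = [{4, 5}, {1, 2, 3}]
blocks = largest_components(components, component_count=5)
print(blocks)
```

Inside _extract_components, the same idea would replace the range-based list comprehension with a slice such as components[:self.component_count].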

And karateclub is a great project!

Thanks for your work, it helps me a lot. : )

benedekrozemberczki commented 4 years ago

Hi there,

The reason for not including the higher order blocks is computational -- clique enumeration can be painfully slow. This EdMot implementation only considers cliques of size 3. I will make the modification you proposed.
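As a rough illustration of the size-3 case: the number of triangles containing an edge equals the number of neighbours its two endpoints share, so no general clique enumeration is needed. The sketch below assumes an adjacency dict as input and assumes that an edge enters the motif graph when its triangle count reaches the cutoff. `triangle_counts` and `motif_edges` are hypothetical names.

```python
def triangle_counts(adjacency):
    """Count, for every edge, how many triangles (3-cliques) contain it."""
    counts = {}
    for u, neighbours in adjacency.items():
        for v in neighbours:
            if u < v:  # visit each undirected edge once
                counts[(u, v)] = len(neighbours & adjacency[v])
    return counts


def motif_edges(adjacency, cutoff):
    """Keep edges that lie in at least `cutoff` triangles (assumed rule)."""
    counts = triangle_counts(adjacency)
    return sorted(edge for edge, count in counts.items() if count >= cutoff)


# A triangle 1-2-3 with a pendant edge 3-4.
adjacency = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
counts = triangle_counts(adjacency)
kept = motif_edges(adjacency, cutoff=1)
print(counts, kept)
```

The pendant edge (3, 4) lies in no triangle, so it is dropped from the motif graph for any cutoff of at least 1.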

I am happy that you find Karate Club useful. We are writing a paper about it and will submit it to ASONAM 2020. If you are willing to contribute at least 3 algorithms, we are happy to add you to the paper.

Regards, Benedek

Rico-Lee commented 4 years ago

Thanks for your reply!

I recently started to work with graphs and your work really makes it easy!

This Python version of EdMot is easy to read and your reply helps me understand it better.

I am curious about the performance if I further partition the components into modules. Maybe I will give it a try based on your work and discuss it with you then.

Best wishes!

fgmn commented 1 year ago

It seems that the problem hasn't been fixed. Actually, after some tests, I found that filling the important components has no effect at all. I will try to open a PR with my version of self._extract_modules().