benedekrozemberczki / karateclub

Karate Club: An API Oriented Open-source Python Framework for Unsupervised Learning on Graphs (CIKM 2020)
https://karateclub.readthedocs.io
GNU General Public License v3.0
2.16k stars, 246 forks

Having some difficulty with embedding graphs and nodes #49

Closed (ChristopherMarais closed 4 years ago)

ChristopherMarais commented 4 years ago

Good day,

I have been trying to create embeddings for nodes in a graph and for whole graphs using the FeatherNode and FeatherGraph classes.

What am I doing wrong? Is it because not all the nodes in the graphs are connected? How can I get around this?

Regards,
Chris

AssertionError                            Traceback (most recent call last)

in
      1 model = FeatherNode()
----> 2 model.fit(G, expression.values)
      3 emb = model.get_embedding()

~\anaconda3\envs\MIT_807_P38\lib\site-packages\karateclub\node_embedding\attributed\feathernode.py in fit(self, graph, X)
    109         """
    110         self._set_seed()
--> 111         self._check_graph(graph)
    112         X = self._create_reduced_features(X)
    113         A_tilde = self._create_A_tilde(graph)

~\anaconda3\envs\MIT_807_P38\lib\site-packages\karateclub\estimator.py in _check_graph(self, graph)
     60     def _check_graph(self, graph: nx.classes.graph.Graph):
     61         """Check the Karate Club assumptions about the graph."""
---> 62         self._check_connectivity(graph)
     63         self._check_directedness(graph)
     64         self._check_indexing(graph)

~\anaconda3\envs\MIT_807_P38\lib\site-packages\karateclub\estimator.py in _check_connectivity(self, graph)
     42         """Checking the connected nature of a single graph."""
     43         connected = nx.is_connected(graph)
---> 44         assert connected, "Graph is not connected."
     45
     46

AssertionError: Graph is not connected.

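The traceback above shows `_check_graph` asserting connectivity (through `nx.is_connected`), directedness, and node indexing before any fitting happens. As a rough sketch, the same conditions can be checked up front so that every problem is reported at once instead of one assertion at a time. `diagnose_graph` is a hypothetical helper and not part of karateclub; the directedness test via `nx.is_directed` is an assumption, since the traceback does not show the body of that check.

```python
import networkx as nx


def diagnose_graph(graph):
    """Return a list of problem descriptions; an empty list means all checks pass.

    Hypothetical helper that mirrors the assumptions visible in the traceback.
    """
    n = graph.number_of_nodes()
    if n == 0:
        return ["Graph has no nodes."]

    problems = []

    # Assumed directedness check: nx.is_connected is undefined for directed
    # graphs, so fall back to weak connectivity in that case.
    if nx.is_directed(graph):
        problems.append("Graph is directed.")
        connected = nx.is_weakly_connected(graph)
    else:
        connected = nx.is_connected(graph)

    if not connected:
        n_isolated = len(list(nx.isolates(graph)))
        problems.append(f"Graph is not connected ({n_isolated} isolated node(s)).")

    # The traceback compares sorted node labels with range(number_of_nodes);
    # a set comparison is equivalent and avoids sorting mixed label types.
    if set(graph.nodes()) != set(range(n)):
        problems.append("Node labels are not exactly 0..n-1.")

    return problems
```

Calling `diagnose_graph(G)` before `model.fit(...)` would list both the connectivity and the indexing issue for a graph that has orphaned nodes and non-contiguous labels.
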
ChristopherMarais commented 4 years ago

I would like to perform link prediction and include the orphaned nodes for this. How do I approach this problem? I have a single adjacency matrix with multiple subgraphs that may not all be connected to one another.

benedekrozemberczki commented 4 years ago
  1. Regarding the orphaned nodes: I do not think that any embedding method is particularly meaningful in that case. I have to think about that.
  2. Use separate graphs, each with node indexing starting at zero. Because it is a completely inductive method, every node will be embedded in the same space. Use Feather-N (FeatherNode) iteratively, once per graph.
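
One way to read the second suggestion is sketched below: split the graph into connected components, relabel each component so its nodes run from zero, and slice the feature matrix to match. This is an illustration under assumptions, not the maintainer's code. `split_into_components` and `min_size` are hypothetical names, and the sketch assumes the original nodes are integers `0..n-1` with node `i` corresponding to row `i` of a NumPy feature matrix.

```python
import networkx as nx
import numpy as np


def split_into_components(graph, features, min_size=2):
    """Split a graph into zero-indexed connected components with aligned features.

    Assumes node i of `graph` corresponds to row i of `features`.
    Returns a list of (subgraph, sub_features, original_nodes) tuples.
    Components smaller than `min_size` (e.g. orphaned nodes) are skipped.
    """
    parts = []
    for component in nx.connected_components(graph):
        if len(component) < min_size:
            continue
        original_nodes = sorted(component)
        # Map the original labels to 0..k-1 within this component.
        mapping = {old: new for new, old in enumerate(original_nodes)}
        subgraph = nx.relabel_nodes(graph.subgraph(original_nodes), mapping, copy=True)
        # Row order follows original_nodes, so row j belongs to new node j.
        sub_features = features[original_nodes]
        parts.append((subgraph, sub_features, original_nodes))
    return parts


# Each part could then be fitted separately, for example (not executed here):
#   from karateclub import FeatherNode
#   for sub_graph, sub_X, nodes in split_into_components(G, X):
#       model = FeatherNode()
#       model.fit(sub_graph, sub_X)
#       emb = model.get_embedding()  # row j corresponds to nodes[j]
```

Keeping `original_nodes` alongside each component makes it possible to map embedding rows back to the labels of the full graph.
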

On Thu, 10 Sep 2020 at 10:36, Chris notifications@github.com wrote:

I would like to perform link prediction and include the orphaned nodes for this. How do I approach this problem? I have a single adjacency matrix with multiple subgraphs that may not all be connected to one another.

ChristopherMarais commented 4 years ago

I managed to remove the orphaned nodes and get the embedding to work on one of my graphs.

Now I get the "The node indexing is wrong" error, which is strange because I was able to get around it with my previous network by renaming the columns and index of the adjacency matrix in the pandas DataFrame before creating the networkx graph.

When I do the same on a new network, I get the following error:


AssertionError                            Traceback (most recent call last)

in
      1 model = FeatherNode()
----> 2 model.fit(G, expression_filt.values)
      3 emb = model.get_embedding()

~\anaconda3\envs\MIT_807_P38\lib\site-packages\karateclub\node_embedding\attributed\feathernode.py in fit(self, graph, X)
    109         """
    110         self._set_seed()
--> 111         self._check_graph(graph)
    112         X = self._create_reduced_features(X)
    113         A_tilde = self._create_A_tilde(graph)

~\anaconda3\envs\MIT_807_P38\lib\site-packages\karateclub\estimator.py in _check_graph(self, graph)
     62         self._check_connectivity(graph)
     63         self._check_directedness(graph)
---> 64         self._check_indexing(graph)
     65
     66

~\anaconda3\envs\MIT_807_P38\lib\site-packages\karateclub\estimator.py in _check_indexing(self, graph)
     55         numeric_indices = [index for index in range(graph.number_of_nodes())]
     56         node_indices = sorted([node for node in graph.nodes()])
---> 57         assert numeric_indices == node_indices, "The node indexing is wrong."
     58
     59

AssertionError: The node indexing is wrong.

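The indexing check in the traceback requires the sorted node labels to equal `0..n-1`. As an alternative to renaming the DataFrame columns and index by hand, the relabeling can be done on the graph itself with `nx.convert_node_labels_to_integers`, reordering the feature rows to match. This is only a sketch: `relabel_with_features` and `row_labels` are hypothetical names, and it assumes a NumPy feature matrix whose row `i` belongs to the node labeled `row_labels[i]`.

```python
import networkx as nx
import numpy as np


def relabel_with_features(graph, features, row_labels):
    """Relabel nodes to 0..n-1 and reorder feature rows accordingly.

    `row_labels[i]` is assumed to be the original node label of `features[i]`.
    """
    relabeled = nx.convert_node_labels_to_integers(
        graph,
        first_label=0,
        ordering="default",
        label_attribute="original_label",  # keeps the old label on each node
    )
    row_of = {label: i for i, label in enumerate(row_labels)}
    # For new node j, look up which feature row carried its original label.
    order = [
        row_of[relabeled.nodes[j]["original_label"]]
        for j in range(relabeled.number_of_nodes())
    ]
    return relabeled, features[order]
```

A `KeyError` from the `row_of` lookup would indicate a node that has no feature row, which is one way a graph and its feature matrix can drift out of sync.
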
ChristopherMarais commented 4 years ago

Never mind, I figured it out. I forgot to remove the orphan nodes from my feature matrix.

Sorry for bothering you here, but I really want to say how much I appreciate your quick response, and I am thoroughly enjoying using karateclub.
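
For readers who hit the same problem, the fix described above (dropping orphan nodes from the graph and from the feature matrix together) might look roughly like the following. `drop_isolates` is a hypothetical helper, and the sketch assumes integer node labels where node `i` corresponds to row `i` of a NumPy feature matrix.

```python
import networkx as nx
import numpy as np


def drop_isolates(graph, features):
    """Remove isolated nodes and the matching feature rows, then reindex from zero.

    Assumes node i of `graph` corresponds to row i of `features`.
    Returns (pruned_graph, pruned_features, kept_original_nodes).
    """
    isolated = set(nx.isolates(graph))
    kept = [node for node in sorted(graph.nodes()) if node not in isolated]
    # Relabel the surviving nodes to 0..k-1 in the same order as the kept rows.
    mapping = {old: new for new, old in enumerate(kept)}
    pruned = nx.relabel_nodes(graph.subgraph(kept), mapping, copy=True)
    return pruned, features[kept], kept
```

Removing isolates alone does not guarantee a connected graph: if several multi-node components remain, the connectivity assertion would still fail and each component would need to be handled separately.
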