pyg-team / pytorch_geometric

Graph Neural Network Library for PyTorch
https://pyg.org
MIT License

Bugfix for the hyperconv layer #1973

Closed · THinnerichs closed 2 years ago

THinnerichs commented 3 years ago

Hey Matthias, happy new year to you!

I just copied over my answer from issue #1801.

Correct me if I misunderstood something: the hypergraph convolution works by introducing an auxiliary vertex for each hyperedge, propagating the signal from each node to these auxiliary vertices, and then distributing the aggregated signal from the auxiliary vertices back to all nodes connected by the respective hyperedge. This is actually a quite neat solution.
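For concreteness, here is a toy sketch of the incidence format the layer consumes (made-up values):

import torch

# Toy hypergraph with 4 nodes and 2 hyperedges:
# hyperedge 0 = {0, 1, 2}, hyperedge 1 = {2, 3}.
# Row 0 holds node indices, row 1 the hyperedge each entry belongs to,
# i.e. the auxiliary vertex that node's signal is sent to.
hyperedge_index = torch.tensor([
    [0, 1, 2, 2, 3],
    [0, 0, 0, 1, 1],
])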

However, you never introduce fresh nodes for each hyperedge, but rather reuse the existing ones. That should be mathematically correct, but it leads to the issue described above whenever there are more hyperedges than nodes. Thus, I think simple zero padding of the node features from \mathbb{R}^{num_nodes \times num_features} to \mathbb{R}^{\max(num_nodes, num_hyperedges) \times num_features} in line 126 of HypergraphConv, i.e. changing

self.flow = 'source_to_target'
out = self.propagate(hyperedge_index, x=x, norm=B, alpha=alpha)  # nodes -> hyperedges
self.flow = 'target_to_source'
out = self.propagate(hyperedge_index, x=out, norm=D, alpha=alpha)  # hyperedges -> nodes

to

x_help = torch.zeros((max(x.size(0), num_edges), x.size(1)), device=x.device, dtype=x.dtype)  # padded helper matrix of the shape given above
x_help[:x.size(0)] = x  # write the node features into the padded matrix

self.flow = 'source_to_target'
out = self.propagate(hyperedge_index, x=x_help, norm=B, alpha=alpha)  # nodes -> hyperedges
self.flow = 'target_to_source'
out = self.propagate(hyperedge_index, x=out, norm=D, alpha=alpha)  # hyperedges -> nodes

(with that rather hacky implementation of simple zero padding) should seal the deal.
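For context, here is a minimal sketch of the failure mode the padding avoids, with made-up sizes and assuming the pre-fix behavior:

import torch
from torch_geometric.nn import HypergraphConv

# 2 nodes but 3 hyperedges, i.e. num_hyperedges > num_nodes.
x = torch.randn(2, 16)
hyperedge_index = torch.tensor([
    [0, 1, 0, 1, 1],
    [0, 0, 1, 1, 2],  # hyperedge id 2 >= num_nodes (= 2)
])

conv = HypergraphConv(16, 32)
# Pre-fix, the scatter target of the first pass is sized by x.size(0),
# so hyperedge ids >= num_nodes index out of bounds here.
out = conv(x, hyperedge_index)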

Alternatively, one could introduce num_hyperedges fresh vertices every time and shift the hyperedge ids via edge_index[1, :] += num_nodes.
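A rough sketch of that alternative, reusing x and hyperedge_index from the snippet above (untested; variable names assumed):

# Append one fresh zero row per hyperedge as an auxiliary vertex.
num_nodes = x.size(0)
num_edges = int(hyperedge_index[1].max()) + 1
aux = torch.zeros((num_edges, x.size(1)), device=x.device, dtype=x.dtype)
x_ext = torch.cat([x, aux], dim=0)
# Shift the hyperedge ids so they point at the fresh auxiliary vertices.
hyperedge_index = hyperedge_index.clone()
hyperedge_index[1, :] += num_nodes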

If I misunderstood anything, please feel free to correct me! :)

rusty1s commented 3 years ago

Hi, thanks for your time and effort! Can you send a PR so I can have a detailed look?

THinnerichs commented 3 years ago

Sorry for the delayed response. I submitted the PR. This works for plain hypergraph convolution, but I was not sure about multiple attention heads. Since the heads only affect the feature dimension (dim=1 of x) and not the number of nodes (dim=0, which the padding touches), this should be fine. :)
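To double-check the shapes, a tiny sketch with made-up sizes:

import torch

num_nodes, num_edges, heads, out_channels = 3, 5, 4, 8
# Padding only grows dim=0 to max(num_nodes, num_edges); features are untouched.
x_help = torch.zeros(max(num_nodes, num_edges), heads * out_channels)
# Multi-head attention only reinterprets dim=1 as (heads, out_channels),
# so it is orthogonal to the node-dimension padding.
x_heads = x_help.view(-1, heads, out_channels)
print(x_heads.shape)  # torch.Size([5, 4, 8])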