Closed BarclayII closed 3 years ago
There are also other operations that can benefit from these functions. For example, `khop_graph`, where the two input graphs are the same, and `metapath_reachable_graph`, where the two input graphs are two subgraphs (with different edge types `n_1 -> e_1 -> n_2` and `n_2 -> e_2 -> n_3`) of a heterogeneous graph. Currently, these operations are implemented via sparse matrix operations in `scipy`, which limits performance.
Generally speaking, I think there are three scenarios involving these functions:

1. $A^k$. Directly using $A^k$ in the computation of graph neural networks may be inefficient, since k-hop graphs (especially with self-loops) are usually very dense. A common practice is to compute $A^k X$ with stacked aggregation operations instead of materializing $A^k$, as in Simplifying Graph Convolutional Networks. Still, this operation may be useful in other scenarios. Also, DGL already has such an operation, and these functions can make it more efficient.
2. $A + A^2 + \dots + A^k$ or $\mathrm{Bool}(A + A^2 + \dots + A^k)$. In such a scenario, we want to exploit the locality of the original graph. For example, we may want to sample from all nodes whose distance from the central node is less than or equal to $k$. Such a process appears in the paper Hierarchical Graph Pooling with Structure Learning.
3. From node `A` with type `a` to node `C` with type `c`, where the "middle node" is node `B` with type `b`. Here I describe an operation that can benefit from these functions: the `cluster_pooling` operation, which is described in #2599.
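As a library-free illustration of the first two scenarios, the sketch below represents an unweighted graph as a set of `(src, dst)` pairs. The helper names `compose` and `within_k_hops` are hypothetical and not part of DGL; they only show what the boolean versions of $A^2$ and $A + A^2 + \dots + A^k$ mean on a toy graph.

```python
def compose(edges_a, edges_b):
    """Edge set of the boolean product: (i, j) iff i -> m in a and m -> j in b."""
    # Index edges_b by source so each edge of edges_a is joined in one lookup.
    out_b = {}
    for m, j in edges_b:
        out_b.setdefault(m, set()).add(j)
    return {(i, j) for i, m in edges_a for j in out_b.get(m, ())}


def within_k_hops(edges, k):
    """Edge set of Bool(A + A^2 + ... + A^k) for k >= 1."""
    power = set(edges)   # Bool(A^1)
    reach = set(edges)   # running union of the powers seen so far
    for _ in range(k - 1):
        power = compose(power, edges)   # next boolean power of A
        reach |= power
    return reach


# Directed path 0 -> 1 -> 2 -> 3.
path = {(0, 1), (1, 2), (2, 3)}
two_hop = compose(path, path)     # pairs joined by a walk of length two
nearby = within_k_hops(path, 2)   # pairs joined by a walk of length one or two
```

On this path graph, `two_hop` contains only `(0, 2)` and `(1, 3)`, while `nearby` additionally keeps the three original edges.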
We can implement this operation using a heterogeneous graph with two node types (node and cluster) and three edge types (node to node, node to cluster, cluster to node). To pool node features, we can simply apply message passing and reduce operations from nodes to clusters. To pool edges, we can compute a `metapath_reachable_graph` with three edge types: `cluster to node`, `node to node`, and `node to cluster`. Edges within a cluster become self-loops, which we can remove with `remove_self_loops`. The remaining edges are the inter-cluster edges that we want to preserve.
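The edge-pooling step can be sketched without DGL by composing the three relations directly. This is only an illustration under the assumption that every node belongs to exactly one cluster; `assign` and `pool_edges` are hypothetical names, not DGL APIs.

```python
def pool_edges(node_edges, assign):
    """Map node-level edges to cluster-level edges, dropping intra-cluster ones.

    node_edges: iterable of (src_node, dst_node) pairs.
    assign: dict mapping each node to its cluster id.
    """
    # cluster -> node -> node -> cluster collapses to looking up both endpoints.
    pooled = {(assign[u], assign[v]) for u, v in node_edges}
    # Intra-cluster edges show up as self-loops; remove them.
    return {(cu, cv) for cu, cv in pooled if cu != cv}


# Nodes 0 and 1 form cluster 0; nodes 2 and 3 form cluster 1.
assign = {0: 0, 1: 0, 2: 1, 3: 1}
node_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
cluster_edges = pool_edges(node_edges, assign)
```

Here the edges `(0, 1)` and `(2, 3)` stay inside their clusters and disappear, leaving one edge in each direction between the two clusters.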
🚀 Feature
This issue proposes two graph transformation functions, `adj_product_graph` and `adj_sum_graph`, corresponding to adjacency matrix multiplication and summation. Both of them are differentiable w.r.t. edge weights. The function names are tentative and more name options are welcome.

Motivation
These functions are necessary components for supporting Graph Transformer Networks (GTN).
Function proposal: `adj_product_graph`
Signature
Description (Docstring)
Construct a graph whose (weighted) adjacency matrix is the product of the two given (weighted) adjacency matrices.
In DGL, this function accepts two graphs `G_A` and `G_B`. Both `G_A` and `G_B` must have only one edge type, `(s_A, e_A, d_A)` and `(s_B, e_B, d_B)` respectively, and must be simple graphs. `G_A`'s destination node type `d_A` must be the same as `G_B`'s source node type `s_B`. The number of nodes with type `d_A` in `G_A` must be the same as the number of nodes with type `s_B` in `G_B`.

If `weight_column` is specified, this function treats the edge features with name `weight_column` in `G_A` and `G_B` as the (scalar) edge weights.

This function returns a single graph `G_C` whose source node type is the same as `G_A`'s source node type `s_A`, and whose destination node type is the same as `G_B`'s destination node type `d_B`. As a consequence, `G_C` will be homogeneous if `s_A` is the same as `d_B` (in which case the number of nodes must match), or bipartite otherwise.

An edge exists between node `i` of type `s_A` and node `j` of type `d_B` in `G_C` iff there exists a `k` such that an edge between node `i` of type `s_A` and node `k` of type `d_A` exists in `G_A`, and an edge between node `k` of type `s_B` and node `j` of type `d_B` exists in `G_B`.

If `weight_column` is specified, DGL determines the weight of the edge between node `i` of type `s_A` and node `j` of type `d_B` in `G_C` as follows:

$$C_{ij} = \sum_k A_{ik} B_{kj}$$

where $A$, $B$ and $C$ are the weighted adjacency matrices of `G_A`, `G_B` and `G_C`. Mathematically, this is equivalent to multiplying two weighted adjacency matrices.
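The product semantics can be sketched in plain Python, with each graph given as a dictionary from `(src, dst)` to a scalar weight. This is not the proposed implementation, and `adj_product` is a hypothetical helper name; it only spells out which edges appear in `G_C` and how their weights accumulate over the shared middle node `k`.

```python
from collections import defaultdict


def adj_product(edges_a, edges_b):
    """Weighted edges of C = A B, with each graph given as {(src, dst): weight}."""
    # Group B's edges by source node k so they can be joined with A's (i, k).
    b_by_src = defaultdict(list)
    for (k, j), w_b in edges_b.items():
        b_by_src[k].append((j, w_b))

    edges_c = defaultdict(float)
    for (i, k), w_a in edges_a.items():
        for j, w_b in b_by_src.get(k, ()):
            edges_c[(i, j)] += w_a * w_b   # C_ij = sum_k A_ik * B_kj
    return dict(edges_c)


# A: 2 source nodes -> 3 middle nodes; B: 3 middle nodes -> 2 destination nodes.
edges_a = {(0, 0): 1.0, (0, 1): 2.0, (1, 2): 3.0}
edges_b = {(0, 1): 4.0, (1, 1): 5.0, (2, 0): 6.0}
edges_c = adj_product(edges_a, edges_b)
```

Only pairs `(i, j)` that share at least one middle node receive an edge, so the result stays sparse whenever the inputs do not connect most pairs.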
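If `weight_column` is specified, `G_C`'s edge weights will be differentiable w.r.t. the two input graphs' edge weights. In equations, assuming that the forward function is

$$C = AB$$

the backward function is

$$\frac{\partial \mathcal{L}}{\partial A} = \left(\frac{\partial \mathcal{L}}{\partial C} B^\top\right) \odot \mathbf{1}_A, \qquad \frac{\partial \mathcal{L}}{\partial B} = \left(A^\top \frac{\partial \mathcal{L}}{\partial C}\right) \odot \mathbf{1}_B$$

where $\mathcal{L}$ is the scalar loss, and $\mathbf{1}_A$ and $\mathbf{1}_B$ are 0-1 mask matrices indicating whether an edge exists in `G_A` and `G_B` respectively (or the non-zero entries of $A$ and $B$).

Function proposal: `adj_sum_graph`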
Signature
Description (Docstring)
Construct a graph whose (weighted) adjacency matrix is the sum of the given (weighted) adjacency matrices.
In DGL, this function accepts a list of graphs `G_i` that

If `weight_column` is specified, this function treats the edge features with name `weight_column` in all the graphs as the (scalar) edge weights.

This function returns a single graph `G` with the same metagraph as the input graphs.

An edge exists between two nodes in `G` iff an edge exists between the same nodes in at least one of the input graphs.

If `weight_column` is specified, DGL determines the weight of the edge between node `i` and node `j` in `G` as follows:

$$C_{ij} = \sum_k A_{k,ij}$$

where $C_{ij}$ is the resulting edge weight in `G`, and $A_{k,ij}$ is the edge weight between node `i` and `j` if the edge exists in `G_k`, or 0 otherwise.

Mathematically, this is equivalent to adding a list of weighted adjacency matrices.
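A minimal sketch of the summation semantics, again using plain dictionaries from `(src, dst)` to weight rather than DGL graphs. The name `adj_sum` is hypothetical, and the inputs are assumed to share the same node set.

```python
from collections import defaultdict


def adj_sum(graphs):
    """Weighted edges of the sum, with each graph given as {(src, dst): weight}."""
    total = defaultdict(float)
    for edges in graphs:
        for pair, w in edges.items():
            total[pair] += w   # a graph without this edge contributes 0
    return dict(total)


g1 = {(0, 1): 1.0, (1, 2): 2.0}
g2 = {(0, 1): 0.5, (2, 0): 3.0}
summed = adj_sum([g1, g2])
```

The edge `(0, 1)` exists in both inputs, so its weights add up; the other two edges each come from a single input and keep their original weight.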
If `weight_column` is specified, `G`'s edge weights will be differentiable w.r.t. the input graphs' edge weights. Mathematically, the backward function is: