dmlc / dgl

Python package built to ease deep learning on graph, on top of existing DL frameworks.
http://dgl.ai
Apache License 2.0
13.43k stars 3k forks source link

[RFC] Adjacency matrix multiplication and summation functions #2718

Closed BarclayII closed 3 years ago

BarclayII commented 3 years ago

🚀 Feature

This issue proposes two graph transformation functions, adj_product_graph and adj_sum_graph, corresponding to adjacency matrix multiplication and summation. Both of them are differentiable w.r.t. edge weights. The function names are tentative and more name options are welcome.

Motivation

These functions are necessary components for supporting Graph Transformer Networks (GTN).

Function proposal: adj_product_graph

Signature

def adj_product_graph(g_a: DGLGraph, g_b: DGLGraph, weight_column: str) -> DGLGraph:
    pass

Description (Docstring)

Construct a graph whose (weighted) adjacency matrix is the product of the two given (weighted) adjacency matrices.

In DGL, this function accepts two graphs G_A and G_B. Both G_A and G_B must have only one edge type (s_A, e_A, d_A) and (s_B, e_B, d_B), and must be simple graphs. G_A's destination node type d_A must be the same as G_B's source node type s_B. The number of nodes with type d_A in G_A must be the same as the number of nodes with type s_B in G_B.

If weight_column is specified, this function treats the edge features with name weight_column in G_A and G_B as the (scalar) edge weights.

This function returns a single graph G_C whose source node type is the same as G_A's source node type s_A, and whose destination node type is the same as G_B's destination node type d_B. As a consequence, G_C will be homogeneous if s_A is the same as d_B (in which case the number of nodes must match), or bipartite otherwise.

An edge exists between node i of type s_A and node j of type d_B in G_C iff there exists a k such that

If weight_column is specified, DGL determines the weight of the edge between node i of type s_A and node j of type d_B in G_C as follows:

C_{ij} = \sum_{k: (i,k) \in G_A, (k,j) \in G_B} A_{ik} B_{kj}

Mathematically, this is equivalent to multiplying two weighted adjacency matrices.

If weight_column is specified, G_C's edge weights will be differentiable w.r.t. the two input graphs' edge weights. Putting in equations, assuming that the forward function is

C = AB

The backward function is

\nabla_A \mathcal{L} = (\nabla_C \mathcal{L}) B^\top \odot \mathbf{1}_A \\
\nabla_B \mathcal{L} = A^\top (\nabla_C \mathcal{L}) \odot \mathbf{1}_B

where \mathbf{1}_A and \mathbf{1}_B are 0-1 mask matrices indicating whether an edge exists in G_A and G_B respectively (or the non-zero entries of A and B).

Function proposal: adj_sum_graph

Signature

def adj_sum_graph(gs: List[DGLGraph], weight_column: str) -> DGLGraph:
    pass

Description (Docstring)

Construct a graph whose (weighted) adjacency matrix is the sum of the given (weighted) adjacency matrices.

In DGL, this function accepts a list of graphs G_i that

If weight_column is specified, this function treats the edge features with name weight_column in all the graphs as the (scalar) edge weights.

This function returns a single graph G with the same metagraph as the input graphs.

An edge exists between two nodes in G iff an edge exists between the same nodes in either one of the input graphs.

If weight_column is specified, DGL determines the weight of the edge between node i and node j in G as follows:

A_{ij} = \sum_{k} A_{k,ij}

where A_{k,ij} is the edge weight between node i and j if the edge exists in G_k, or 0 otherwise.

Mathematically, this is equivalent to adding a list of weighted adjacency matrices.

If weight_column is specified, G's edge weights will be differentiable w.r.t. the two input graphs' edge weights. Mathematically, the backward function is:

\nabla_{A_k} \mathcal{L} = \nabla_A \mathcal{L} \odot \mathbf{1}_{A_k} \\
lygztq commented 3 years ago

There are also other operations that can benefit from these functions. For example, the khop_graph where two input graphs are the same, and the metapath_reachable_graph where two input graphs are two subgraphs (with different edge types n_1 -> e_1 -> n_2 and n_2 -> e_2 -> n_3) in a heterogeneous graph. Currently, these operations are implemented via sparse matrix operations in scipy, which limits the performance.

Generally speaking, I think there are three scenarios involving these functions: