
Inductive Representation Learning on Large Graphs (GraphSAGE) #12


Shumpei-Kikuta commented 5 years ago

General Information

Authors: William L. Hamilton, Rex Ying, and Jure Leskovec
Published: 2017 (NIPS)

Why and What

Key word

Inductive learning learns rules from specific data that generalize to unseen data. Transductive learning learns from specific data and makes predictions only for the specific instances seen during training.

The embedding methods introduced so far learn an embedding matrix directly, which is easy to optimize. However, such embeddings are hard to use on evolving graphs, such as YouTube and Facebook, where nodes and edges keep being added over time. This method, GraphSAGE, instead learns the function that maps nodes into a latent embedding space. A function learned on one graph can be reused across graphs, so it captures the features of graphs and generalizes the embedding.

This method utilizes node features such as labels, profile information, and degrees; training can be supervised or, as in the loss below, unsupervised. The intuition is shown below. (A screenshot illustrating the intuition appeared here in the original issue.)

Related Work

The algorithm proposed by the authors is related to previous node embedding approaches, supervised learning over graphs, and graph convolutions. Let's take a brief look at each.

Factorization-based embedding approach

There are various methods that make use of random walks, such as DeepWalk and node2vec. The problem with these methods is their lack of generalization: they are designed for transductive learning and need additional optimization whenever new nodes appear.

Supervised learning over graphs

These include graph kernel approaches, which capture the features of whole graphs and are used for graph classification. The difference is that GraphSAGE targets node embeddings rather than whole-graph representations.

Proposed method

Forward propagation

Assume the parameters have already been optimized. The algorithm is as follows. (A screenshot of the algorithm, Algorithm 1 in the paper, appeared here in the original issue.) The inference step breaks down into two parts:

  1. Aggregate the features of the neighboring nodes.
  2. Concatenate the aggregated features with the target node's own, then apply a linear and a non-linear transformation.

Intuitively, repeating these K steps of summarizing the neighbors' features and folding them into the target node's features means that every node within K hops of the target contributes to the target node's representation. To bound the computational cost and keep the shapes of the linear transformation uniform, only a fixed number of neighbors is sampled per node. A minimal code sketch of this loop follows.
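Here is a minimal NumPy sketch of the forward propagation with a mean aggregator. All names (`x`, `neighbors`, `weights`, `sample_size`) and shapes are my own choices for illustration, not the paper's reference implementation.

```python
import numpy as np

def graphsage_forward(x, neighbors, weights, sample_size, rng):
    """x: (N, d) input node features; neighbors: dict node -> list of neighbor ids;
    weights: list of K matrices, weights[k] of shape (2 * d_k, d_{k+1})."""
    h = x  # h^0 is each node's raw feature vector
    for w in weights:  # one iteration per search depth k = 1..K
        h_next = np.empty((h.shape[0], w.shape[1]))
        for v in range(h.shape[0]):
            # Sample a fixed number of neighbors: bounds the cost per node
            # and keeps the shapes of the linear transformation uniform.
            nbrs = rng.choice(neighbors[v], size=sample_size, replace=True)
            h_agg = h[nbrs].mean(axis=0)            # 1. aggregate the neighborhood
            h_cat = np.concatenate([h[v], h_agg])   # 2. concatenate with self...
            h_next[v] = np.maximum(h_cat @ w, 0.0)  # ...then linear + ReLU
        # Normalize each embedding to unit length, as the algorithm does.
        h = h_next / (np.linalg.norm(h_next, axis=1, keepdims=True) + 1e-12)
    return h  # z_v = h^K, the final embedding of each node

# Tiny usage example on a 3-node path graph with random parameters.
rng = np.random.default_rng(0)
x = rng.normal(size=(3, 4))
neighbors = {0: [1], 1: [0, 2], 2: [1]}
weights = [rng.normal(size=(8, 4)), rng.normal(size=(8, 4))]  # K = 2 layers
z = graphsage_forward(x, neighbors, weights, sample_size=2, rng=rng)
print(z.shape)  # (3, 4)
```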

Back propagation

The cost function is given as follows. (A screenshot of the paper's loss appeared here in the original issue.) Here u is the target node and v is a node that co-occurs with u on a fixed-length random walk; the idea is the same as skip-gram with negative sampling.
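For reference, the unsupervised objective in the paper (Eq. (1)), re-typed here in LaTeX, with $\sigma$ the sigmoid, $P_n$ the negative-sampling distribution, and $Q$ the number of negative samples:

$$J_{\mathcal{G}}(\mathbf{z}_u) = -\log\big(\sigma(\mathbf{z}_u^\top \mathbf{z}_v)\big) - Q \cdot \mathbb{E}_{v_n \sim P_n(v)}\big[\log\big(\sigma(-\mathbf{z}_u^\top \mathbf{z}_{v_n})\big)\big]$$

The first term pulls the embeddings of co-occurring nodes together; the second pushes the embeddings of sampled negatives apart. Crucially, $\mathbf{z}_u$ is produced by the forward propagation above from features and neighborhoods, not looked up in a free embedding table.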

Aggregator Architectures

The aggregation operations considered are mean, LSTM, and max-pooling. A rough sketch of two of them is below.
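A small sketch of how the mean and max-pooling aggregators differ (the parameter names `w_pool` and `b_pool` are mine). The LSTM aggregator is omitted here; note the paper applies it to a random permutation of the neighbors, since an LSTM is not permutation-invariant.

```python
import numpy as np

def mean_aggregator(h_neighbors):
    # Element-wise mean of the sampled neighbors' embeddings.
    return h_neighbors.mean(axis=0)

def maxpool_aggregator(h_neighbors, w_pool, b_pool):
    # Each neighbor embedding goes through a one-layer MLP,
    # then an element-wise max is taken across neighbors.
    return np.maximum(h_neighbors @ w_pool + b_pool, 0.0).max(axis=0)

h_neighbors = np.random.default_rng(1).normal(size=(5, 4))   # 5 sampled neighbors
print(mean_aggregator(h_neighbors).shape)                     # (4,)
w_pool, b_pool = np.ones((4, 8)), np.zeros(8)
print(maxpool_aggregator(h_neighbors, w_pool, b_pool).shape)  # (8,)
```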

Shumpei-Kikuta commented 5 years ago

Since it learns a function rather than per-node embeddings, it can be applied to other graphs as well, which makes it very useful.