jasperzhong / read-papers-and-code

My paper/code reading notes in Chinese

TNNLS '21 | A Survey on Knowledge Graphs: Representation, Acquisition and Applications #348

Closed jasperzhong closed 1 year ago

jasperzhong commented 1 year ago

https://arxiv.org/pdf/2002.00388.pdf

jasperzhong commented 1 year ago

A KG is a set of triples (h, r, t).

Following previous literature, we define a knowledge graph as G = {E, R, F}, where E, R and F are sets of entities, relations and facts, respectively. A fact is denoted as a triple (h, r, t) ∈ F.

The main task covered is knowledge graph completion. There is a scoring function f(h, r, t) → scalar that measures how plausible a fact is; high-scoring plausible facts can then be added as the completed facts. The usual approach is to map entities and relations into a representation space, i.e., to learn node/edge embeddings.

This raises two questions: how to define the scoring function, and how to do the encoding, i.e., how to generate the embeddings.

The basic idea of the scoring function is to make v_h + v_r as close as possible to v_t (v denotes the embedding). The classic TransE does exactly this: f(h, r, t) = ||h + r − t||, where a smaller distance means a more plausible fact.
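As a quick illustration, a minimal sketch of TransE-style scoring in numpy, assuming randomly initialized embedding tables (the table sizes, dimension, and the ranking query are made up for illustration):

```python
import numpy as np

# Hypothetical embedding tables: |E| entities and |R| relations, dimension d.
num_entities, num_relations, d = 1000, 50, 100
rng = np.random.default_rng(0)
entity_emb = rng.normal(size=(num_entities, d))
relation_emb = rng.normal(size=(num_relations, d))

def transe_score(h: int, r: int, t: int) -> float:
    """TransE-style plausibility score: negative L2 distance ||h + r - t||.
    Higher (closer to 0) means the triple is more plausible."""
    diff = entity_emb[h] + relation_emb[r] - entity_emb[t]
    return -float(np.linalg.norm(diff))

# KG completion query (h, r, ?): rank all entities as candidate tails.
h, r = 0, 3
scores = -np.linalg.norm(entity_emb[h] + relation_emb[r] - entity_emb, axis=1)
top_tails = np.argsort(-scores)[:10]  # 10 best-scoring candidate tails
```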

For the encoding, the earliest TransE just uses a linear model; later work uses MLPs, CNNs, RNNs (which rely on random walks), and GNNs. Among the GNNs, for example, R-GCN #339.
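A rough sketch of what a relation-specific GNN layer in the spirit of R-GCN could look like, assuming PyTorch; the dense per-relation weights are a simplification (the paper's basis/block-diagonal decomposition is omitted), and the edge format is an assumption:

```python
import torch
import torch.nn as nn

class SimpleRGCNLayer(nn.Module):
    """One R-GCN-style layer: a separate weight matrix per relation,
    plus a self-loop transform, aggregated by degree-normalized sum."""
    def __init__(self, in_dim: int, out_dim: int, num_relations: int):
        super().__init__()
        self.rel_weights = nn.Parameter(torch.randn(num_relations, in_dim, out_dim) * 0.01)
        self.self_weight = nn.Linear(in_dim, out_dim)

    def forward(self, x: torch.Tensor, edges: torch.Tensor) -> torch.Tensor:
        # x: (num_nodes, in_dim); edges: (num_edges, 3) long tensor of (src, rel, dst).
        num_nodes = x.size(0)
        out = self.self_weight(x)
        # In-degree of each destination node, for normalization.
        deg = torch.zeros(num_nodes).index_add_(0, edges[:, 2], torch.ones(edges.size(0)))
        # Message from each source node through its relation-specific weight.
        msgs = torch.einsum('ei,eio->eo', x[edges[:, 0]], self.rel_weights[edges[:, 1]])
        agg = torch.zeros_like(out).index_add_(0, edges[:, 2], msgs)
        return torch.relu(out + agg / deg.clamp(min=1).unsqueeze(1))
```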

Temporal knowledge graphs are also an important sub-area, since knowledge graphs keep evolving. A fact is then represented as a quadruple (h, r, t, τ) (very fine-grained, essentially a CTDG).

For example, TTransE is simply f(h, r, t, τ) = −||h + r + τ − t||.
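Extending the TransE sketch above, a minimal TTransE-style score with a learned timestamp embedding; the timestamp table (one embedding per discrete time step) is an assumption for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 100
entity_emb = rng.normal(size=(1000, d))   # hypothetical entity table
relation_emb = rng.normal(size=(50, d))   # hypothetical relation table
time_emb = rng.normal(size=(365, d))      # hypothetical timestamp table, e.g. one per day

def ttranse_score(h: int, r: int, t: int, tau: int) -> float:
    """TTransE-style score: -||h + r + tau - t||; time acts as an extra translation."""
    diff = entity_emb[h] + relation_emb[r] + time_emb[tau] - entity_emb[t]
    return -float(np.linalg.norm(diff))
```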


Seen this way, there may well be quite a few CTDG-style works on temporal KGs. Worth surveying.

Also, the scoring function doesn't really matter for us; we only focus on using HGNNs to generate node/edge embeddings. Edge embeddings are indeed rarely considered in HGNN work. Browsing through, HGNN ideas such as metapaths are not mentioned at all in the KG literature; it feels like the two communities each do their own thing...

jasperzhong commented 1 year ago

Indeed, now I get it: metapaths are hard to use for KGs because there are too many relations... That is why relation-based sampling is a better fit for KGs, e.g., R-GCN #339.
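A hedged sketch of what relation-based neighbor sampling could look like, as opposed to expanding along a fixed metapath: group a seed node's edges by relation and sample up to k neighbors per relation. The fanout and edge format are hypothetical, not taken from R-GCN's actual implementation:

```python
import random
from collections import defaultdict

def sample_by_relation(edges, seed, k=5, rng=None):
    """Sample up to k neighbors of `seed` for each relation type.
    edges: iterable of (head, relation, tail) triples."""
    rng = rng or random.Random(0)
    by_rel = defaultdict(list)
    for h, r, t in edges:
        if h == seed:
            by_rel[r].append(t)
        elif t == seed:
            by_rel[r].append(h)
    # One neighbor list per relation, instead of one list per metapath.
    return {r: rng.sample(nbrs, min(k, len(nbrs))) for r, nbrs in by_rel.items()}
```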