Translating Embeddings for Modeling Multi-relational Data

Summary

Embedding entities and relationships of multi-relational data in low-dimensional vector spaces.

Paper link / code

Authors / affiliations

Publication date (yyyy/MM/dd)

Overview

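Multi-relational data refers to directed graphs whose nodes correspond to entities and whose edges correspond to relationships between those entities.

Edge form: (head, label, tail), denoted (h, l, t)
label is the name of the relationship between head and tail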
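As a minimal sketch of this data layout, a multi-relational graph can be stored as a list of (head, label, tail) triplets and indexed by relationship name. The `Triplet` type, the `group_by_label` helper, and the toy entity names below are hypothetical and only serve as illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class Triplet(NamedTuple):
    """One directed, labeled edge of the graph (hypothetical helper type)."""
    head: str
    label: str
    tail: str


def group_by_label(triplets):
    """Index the edges of the graph by relationship name."""
    edges = defaultdict(list)
    for triplet in triplets:
        edges[triplet.label].append((triplet.head, triplet.tail))
    return dict(edges)


# Toy data, invented for this example.
triplets = [
    Triplet("alice", "friend_of", "bob"),
    Triplet("alice", "bought", "book"),
    Triplet("bob", "bought", "book"),
]
edges = group_by_label(triplets)
```

Each key of `edges` is one relationship, so a single-relational graph would have exactly one key.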

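Applications

Social network analysis: entities: members; edges: friendship / social relationship links
Recommender systems: entities: users and products; edges: buying, rating, reviewing, or searching for a product
Knowledge bases (KBs): entities: an abstract concept or concrete entity of the world; edges: predicates that represent facts involving two of them

This paper models data from KBs (WordNet and Freebase), with the goal of automatically adding new facts, i.e. automatically adding new relationships between entities.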

Modeling multi-relational data

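For single-relational data, a fair amount of prediction can be done with descriptive analysis alone. The difficulty with multi-relational data is that the notion of locality may involve relationships and entities of different types at the same time. We therefore need a more generic approach that can consider all of these patterns and model multi-relational data so as to capture all heterogeneous relationships simultaneously.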

Relationships as translations in the embedding space

Relationships are represented as translations in the embedding space: if (h, l, t) holds, then

embedding(h) + vector(l) ≈ embedding(t)
The TransE model learns a single low-dimensional vector for each entity and each relationship.
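The translation idea can be sketched as a dissimilarity between embedding(h) + vector(l) and embedding(t): the smaller the distance, the more plausible the triplet. The code below is a minimal illustration, not the authors' implementation. The function names, the choice of the L1 norm as default, and the hand-picked 2-dimensional toy embeddings are all assumptions.

```python
def transe_distance(h, l, t, p=1):
    """L_p distance between h + l and t; smaller means more plausible."""
    return sum(abs(hi + li - ti) ** p for hi, li, ti in zip(h, l, t)) ** (1.0 / p)


def rank_tails(h, l, entities, p=1):
    """Sort candidate tail names from most to least plausible."""
    return sorted(entities, key=lambda name: transe_distance(h, l, entities[name], p))


# Hand-picked toy embeddings (not learned), for illustration only.
entities = {
    "a": [0.0, 0.0],
    "b": [1.0, 0.0],
    "c": [0.0, 2.0],
}
relation = [1.0, 0.0]  # a translation that maps "a" exactly onto "b"

ranking = rank_tails(entities["a"], relation, entities)
```

Ranking every entity as a candidate tail for a given (h, l) pair is one way to propose new facts for a KB: the top-ranked entities are the most plausible completions.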

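The model has two motivations. First, hierarchical relationships are very common in KBs. For example, when the nodes of a tree are embedded, a node's embedding should lie close to those of its neighboring nodes. Second, the word2vec model had recently appeared, and in its word embeddings some relationships show up as translations in the embedding space.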

Data: Freebase containing 1M entities, 25k relationships and more than 17M training samples.

Novelty

Embedding triplets

Method

WordNet synsets (sets of synonyms). We considered the data version used in [2], which we denote WN in the following. Examples of triplets are (_score_NN_1, _hypernym, _evaluation_NN_1) or (_score_NN_2, _has_part, _musical_notation_NN_1).

WN is composed of senses; its entities are denoted by the concatenation of a word, its part-of-speech tag and a digit indicating which sense it refers to, i.e. _score_NN_1 encodes the first meaning of the noun “score”.
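To make the naming scheme concrete, here is a small sketch that splits a WN entity name back into its word, part-of-speech tag and sense digit. It assumes the three parts are joined with underscores and that the name has a leading underscore (e.g. `_score_NN_1`); that delimiter layout and the `parse_wn_entity` helper are assumptions of this example.

```python
def parse_wn_entity(name):
    """Split an entity name into (word, part-of-speech tag, sense index).

    Assumes an underscore-delimited layout such as "_score_NN_1".
    """
    # Split from the right so that multi-word entries keep their underscores.
    word, pos, sense = name.strip("_").rsplit("_", 2)
    return word, pos, int(sense)


parsed = parse_wn_entity("_musical_notation_NN_1")
```

Splitting from the right keeps multi-word entries such as `musical_notation` intact.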

Results

Relationships are divided into four categories according to the cardinalities of their head and tail arguments: 1-TO-1, 1-TO-MANY, MANY-TO-1, MANY-TO-MANY.

A given relationship is 1-TO-1 if a head can appear with at most one tail
1-TO-MANY if a head can appear with many tails
MANY-TO-1 if many heads can appear with the same tail
MANY-TO-MANY if multiple heads can appear with multiple tails.
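The four categories can be computed from a set of triplets roughly as follows. This is a sketch under stated assumptions: for each relationship it averages the number of tails per head and heads per tail, and labels a side as "1" when the average is below a threshold. The threshold of 1.5, the `classify_relationships` helper, and the toy triplets are assumptions of this example and are not stated in these notes.

```python
from collections import defaultdict


def classify_relationships(triplets, threshold=1.5):
    """Map each relationship label to one of the four cardinality categories."""
    tails_of = defaultdict(lambda: defaultdict(set))  # label -> head -> tails
    heads_of = defaultdict(lambda: defaultdict(set))  # label -> tail -> heads
    for h, l, t in triplets:
        tails_of[l][h].add(t)
        heads_of[l][t].add(h)

    categories = {}
    for l in tails_of:
        avg_tails = sum(len(s) for s in tails_of[l].values()) / len(tails_of[l])
        avg_heads = sum(len(s) for s in heads_of[l].values()) / len(heads_of[l])
        head_side = "1" if avg_heads < threshold else "MANY"
        tail_side = "1" if avg_tails < threshold else "MANY"
        categories[l] = f"{head_side}-TO-{tail_side}"
    return categories


# Toy triplets, invented for this example.
toy = [
    ("a", "r1", "x"), ("b", "r1", "y"),
    ("p", "r2", "c1"), ("p", "r2", "c2"),
    ("u1", "r3", "z"), ("u2", "r3", "z"),
    ("u1", "r4", "m1"), ("u1", "r4", "m2"),
    ("u2", "r4", "m1"), ("u2", "r4", "m2"),
]
categories = classify_relationships(toy)
```

Using averages rather than strict maxima keeps a relationship in the "1" bucket even if a handful of noisy triplets break the pattern.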

We obtained that FB15k has 26.2% of 1-TO-1 relationships, 22.7% of 1-TO-MANY, 28.3% of MANY-TO-1, and 22.8% of MANY-TO-MANY.

TransE parameter settings issue #31 code
