
EMNLP-2018-Label-Free Distant Supervision for Relation Extraction via Knowledge Graph Embedding #44

Open BrambleXu opened 5 years ago

BrambleXu commented 5 years ago

One-sentence summary:

Instead of using hard labels, use t - h from the KG embedding (KGE) as the relation label, which improves RE performance.

Resources:

Keywords:

Notes:

Assumption: the noise problem in DS (distant supervision) is mainly caused by not making full use of the information in the KG.

Approach: represent the label with the relation embedding (t - h) and entity types, instead of hard relation labels.

Solutions to the wrong-label problem roughly fall into several categories.

This paper mainly wants to avoid hard relation labels: as soon as hard labels are used, some noise is inevitably introduced, so it represents the label with the t - h embedding instead. (There is still a problem here, though: the same t - h can express different relationships, so the learned embeddings will still contain some noise.)

Our assumption is that each relation r in a KG has one or more sentence patterns that can describe the meaning of r.

(Figure from the paper: example sentences for the entity pairs (Ankara, Turkey) and (Guadalajara, Mexico))

In the sentence on the left, Ankara and Turkey are in a capital relation, while on the right Mexico and Guadalajara are in a contains relation. If we learn from these directly, the two relations are different (here the capital relation is the noisy one: what we actually want is contains, and capital is a sub-relation of contains). But if we go through t - h, both pairs come out closer to the contains relation than to the capital relation.
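To make the idea concrete, here is a minimal PyTorch-style sketch of supervising a sentence encoder with t - h instead of a hard relation label. This is my own illustration rather than the paper's code: `SentenceEncoder`, the averaging encoder, and the cosine objective are placeholder choices.

```python
import torch
import torch.nn.functional as F

# Illustrative encoder: average word embeddings, then project into the KG space.
# Architecture and loss are assumptions, not the paper's exact formulation.
class SentenceEncoder(torch.nn.Module):
    def __init__(self, vocab_size, emb_dim, kg_dim):
        super().__init__()
        self.emb = torch.nn.Embedding(vocab_size, emb_dim)
        self.proj = torch.nn.Linear(emb_dim, kg_dim)

    def forward(self, token_ids):
        return self.proj(self.emb(token_ids).mean(dim=1))

def soft_label_loss(sent_vec, head_emb, tail_emb):
    # Instead of cross-entropy against a hard relation label r, pull the
    # sentence representation towards the pre-trained KG vector t - h.
    target = tail_emb - head_emb
    return 1.0 - F.cosine_similarity(sent_vec, target, dim=-1).mean()
```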

There are two kinds of embeddings:

3.1 KG Embedding

We use typical KG embedding models such as TransE to pre-train the embeddings of entities and relations. We intend to supervise the learning by t - h instead of the hard relation label r.
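Since 3.1 is just standard TransE pre-training, a compact sketch of the usual TransE objective (margin-based ranking loss with corrupted tails) may help; the dimensions, margin, and sampling scheme below are placeholders, not the paper's settings.

```python
import torch
import torch.nn.functional as F

# Toy TransE: score(h, r, t) = ||h + r - t||, lower is better.
n_entities, n_relations, dim, margin = 1000, 50, 100, 1.0  # placeholder values
ent = torch.nn.Embedding(n_entities, dim)
rel = torch.nn.Embedding(n_relations, dim)
opt = torch.optim.SGD(list(ent.parameters()) + list(rel.parameters()), lr=0.01)

def score(h, r, t):
    return (ent(h) + rel(r) - ent(t)).norm(p=2, dim=-1)

def train_step(h, r, t):
    t_neg = torch.randint(0, n_entities, t.shape)  # corrupt the tail
    loss = F.relu(margin + score(h, r, t) - score(h, r, t_neg)).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```

After pre-training, `ent(t) - ent(h)` is the vector that stands in for the hard relation label r when training the relation extractor.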

3.2 Sentence Embedding

Word Embeddings and Attentions

Instead of encoding sentences directly, we first replace the entity mentions e in the sentences with corresponding entity types type_e in the KG, such as PERSON, PLACE, ORGANIZATION, etc. We then pre-train the word embedding by word2vec.
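A rough sketch of this preprocessing step (replace entity mentions with their KG entity types, then pre-train word2vec); the `entity_type` lookup and the gensim hyperparameters are hypothetical, and real mentions can span multiple tokens, which needs extra handling.

```python
from gensim.models import Word2Vec

# Hypothetical mention -> type lookup; in the paper the types come from the KG.
entity_type = {"Ankara": "PLACE", "Turkey": "PLACE", "Obama": "PERSON"}

def replace_mentions(tokens):
    # Replace each entity mention with its entity type before pre-training.
    return [entity_type.get(tok, tok) for tok in tokens]

sentences = [
    replace_mentions("Ankara is the capital of Turkey .".split()),
    replace_mentions("Obama visited Turkey last year .".split()),
]

# Pre-train word embeddings with word2vec on the type-replaced corpus.
# Hyperparameters are placeholders.
w2v = Word2Vec(sentences, vector_size=100, window=5, min_count=1, sg=1)
```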

Position embedding

Same approach as in #13.
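#13 isn't reproduced here, so for reference this is the generic PCNN-style position embedding (relative distance of each token to the two entity mentions, clipped and mapped through a small embedding table); the clipping range and embedding size are assumptions.

```python
import torch

MAX_DIST = 30  # clipping range is an assumption, not the paper's value

def relative_positions(sent_len, ent_idx):
    # Distance of every token to one entity mention, clipped to [-MAX_DIST, MAX_DIST]
    # and shifted to a non-negative index for the embedding lookup.
    return torch.tensor(
        [max(-MAX_DIST, min(MAX_DIST, i - ent_idx)) + MAX_DIST for i in range(sent_len)]
    )

pos_emb = torch.nn.Embedding(2 * MAX_DIST + 1, 5)  # one table per entity in practice

tokens = "PLACE is the capital of PLACE .".split()
p1 = pos_emb(relative_positions(len(tokens), ent_idx=0))  # distances to head entity
p2 = pos_emb(relative_positions(len(tokens), ent_idx=5))  # distances to tail entity
# p1 and p2 are concatenated with the word embeddings as the CNN input.
```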

Model diagram:

(Model architecture figures from the paper)

Results:

(Result figures from the paper)

Rebaccamin commented 4 years ago

I couldn't find the code for this paper. I just want to know how the type that replaces an entity is chosen: in the NYT data, one entity can have several entity types. Do they pick the most common one, or did they use a new type recognizer to tag the types? Quite confused...