One-sentence summary:
Instead of hard labels, use t - h from the KG embedding (KGE) in place of the relation label to improve RE performance.
Resources:
Keywords:
Notes:
Assumption: the noise problem in DS is mainly caused by not making full use of the information in the KG.
Approach: represent the label with the relation embedding (t - h) together with entity types, rather than with hard relation labels.
For the wrong-label problem, the existing solutions fall roughly into the following categories.

One approach is Multi-Instance Learning (MIL), which divides the sentences into different bags by (h, t) and tries to select well-labeled sentences from each bag (Zeng et al., 2015) or to down-weight mislabeled data (Lin et al., 2016).

Another approach tries to capture the regular pattern of the translation from true labels to noisy labels, and learns the true distribution by modeling the noisy data (Riedel et al., 2010; Luo et al., 2017). Some novel methods, such as Feng et al. (2017), use reinforcement learning to train an instance selector that chooses truly labeled sentences from the whole sentence set. These methods focus on adding an extra model to reduce label noise; however, stacking an extra model does not fundamentally solve the problem of inadequate supervision signals in distant supervision, and it introduces expensive training costs.

A third solution is to exploit the extra supervision signal contained in a KG. Weston et al. (2013) added the confidence of (h, r, t) in the KG as an extra supervision signal. Han et al. (2018) used mutual attention between the KG and the text to compute a weight distribution over the training data. Both obtained better performance by introducing more information from the KG. However, they still used the hard relation labels derived from distant supervision, which also brings in noise.
This paper mainly aims to avoid hard relation labels: any hard label inevitably introduces some noise, so the idea is to represent the label with the t - h embedding. (This still has a problem, though: the same t - h can express different relationships, so the learned embeddings will still carry some noise.)
Our assumption is that each relation r in a KG has one or more sentence patterns that can describe the meaning of r.
In the left-hand sentence, Ankara and Turkey are in the capital relation, while in the right-hand one Mexico and Guadalajara are in the contains relation. Learned directly, the two relations differ (here the capital label is the noisy one; what we want is contains, of which capital is a sub-relation). But through t - h, both pairs come out closer to the contains relation than to the capital one.
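To make this concrete, here is a toy numerical illustration (made-up 2-d vectors for readability, not real TransE output; every entity and relation value below is hypothetical):

```python
import numpy as np

def cos(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

ent = {  # hypothetical pre-trained entity embeddings
    "Turkey":      np.array([0.0, 0.0]),
    "Ankara":      np.array([1.0, 0.2]),
    "Mexico":      np.array([2.0, 1.0]),
    "Guadalajara": np.array([3.0, 1.1]),
}
rel = {  # hypothetical relation embeddings
    "contains": np.array([1.0, 0.1]),
    "capital":  np.array([1.0, 0.5]),
}

for h, t in [("Turkey", "Ankara"), ("Mexico", "Guadalajara")]:
    diff = ent[t] - ent[h]  # the t - h vector used as the soft label
    print(h, "->", t, {r: round(cos(diff, v), 3) for r, v in rel.items()})
# Turkey -> Ankara {'contains': 0.995, 'capital': 0.965}
# Mexico -> Guadalajara {'contains': 1.0, 'capital': 0.934}
```

Both translation vectors sit closer to contains than to capital, which is the behavior the note's argument relies on.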
There are two kinds of embeddings:
3.1 KG Embedding
We use typical KG embedding models such as TransE to pre-train the embeddings of entities and relations. We intend to supervise the learning with t - h instead of the hard relation label r.
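A minimal sketch of this step (PyTorch; the dimensionality, margin, and negative sampling below are illustrative choices, not taken from the paper):

```python
import torch
import torch.nn as nn

class TransE(nn.Module):
    def __init__(self, n_ent, n_rel, dim=50):
        super().__init__()
        self.ent = nn.Embedding(n_ent, dim)
        self.rel = nn.Embedding(n_rel, dim)
        nn.init.xavier_uniform_(self.ent.weight)
        nn.init.xavier_uniform_(self.rel.weight)

    def score(self, h, r, t):
        # TransE energy ||h + r - t||_2: lower means a more plausible triple
        return (self.ent(h) + self.rel(r) - self.ent(t)).norm(p=2, dim=-1)

    def soft_label(self, h, t):
        # The signal used in place of the hard relation label:
        # the translation vector t - h in the pre-trained embedding space
        return self.ent(t) - self.ent(h)

model = TransE(n_ent=10000, n_rel=100)
h, r, t = torch.tensor([1]), torch.tensor([2]), torch.tensor([3])
t_neg = torch.tensor([7])  # corrupted tail as a negative sample
# Margin ranking loss: the true triple should score lower than the corrupted one
loss = torch.relu(1.0 + model.score(h, r, t) - model.score(h, r, t_neg)).mean()
loss.backward()
print(model.soft_label(h, t).shape)  # torch.Size([1, 50])
```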
3.2 Sentence Embedding
Word Embeddings and Attentions
Instead of encoding sentences directly, we first replace the entity mentions e in the sentences with the corresponding entity types type_e from the KG, such as PERSON, PLACE, ORGANIZATION, etc. We then pre-train the word embeddings with word2vec.
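A minimal sketch of this preprocessing, assuming tokenized sentences and an entity-to-type lookup from the KG (the `entity_type` table and the gensim hyperparameters are hypothetical):

```python
from gensim.models import Word2Vec

entity_type = {"Ankara": "PLACE", "Turkey": "PLACE", "Obama": "PERSON"}

def replace_mentions(tokens):
    # Replace each entity mention with its KG type, e.g. "Ankara" -> "PLACE",
    # so sentences with different entities but the same pattern share one form
    return [entity_type.get(tok, tok) for tok in tokens]

corpus = [
    ["Ankara", "is", "the", "capital", "of", "Turkey"],
    ["Obama", "was", "born", "in", "Hawaii"],
]
typed_corpus = [replace_mentions(s) for s in corpus]

# Pre-train word embeddings on the type-substituted corpus (skip-gram word2vec)
w2v = Word2Vec(sentences=typed_corpus, vector_size=50, window=3, min_count=1, sg=1)
print(w2v.wv["PLACE"].shape)  # (50,)
```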
Position Embedding
Still uses the approach from #13.
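For reference, a minimal sketch of that position-feature scheme as I understand it (each token gets its relative distances to the head and tail entities, which index two learned embedding tables; all sizes here are illustrative):

```python
import torch
import torch.nn as nn

max_len, pos_dim = 100, 5
pos_emb_head = nn.Embedding(2 * max_len, pos_dim)  # distances to the head entity
pos_emb_tail = nn.Embedding(2 * max_len, pos_dim)  # distances to the tail entity

def relative_positions(sent_len, ent_idx):
    # Shift by max_len so all indices are non-negative
    return torch.tensor([i - ent_idx + max_len for i in range(sent_len)])

sent_len, head_idx, tail_idx = 8, 0, 5
p_head = pos_emb_head(relative_positions(sent_len, head_idx))  # shape (8, 5)
p_tail = pos_emb_tail(relative_positions(sent_len, tail_idx))  # shape (8, 5)
# These are concatenated to each token's word embedding before the sentence encoder
```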
Model diagram:
Results: