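Three types of matrices are designed: EH to ET (Entity Head to Entity Tail) identifies entities, e.g. "New York" and "De Blasio"; SH to OH (Subject Head to Object Head) identifies the head boundaries of the two entities in a relation; ST to OT (Subject Tail to Object Tail) identifies the tail boundaries of the two entities in a relation.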
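In EH to ET, the tail of an entity never comes before its head, so only the upper triangle of the matrix is needed; in the paper this is flattened into a tag sequence of length (n^2+n)/2 for an n-token sentence. The relation between two entities, however, can run in either direction. The authors therefore use two tags (1 and 2) in the SH to OH and ST to OT matrices to mark the forward and reverse directions. With N relation types this gives 2N+1 matrices in total (one EH to ET matrix plus one SH to OH and one ST to OT matrix per relation), trading space for time.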
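The tagging scheme described above can be sketched roughly as follows. This is a minimal illustration rather than the authors' implementation: the sentence, the token indices, and the helper names (`link`, `build_tags`, `upper_triangle_size`) are all assumptions made for the example, and the case where both directions of a relation land on the same cell is ignored.

```python
from typing import Dict, List, Tuple

Span = Tuple[int, int]           # inclusive (head, tail) token indices
Cell = Tuple[int, int]           # (row, column) with row <= column
Triple = Tuple[Span, str, Span]  # (subject span, relation, object span)


def upper_triangle_size(n: int) -> int:
    """Number of cells kept when only the upper triangle is stored."""
    return n * (n + 1) // 2


def link(matrix: Dict[Cell, int], i: int, j: int) -> None:
    """Record a link i -> j in the upper triangle.

    Tag 1 means the link runs left to right; tag 2 means it was
    mirrored from the lower triangle (the source token is on the right).
    """
    if i <= j:
        matrix[(i, j)] = 1
    else:
        matrix[(j, i)] = 2


def build_tags(entities: List[Span], triples: List[Triple]):
    """Build sparse EH-to-ET, SH-to-OH and ST-to-OT tag matrices."""
    eh_to_et = {(head, tail): 1 for head, tail in entities}
    sh_to_oh: Dict[str, Dict[Cell, int]] = {}
    st_to_ot: Dict[str, Dict[Cell, int]] = {}
    for (s_head, s_tail), relation, (o_head, o_tail) in triples:
        link(sh_to_oh.setdefault(relation, {}), s_head, o_head)
        link(st_to_ot.setdefault(relation, {}), s_tail, o_tail)
    return eh_to_et, sh_to_oh, st_to_ot


# Hypothetical sentence, chosen so that the subject follows the object.
tokens = ["New", "York", "City", "is", "home", "to", "De", "Blasio"]
entities = [(0, 1), (0, 2), (6, 7)]  # New York, New York City, De Blasio
triples = [((6, 7), "live in", (0, 1)), ((6, 7), "live in", (0, 2))]

eh_to_et, sh_to_oh, st_to_ot = build_tags(entities, triples)
print(sh_to_oh["live in"])
print(st_to_ot["live in"])
print(upper_triangle_size(len(tokens)))
```

Because the subject "De Blasio" appears after the object in this sentence, every relation cell is mirrored into the upper triangle and carries tag 2; with the entities in the opposite order the same cells would carry tag 1.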
Prior works show that joint learning can result in a noticeable performance gain. However, they usually involve sequential interrelated steps and suffer from the problem of exposure bias. This discrepancy leads to error accumulation. To mitigate this issue, we propose in this paper a one-stage joint extraction model, namely, TPLinker, which is capable of discovering overlapping relations sharing one or both entities while immune from the exposure bias.
To validate the utility of our handshaking tagging scheme, we ablate the BERT and use BiLSTM as the substituted encoder to output results. It can be seen that TPLinker is still very competitive to existing state-of-the-art models.
TPLinker: Single-stage Joint Extraction of Entities and Relations Through Token Pair Linking
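Motivation: In prior work, entity recognition and relation classification are not done in one stage; most models are cascaded frameworks, which suffer from exposure bias / error accumulation / distribution shift at test time. This work solves these problems within a one-stage framework, and also handles overlapping relations (and nested entities), such as SEO (triples that share a subject or object) and EPO (the same entity pair with different relation types).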
Task: Relation Extraction (NYT/WebNLG)
Key information
New knowledge
\mathcal{D}=\{"New":[["New", "City"], ["New", "York"]], "De":[["De", "Blasio"]]\}
\mathcal{E}=\{"Blasio":[["Blasio", "York"], ["Blasio", "City"]]\}
Add the entries of \mathcal{D} to the candidate sets:
Set_{subject}=Set([["New", "City"], ["New", "York"]], [["De", "Blasio"]])
Set_{object}=Set([["De", "Blasio"]],[["New", "York"], ["New", "City"]])
Then look up the tail tokens in \mathcal{E}. If the tail pair is present, add the triple to the result set, e.g. for the live in relation:
\mathcal{T}=Set(["De\ Blasio", "New\ York", "live\ in"])
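The lookup steps above can be sketched as a small decoding routine. This is an illustrative simplification, not the paper's code: it works on token strings as in the note (a real implementation would use token positions, since tokens can repeat), and the SH-to-OH pair `("De", "New")` as well as the function name `decode_relation` are assumptions introduced for the example.

```python
from typing import Dict, List, Set, Tuple

Entity = Tuple[str, str]               # (head token, tail token)
Triple = Tuple[Entity, Entity, str]    # (subject, object, relation)


def decode_relation(
    D: Dict[str, List[List[str]]],
    E: Dict[str, List[List[str]]],
    sh_to_oh: List[Tuple[str, str]],
    relation: str,
) -> Set[Triple]:
    """Combine head links, the entity dictionary and tail links into triples."""
    triples: Set[Triple] = set()
    for subj_head, obj_head in sh_to_oh:
        # Candidate subjects and objects: every entity starting at the linked heads.
        for s_head, s_tail in D.get(subj_head, []):
            for o_head, o_tail in D.get(obj_head, []):
                # Keep the pair only if its tails are also linked in E.
                if [s_tail, o_tail] in E.get(s_tail, []):
                    triples.add(((s_head, s_tail), (o_head, o_tail), relation))
    return triples


# Dictionaries taken from the note; the head link is a hypothetical input.
D = {"New": [["New", "City"], ["New", "York"]], "De": [["De", "Blasio"]]}
E = {"Blasio": [["Blasio", "York"], ["Blasio", "City"]]}
sh_to_oh = [("De", "New")]  # assumed SH-to-OH link for "live in"

T = decode_relation(D, E, sh_to_oh, "live in")
for triple in sorted(T):
    print(triple)
```

With these inputs both "New York" and "New York City" pass the tail check, so each yields a "live in" triple with "De Blasio" as the subject.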
Related work for further study
EPO problem:
AAAI 2020: Effective modeling of encoder-decoder architecture for joint entity and relation extraction. (MT)
ACL 2020: A novel cascade binary tagging framework for relational triple extraction. (Cascade)
Experimental validation tasks
✏️ Notable sentences and passages