131250208 / TPlinker-joint-extraction

Which match pattern should be used for Chinese datasets? #54

Open macheng6 opened 3 years ago

131250208 commented 3 years ago

It depends on your needs: if the ground truth includes spans, use whole_span; if it does not, use whole_text.
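To make the distinction concrete, here is a minimal sketch of how the two match patterns could differ when comparing a predicted relation with a gold one. The helper `relation_key` is hypothetical and is not taken from the TPLinker code; the field names follow the data format quoted later in this thread, and the shifted span in `pred` is a made-up value for illustration.

```python
def relation_key(rel, match_pattern):
    """Build the tuple used to compare two relations (hypothetical helper)."""
    if match_pattern == "whole_text":
        # Compare only the surface strings of subject and object.
        return (rel["subject"], rel["predicate"], rel["object"])
    if match_pattern == "whole_span":
        # Compare positions, so the same string at another offset does not match.
        return (tuple(rel["subj_tok_span"]), rel["predicate"], tuple(rel["obj_tok_span"]))
    raise ValueError(f"unsupported match pattern: {match_pattern}")


gold = {
    "subject": "邪少兵王", "predicate": "作者", "object": "冰火未央",
    "subj_tok_span": [1, 5], "obj_tok_span": [7, 11],
}
# Same strings, but the object span is shifted (made-up value for illustration).
pred = dict(gold, obj_tok_span=[8, 12])

same_text = relation_key(gold, "whole_text") == relation_key(pred, "whole_text")
same_span = relation_key(gold, "whole_span") == relation_key(pred, "whole_span")
print(same_text, same_span)
```

Under these assumptions, span matching is the stricter criterion, which is why it only makes sense when the ground truth actually carries spans.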

macheng6 commented 3 years ago

OK, thanks.

lzh1998-jansen commented 1 year ago

For training data in the following format, should I choose "whole_span"?

{"id": "train_0", "text": "《邪少兵王》是冰火未央写的网络小说连载于旗峰天下", "relation_list": [{"subject": "邪少兵王", "object": "冰火未央", "subj_char_span": [1, 5], "obj_char_span": [7, 11], "predicate": "作者", "subj_tok_span": [1, 5], "obj_tok_span": [7, 11]}], "entity_list": [{"text": "邪少兵王", "type": "图书作品", "char_span": [1, 5], "tok_span": [1, 5]}, {"text": "冰火未央", "type": "人物", "char_span": [7, 11], "tok_span": [7, 11]}]}
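One way to check whether a sample like this carries usable spans is to verify that each character span slices back to the annotated string. The sketch below assumes spans are half-open `[start, end)` offsets into `text`, which is consistent with this sample; `spans_consistent` is a hypothetical helper, not part of TPLinker.

```python
sample = {
    "id": "train_0",
    "text": "《邪少兵王》是冰火未央写的网络小说连载于旗峰天下",
    "relation_list": [{
        "subject": "邪少兵王", "object": "冰火未央",
        "subj_char_span": [1, 5], "obj_char_span": [7, 11],
        "predicate": "作者",
        "subj_tok_span": [1, 5], "obj_tok_span": [7, 11],
    }],
    "entity_list": [
        {"text": "邪少兵王", "type": "图书作品", "char_span": [1, 5], "tok_span": [1, 5]},
        {"text": "冰火未央", "type": "人物", "char_span": [7, 11], "tok_span": [7, 11]},
    ],
}


def spans_consistent(sample):
    """Return True if every char span slices back to its annotated string."""
    text = sample["text"]
    for rel in sample["relation_list"]:
        s0, s1 = rel["subj_char_span"]
        o0, o1 = rel["obj_char_span"]
        if text[s0:s1] != rel["subject"] or text[o0:o1] != rel["object"]:
            return False
    for ent in sample["entity_list"]:
        e0, e1 = ent["char_span"]
        if text[e0:e1] != ent["text"]:
            return False
    return True


print(spans_consistent(sample))
```

If the check passes across the dataset, the ground truth does include spans, which is the case the maintainer's answer above associates with whole_span.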