Open nlper01 opened 1 year ago
Could there be zero-length examples in your data?
I suspected that too, so I checked the data, but I found no zero-length examples.
Here is my raw data:
[ { "doc_id": "1", "paragraphs": [ { "paragraph_id": "0", "sentences": [ { "sentence_id": "0", "sentence": "中国成人2型糖尿病胰岛素促泌剂应用的专家共识", "start_idx": 0, "end_idx": 22, "entities": [ { "entity_id": "T0", "entity": "2型糖尿病", "entity_type": "Disease", "start_idx": 4, "end_idx": 9 }, { "entity_id": "T1", "entity": "2型", "entity_type": "Class", "start_idx": 4, "end_idx": 6 }, { "entity_id": "T2", "entity": "胰岛素促泌剂", "entity_type": "Drug", "start_idx": 9, "end_idx": 15 } ], "relations": [ { "relation_type": "Drug_Disease", "relation_id": "R0", "head_entity_id": "T2", "tail_entity_id": "T0" }, { "relation_type": "Class_Disease", "relation_id": "R1", "head_entity_id": "T1", "tail_entity_id": "T0" } ] } ] } ]
I found no sentences with length 0.
After converting it to the format the model expects, it looks like this:
{"doc_id": "1", "paragraph_id": "0", "tokens": ["中", "国", "成", "人", "2", "型", "糖", "尿", "病", "胰", "岛", "素", "促", "泌", "剂", "应", "用", "的", "专", "家", "共", "识"], "entity_mentions": [{"entity_type": "Disease", "start": 4, "end": 9, "text": "2型糖尿病"}, {"entity_type": "Class", "start": 4, "end": 6, "text": "2型"}, {"entity_type": "Drug", "start": 9, "end": 15, "text": "胰岛素促泌剂"}]}
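For reference, the conversion from the raw format to the flat format above can be sketched roughly as follows. This is a minimal sketch based only on the two samples in this thread, not the repository's own preprocessing script: the function name `convert_document` is hypothetical, and it assumes character-level tokens and one output record per sentence.

```python
def convert_document(doc):
    """Flatten one raw document into per-sentence records.

    Assumption: the layout matches the two samples shown in this thread.
    """
    records = []
    for paragraph in doc["paragraphs"]:
        for sentence in paragraph["sentences"]:
            records.append({
                "doc_id": doc["doc_id"],
                "paragraph_id": paragraph["paragraph_id"],
                # Character-level tokens: one token per character.
                "tokens": list(sentence["sentence"]),
                "entity_mentions": [
                    {
                        "entity_type": entity["entity_type"],
                        "start": entity["start_idx"],
                        "end": entity["end_idx"],
                        "text": entity["entity"],
                    }
                    for entity in sentence["entities"]
                ],
            })
    return records
```

Because the tokens are single characters, the entity offsets carry over unchanged, so `"".join(tokens[start:end])` should reproduce each mention's `text`.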
Could I send you my data so you can take a look? I'm stuck, and if I can't get this to run I won't be able to cite your work.
The record that triggers the problem isn't this one, is it?
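One way to locate the offending record is to scan the whole converted file rather than inspect a single example. The following is a rough sketch, assuming the flat format shown earlier in this thread (`tokens` plus `entity_mentions` with `start`, `end`, and `text`); `find_suspect_records` is a hypothetical helper, not part of the project.

```python
def find_suspect_records(records):
    """Return (index, reason) pairs for records that look malformed."""
    problems = []
    for i, record in enumerate(records):
        tokens = record.get("tokens", [])
        if len(tokens) == 0:
            problems.append((i, "empty tokens"))
            continue
        for mention in record.get("entity_mentions", []):
            start, end = mention["start"], mention["end"]
            # A valid span is non-empty and lies inside the token list.
            if not (0 <= start < end <= len(tokens)):
                problems.append((i, f"bad span {start}:{end}"))
            # The span should spell out the mention text (character tokens).
            elif "".join(tokens[start:end]) != mention["text"]:
                problems.append((i, f"text mismatch at {start}:{end}"))
    return problems
```

If the converted file holds one JSON object per line, the records can be loaded with `json.loads` on each line and passed to this function; an empty result means none of these particular checks fired.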
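I hit the same problem with my data, but in my case it happens when an input exceeds max_len. The truncation logic in this code has a bug.
At this point max_len can end up being 0.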
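The project's truncation code is not shown in this thread, so the following is only a generic sketch of a guard against that case, not a patch for the actual bug. It assumes the flat record format shown earlier; `truncate_record` is a hypothetical name, and dropping mentions that cross the cut is one possible policy among several.

```python
def truncate_record(record, max_len):
    """Clip a record to max_len tokens, refusing a non-positive limit."""
    if max_len <= 0:
        # Fail loudly instead of silently producing an empty sequence.
        raise ValueError(f"max_len must be positive, got {max_len}")
    tokens = record["tokens"][:max_len]
    # Keep only mentions that still fit entirely inside the clipped tokens.
    mentions = [
        m for m in record["entity_mentions"] if m["end"] <= len(tokens)
    ]
    return {**record, "tokens": tokens, "entity_mentions": mentions}
```

Raising early makes it easier to find out where the length limit is being computed as 0, instead of hitting a failure on an empty sequence further downstream.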