yhcc / CNN_Nested_NER


How do I fix this error? Why does max_len end up being 0? #10

Open nlper01 opened 1 year ago

nlper01 commented 1 year ago
 bsz, max_len, dim = h.size()
 h = h.reshape(bsz, max_len, self.n_head, -1)

max_len ends up being 0 at this point.
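For context, `reshape` can only infer a `-1` dimension by dividing the total element count by the product of the other dimensions. When `max_len` is 0, both numbers are 0, so any size would fit, and PyTorch raises an error instead of guessing. The helper below, `infer_minus_one`, is a hypothetical pure-Python sketch of that arithmetic, not PyTorch's actual implementation:

```python
from math import prod

def infer_minus_one(total_elems, shape):
    """Fill in the single -1 entry of `shape`, following reshape's inference rule.

    Hypothetical helper for illustration only.
    """
    if shape.count(-1) != 1:
        raise ValueError("expected exactly one -1 in shape")
    known = prod(d for d in shape if d != -1)
    if known == 0:
        # 0 * anything == 0, so every value of the -1 dimension would fit.
        raise ValueError("ambiguous: cannot infer -1 when the other dims multiply to 0")
    if total_elems % known != 0:
        raise ValueError("shape is incompatible with the number of elements")
    inferred = total_elems // known
    return tuple(inferred if d == -1 else d for d in shape)

# bsz=2, max_len=5, dim=8, n_head=4: the head dimension is inferred as 2.
print(infer_minus_one(2 * 5 * 8, (2, 5, 4, -1)))

# bsz=2, max_len=0: the tensor has no elements, so the -1 cannot be inferred.
try:
    infer_minus_one(0, (2, 0, 4, -1))
except ValueError as err:
    print("error:", err)
```

So the reshape itself is only where the problem surfaces; the cause is whatever produced a batch whose sequence dimension is empty.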


yhcc commented 1 year ago

Could it be that the data contains a sample of length 0?

nlper01 commented 1 year ago

> Could it be that the data contains a sample of length 0?

I suspected that too, but I checked the data and found no samples of length 0.

nlper01 commented 1 year ago

> Could it be that the data contains a sample of length 0?

Here is my raw data:

[
  {
    "doc_id": "1",
    "paragraphs": [
      {
        "paragraph_id": "0",
        "sentences": [
          {
            "sentence_id": "0",
            "sentence": "中国成人2型糖尿病胰岛素促泌剂应用的专家共识",
            "start_idx": 0,
            "end_idx": 22,
            "entities": [
              {
                "entity_id": "T0",
                "entity": "2型糖尿病",
                "entity_type": "Disease",
                "start_idx": 4,
                "end_idx": 9
              },
              {
                "entity_id": "T1",
                "entity": "2型",
                "entity_type": "Class",
                "start_idx": 4,
                "end_idx": 6
              },
              {
                "entity_id": "T2",
                "entity": "胰岛素促泌剂",
                "entity_type": "Drug",
                "start_idx": 9,
                "end_idx": 15
              }
            ],
            "relations": [
              {
                "relation_type": "Drug_Disease",
                "relation_id": "R0",
                "head_entity_id": "T2",
                "tail_entity_id": "T0"
              },
              {
                "relation_type": "Class_Disease",
                "relation_id": "R1",
                "head_entity_id": "T1",
                "tail_entity_id": "T0"
              }
            ]
          }
        ]
      }
    ]
  }
]

I did not find any sentence of length 0.
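A check like this can be scripted against the raw format shown above. The following is a minimal sketch: the function name `find_empty_sentences` is hypothetical, and it treats whitespace-only sentences as empty on the assumption that they could also end up with no tokens after conversion.

```python
def find_empty_sentences(docs):
    """Yield (doc_id, paragraph_id, sentence_id) for sentences with no visible characters."""
    for doc in docs:
        for para in doc.get("paragraphs", []):
            for sent in para.get("sentences", []):
                text = sent.get("sentence", "")
                # Assumption: whitespace-only text counts as empty.
                if len(text.strip()) == 0:
                    yield doc.get("doc_id"), para.get("paragraph_id"), sent.get("sentence_id")

# Tiny inline example; real data would be loaded with json.load().
docs = [
    {
        "doc_id": "1",
        "paragraphs": [
            {
                "paragraph_id": "0",
                "sentences": [
                    {"sentence_id": "0", "sentence": "中国成人2型糖尿病"},
                    {"sentence_id": "1", "sentence": "  "},
                ],
            }
        ],
    }
]
print(list(find_empty_sentences(docs)))
```

An empty result from the raw file does not rule out empty samples later in the pipeline, since conversion or truncation could still produce them.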

Converted to the format the model expects, it looks like this:

{"doc_id": "1", "paragraph_id": "0", "tokens": ["中", "国", "成", "人", "2", "型", "糖", "尿", "病", "胰", "岛", "素", "促", "泌", "剂", "应", "用", "的", "专", "家", "共", "识"], "entity_mentions": [{"entity_type": "Disease", "start": 4, "end": 9, "text": "2型糖尿病"}, {"entity_type": "Class", "start": 4, "end": 6, "text": "2型"}, {"entity_type": "Drug", "start": 9, "end": 15, "text": "胰岛素促泌剂"}]}
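The conversion script itself is not shown in the thread. As a rough sketch of what such a step might look like, the hypothetical `convert_sentence` below splits the sentence into one token per character and copies the entity spans. It assumes entity offsets are relative to the sentence and end-exclusive, which matches the sample but may not hold for every record.

```python
def convert_sentence(doc_id, paragraph_id, sent):
    """Turn one raw sentence dict into the token/mention record shown above."""
    tokens = list(sent["sentence"])  # one token per character
    mentions = []
    for ent in sent.get("entities", []):
        start, end = ent["start_idx"], ent["end_idx"]
        # Sanity check: the span must reproduce the annotated entity text.
        if "".join(tokens[start:end]) != ent["entity"]:
            raise ValueError(f"span mismatch for entity {ent.get('entity_id')}")
        mentions.append(
            {"entity_type": ent["entity_type"], "start": start, "end": end, "text": ent["entity"]}
        )
    return {
        "doc_id": doc_id,
        "paragraph_id": paragraph_id,
        "tokens": tokens,
        "entity_mentions": mentions,
    }

sample = {
    "sentence_id": "0",
    "sentence": "中国成人2型糖尿病胰岛素促泌剂应用的专家共识",
    "entities": [
        {"entity_id": "T0", "entity": "2型糖尿病", "entity_type": "Disease", "start_idx": 4, "end_idx": 9},
        {"entity_id": "T2", "entity": "胰岛素促泌剂", "entity_type": "Drug", "start_idx": 9, "end_idx": 15},
    ],
}
record = convert_sentence("1", "0", sample)
print(len(record["tokens"]), record["entity_mentions"])
```

The span check is there so that an offset mismatch fails loudly during conversion instead of surfacing later during training.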

nlper01 commented 1 year ago

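> Could it be that the data contains a sample of length 0?

Could I send you my data so you can take a look? I'm stuck, and if I can't get this running I won't be able to cite your work.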

qq594495953 commented 1 year ago

The sample that triggers the problem isn't this one, is it?

qq594495953 commented 1 year ago

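I hit the same problem on my data, but in my case it happened when a sample exceeded max_len. The truncation logic in this code seems to be somewhat buggy.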
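The repository's truncation code is not quoted in this thread, so the following is only a hypothetical sketch of one defensive way to truncate a record in the converted format. The names `truncate_record` and `drop_empty` are made up for illustration. The sketch keeps the first `max_len` tokens, drops entity mentions that no longer fit, and filters out any record left with no tokens, so that a zero-length sequence does not reach the model.

```python
def truncate_record(record, max_len):
    """Keep the first `max_len` tokens and drop mentions that fall outside them."""
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    tokens = record["tokens"][:max_len]
    mentions = [
        m for m in record["entity_mentions"]
        if 0 <= m["start"] < m["end"] <= len(tokens)  # end is exclusive
    ]
    return {**record, "tokens": tokens, "entity_mentions": mentions}

def drop_empty(records):
    """Filter out records whose token list is empty."""
    return [r for r in records if len(r["tokens"]) > 0]

record = {
    "tokens": list("中国成人2型糖尿病胰岛素促泌剂应用的专家共识"),
    "entity_mentions": [
        {"entity_type": "Disease", "start": 4, "end": 9, "text": "2型糖尿病"},
        {"entity_type": "Drug", "start": 9, "end": 15, "text": "胰岛素促泌剂"},
    ],
}
short = truncate_record(record, max_len=10)
print(len(short["tokens"]), [m["text"] for m in short["entity_mentions"]])
print(len(drop_empty([short, {"tokens": [], "entity_mentions": []}])))
```

Dropping a mention that straddles the cut is one possible policy; splitting long samples into several windows would preserve more annotations at the cost of extra bookkeeping.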