Open lijun-1999 opened 10 months ago
Reproducing the code with the NYT and WebNLG datasets works without errors. With my own data, which I converted to the NYT dataset format, I get this error.
Changing the list_index code to the following:

    def list_index(list1: list, list2: list) -> tuple:
        if not list1 or not list2:
            return -1, -1  # return an invalid index
        start = [i for i, x in enumerate(list2) if x == list1[0]]
        end = [i for i, x in enumerate(list2) if x == list1[-1]]
        index = (-1, -1)  # initialise the index so it is always defined
        if len(start) == 1 and len(end) == 1:
            return start[0], end[0]
        else:
            for i in start:
                for j in end:
                    if i <= j:
                        if list2[i:j+1] == list1:
                            index = (i, j)
                            break
        return index[0], index[1]
fixes that error.
However, after this change a new error appears:
DATA SUMMARY END.
=== Epoch 0 train ===
/pytorch/aten/src/ATen/native/cuda/Loss.cu:247: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [10,0,0] Assertion `t >= 0 && t < n_classes` failed.
Traceback (most recent call last):
  File "/root/miniconda3/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/root/miniconda3/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/root/autodl-tmp/SPN4RE/Nr_Partial_ch_SPN4RE-main/main.py", line 102, in
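The assertion in this log fires when a target class index passed to the loss is negative or not smaller than the number of classes. One possible contributor, assuming the caller stores whatever the patched list_index returns as a span target, is the (-1, -1) sentinel reaching the loss. The sketch below is not the repository's code: `find_span` and `keep_valid_spans` are hypothetical names, and it only illustrates skipping triples whose entity tokens cannot be located.

```python
from typing import List, Sequence, Tuple


def find_span(needle: Sequence, haystack: Sequence) -> Tuple[int, int]:
    """Return (start, end) of the first contiguous occurrence of needle
    in haystack, or (-1, -1) if it does not occur."""
    if not needle or not haystack:
        return -1, -1
    n = len(needle)
    for start in range(len(haystack) - n + 1):
        if list(haystack[start:start + n]) == list(needle):
            return start, start + n - 1
    return -1, -1


def keep_valid_spans(triples, tokens) -> List[tuple]:
    """Hypothetical helper: keep only triples whose head and tail
    token lists are both found in tokens."""
    kept = []
    for relation, head, tail in triples:
        head_start, head_end = find_span(head, tokens)
        tail_start, tail_end = find_span(tail, tokens)
        if head_start < 0 or tail_start < 0:
            # An unmatched entity would otherwise become a -1 target index.
            continue
        kept.append((relation, head_start, head_end, tail_start, tail_end))
    return kept
```

Whether dropping such triples is acceptable depends on the dataset; fixing the underlying mismatch in the data is usually the cleaner option.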
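I ran into the same error. The most likely cause is that the text in em1Text or em2Text does not match sentText, for example because the case differs. Special characters can also cause the error.
In the data_process function in ./utils/functions.py, inside the loop
for i in range(len(lines)):
you can print the value of i to find exactly which lines of the file have these problems. Fixing or simply deleting those lines should resolve the issue.
Here are a few of the bad records I found: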
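As an alternative to printing i during training, the data file can be scanned ahead of time. The following is a minimal sketch, assuming one JSON object per line with the sentText, relationMentions, em1Text and em2Text fields seen in this thread; `find_mismatched_lines` is a hypothetical name. It uses a plain case-sensitive substring check, which is only a proxy for the token-level matching the tokenizer performs, so it can miss some problem lines.

```python
import json
from typing import Iterable, List, Tuple


def find_mismatched_lines(lines: Iterable[str]) -> List[Tuple[int, str, str]]:
    """Return (line_index, field, text) for every entity mention
    that does not occur verbatim in its sentence."""
    problems = []
    for i, line in enumerate(lines):
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        sample = json.loads(line)
        sentence = sample["sentText"]
        for mention in sample["relationMentions"]:
            for field in ("em1Text", "em2Text"):
                # Case-sensitive: "Fluvoxamine" does not match "fluvoxamine".
                if mention[field] not in sentence:
                    problems.append((i, field, mention[field]))
    return problems


# Usage sketch (the path is a placeholder, not taken from the repository):
# with open("train.json", encoding="utf-8") as f:
#     for problem in find_mismatched_lines(f):
#         print(problem)
```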
{"sentText": "D-penicillamine in the treatment of rheumatoid arthritis .", "relationMentions": [{"em1Text": "[D-penicillamine", "em2Text": "rheumatoid arthritis", "label": "/chemical/disease/other"}]}
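This one probably fails because of the special character (the stray "[" in em1Text), which causes the tokens to be split incorrectly.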
{"sentText": "Various dependent variables measuring depression showed no significant relapse-preventing effects of fluvoxamine , but only positive trends", "relationMentions": [{"em1Text": "fluvoxamine", "em2Text": "depression", "label": "/chemical/disease/other"}, {"em1Text": "Fluvoxamine", "em2Text": "depression", "label": "/chemical/disease/other"}]}
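This one is caused by the capitalisation of em1Text: the second relation has "Fluvoxamine", while sentText has "fluvoxamine".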
{"sentText": "Strontium 87mSr bone scanning for the evaluation of total hip replacement.In a series of seventeen patients with unilateral osteoarthritis of the hip a scintiscanning follow-up study was made before and after total hip replacement for the assessment of the normal course of the 87mSr-scintiscan", "relationMentions": [{"em1Text": "mSr", "em2Text": "unilateral osteoarthritis", "label": "/gene/disease/related"}, {"em1Text": "hip", "em2Text": "unilateral osteoarthritis", "label": "/gene/disease/related"}, {"em1Text": "hip", "em2Text": "unilateral osteoarthritis", "label": "/gene/disease/related"}, {"em1Text": "hip", "em2Text": "unilateral osteoarthritis", "label": "/gene/disease/related"}]}
This one is caused by duplicated relations.
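If repeated relations are indeed a problem, they can be removed before training. A minimal sketch, assuming each entry of relationMentions is a dict with em1Text, em2Text and label keys as in the records above; `dedupe_mentions` is a hypothetical helper, not part of the repository.

```python
from typing import Dict, List


def dedupe_mentions(mentions: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Drop repeated (em1Text, em2Text, label) triples, keeping first occurrences in order."""
    seen = set()
    unique = []
    for mention in mentions:
        key = (mention["em1Text"], mention["em2Text"], mention["label"])
        if key not in seen:
            seen.add(key)
            unique.append(mention)
    return unique
```

Applied to the last record above, this would keep one "mSr" relation and one "hip" relation.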
How should this error be solved?
Traceback (most recent call last):
  File "/root/miniconda3/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/root/miniconda3/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/root/autodl-tmp/SPN4RE/Nr_Partial_ch_SPN4RE-main/main.py", line 97, in
    data = build_data(args)
  File "/root/autodl-tmp/SPN4RE/Nr_Partial_ch_SPN4RE-main/utils/data.py", line 47, in build_data
    data.generate_instance(args, data_process)
  File "/root/autodl-tmp/SPN4RE/Nr_Partial_ch_SPN4RE-main/utils/data.py", line 31, in generate_instance
    self.train_loader = data_process(args.train_file, self.relational_alphabet, tokenizer)
  File "/root/autodl-tmp/SPN4RE/Nr_Partial_ch_SPN4RE-main/utils/functions.py", line 77, in data_process
    tail_start_index, tail_end_index = list_index(tail_token, token_sent)
  File "/root/autodl-tmp/SPN4RE/Nr_Partial_ch_SPN4RE-main/utils/functions.py", line 16, in list_index
    return index[0], index[1]
UnboundLocalError: local variable 'index' referenced before assignment
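Python raises UnboundLocalError when a function reads a local name that no executed statement has assigned, which here suggests the search never found a match for the entity tokens. A minimal illustration of the mechanism, using hypothetical functions unrelated to the repository:

```python
def last_match(items, target):
    for i, x in enumerate(items):
        if x == target:
            index = i  # only assigned when a match is found
    return index  # raises UnboundLocalError if nothing matched


def last_match_safe(items, target):
    index = -1  # default value, so the name is always bound
    for i, x in enumerate(items):
        if x == target:
            index = i
    return index
```

Giving the variable a default removes the exception, but the caller then has to handle the -1 result explicitly.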