Open Adrian-Yan16 opened 4 years ago
What exactly is the error you are seeing?
The data to be predicted has to be labeled data, not raw (unlabeled) data.
From: Yiqun Liu. Sent: 9 October 2020, 14:39. To: PaddlePaddle/models. Cc: Adrian-Yan16; Author. Subject: Re: [PaddlePaddle/models] Running prediction with run_ernie.sh infer: data format problem (#4894)
I ran into the same problem. After replacing the test_data file with infer.tsv, I get the following error:
Traceback (most recent call last):
  File "run_ernie_sequence_labeling.py", line 316, in <module>
    do_infer(args)
  File "run_ernie_sequence_labeling.py", line 264, in do_infer
    mode='test')
  File "/home/lihui/github_clone/models/PaddleNLP/lexical_analysis/creator.py", line 153, in create_pyreader
    phase=mode),
  File "../shared_modules/preprocess/ernie/task_reader.py", line 222, in data_generator
    examples = self._read_tsv(input_file)
  File "../shared_modules/preprocess/ernie/task_reader.py", line 85, in _read_tsv
    Example = namedtuple('Example', headers)
  File "/home/lihui/anaconda3/envs/paddle/lib/python3.6/collections/__init__.py", line 401, in namedtuple
    'identifiers: %r' % name)
ValueError: Type names and field names must be valid identifiers: "['百余名诺贝尔奖得主联合签名支持转基因作物,中国两院士签名,,,宁夏在线']"
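For anyone trying to understand the traceback: it looks like the first line of the input file is being passed to `namedtuple` as column names, so a raw sentence fails the identifier check. Below is a minimal reproduction of that failure mode, not the repository code. The helper `read_tsv_with_header` and the sample strings are made up for illustration, and the `text_a`/`label` header is only an assumed example of a labeled file.

```python
import csv
import io
from collections import namedtuple


def read_tsv_with_header(text):
    """Hypothetical helper: treat the first TSV row as the column names."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    headers = next(reader)  # the first row is assumed to be a header
    Example = namedtuple("Example", headers)  # field names must be valid identifiers
    return [Example(*row) for row in reader]


# Assumed shape of a labeled file: a header row, then text and tags per line.
labeled = "text_a\tlabel\nsome text\tsome tags\n"
# Assumed shape of an unlabeled file: raw sentences only, no header row.
unlabeled = "a raw sentence, with no header row\nanother raw sentence\n"

print(read_tsv_with_header(labeled))

try:
    read_tsv_with_header(unlabeled)
except ValueError as err:
    # The raw sentence was used as a field name, which namedtuple rejects.
    print("ValueError:", err)
```

With the labeled sample the header row yields valid field names. With the unlabeled sample the first sentence ends up as a field name and `namedtuple` raises `ValueError`, which matches the kind of error shown above.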
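In the 1.8 release of models, line 222 of ./models-release-1.8/PaddleNLP/shared_modules/preprocess/ernie/task_reader.py does not distinguish between modes. As a result, data is always read in the train format, so infer data cannot be read. Could you please fix this bug?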
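One possible shape for a fix is to branch on the mode before parsing the file. This is only a sketch under assumptions and not the actual `task_reader.py` code: the function `read_tsv`, the `mode` values, and the single `text_a` field for infer input are all hypothetical.

```python
import csv
import io
from collections import namedtuple


def read_tsv(text, mode="train"):
    """Hypothetical mode-aware reader; not the actual task_reader.py code."""
    rows = list(csv.reader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE))
    if mode == "infer":
        # Assumption: infer files have no header row and one raw text per line.
        Example = namedtuple("Example", ["text_a"])
        return [Example(row[0]) for row in rows if row]
    # Assumption: train/test files start with a header row naming the columns.
    headers, body = rows[0], rows[1:]
    Example = namedtuple("Example", headers)
    return [Example(*row) for row in body]


# Usage: raw sentences in infer mode, header plus labeled rows otherwise.
print(read_tsv("first raw sentence\nsecond raw sentence\n", mode="infer"))
print(read_tsv("text_a\tlabel\nsome text\tsome tags\n", mode="train"))
```

The point of the design is that the header row is only consumed when the file is expected to have one, so unlabeled input never reaches `namedtuple` as field names.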
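After training the model, I ran
bash run_ernie.sh infer
to make predictions. Shouldn't the data used for prediction be unlabeled? Why can only labeled data be used for prediction here, while feeding in an unlabeled corpus raises an error?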