THUNLP-MT / THUMT

An open-source neural machine translation toolkit developed by Tsinghua Natural Language Processing Group
BSD 3-Clause "New" or "Revised" License

Hello, why does training fail with KeyError: '<unk>'? #88

Closed: edwardelric1202 closed this issue 4 years ago

edwardelric1202 commented 4 years ago

The dataset is WMT17, preprocessed by following the user manual. This error appears right at the start of training:

    Traceback (most recent call last):
      File "/home/cma/cy.he/self_code/fairseq-master/THUMT/thumt/bin/trainer.py", line 512, in <module>
        main(parse_args())
      File "/home/cma/cy.he/self_code/fairseq-master/THUMT/thumt/bin/trainer.py", line 359, in main
        features = dataset.get_training_input(params.input, params)
      File "/home/cma/cy.he/self_code/fairseq-master/THUMT/thumt/data/dataset.py", line 141, in get_training_input
        default_value=params.mapping["source"][params.unk]
    KeyError: '<unk>'

How can I fix this? Is it a problem with the dataset?

JasmineChen123 commented 2 years ago

How to solve this problem?
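
The thread has no confirmed answer, but the traceback does show where the failure happens: the lookup `params.mapping["source"][params.unk]` raises `KeyError`, which is what you would see if the source vocabulary had no `<unk>` entry. One way to check is to inspect the vocabulary files passed to the trainer. Below is a minimal sketch, assuming the vocabulary is a plain text file with one token per line (optionally followed by a count) and that the expected control tokens are `<pad>`, `<eos>` and `<unk>`. The helper names are hypothetical and not part of THUMT.

```python
# Hypothetical helpers (not part of THUMT) for checking a vocabulary file.
# Assumption: one token per line, optionally followed by a count.
# Assumption: the trainer expects these control tokens to be present.
CONTROL_TOKENS = ("<pad>", "<eos>", "<unk>")


def read_vocab(path):
    """Return the tokens of a vocabulary file, taking the first field of each line."""
    tokens = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            fields = line.split()
            if fields:  # skip blank lines
                tokens.append(fields[0])
    return tokens


def missing_control_tokens(tokens, control=CONTROL_TOKENS):
    """Return the control tokens that do not occur in the vocabulary."""
    present = set(tokens)
    return [tok for tok in control if tok not in present]


def with_control_tokens(tokens, control=CONTROL_TOKENS):
    """Return a new token list with any missing control tokens prepended."""
    return missing_control_tokens(tokens, control) + list(tokens)


# Example with an in-memory vocabulary that lacks "<unk>":
vocab = ["<pad>", "<eos>", "the", "of"]
print(missing_control_tokens(vocab))
```

If `<unk>` turns up as missing, the vocabulary was probably built by a different tool or with different control symbols. Rebuilding it with the toolkit's own vocabulary script is likely safer than patching the file by hand, because prepending tokens shifts every token index.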