crownpku / Information-Extraction-Chinese

Chinese Named Entity Recognition with IDCNN/biLSTM+CRF, and Relation Extraction with biGRU+2ATT (Chinese entity recognition and relation extraction)
2.22k stars 813 forks

IDCNN NER raises an error during eval: UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa3 in position 0: invalid start byte #110

Open bigcat2333 opened 5 years ago

bigcat2333 commented 5 years ago

The error is as follows: UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa3 in position 0: invalid start byte. How can I fix this?

sissilaux commented 5 years ago

Set the encoding of the text file to 'utf-8' as well.
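One way to do that is to re-save the file as UTF-8 before running eval. The sketch below is only an illustration: the helper name `convert_to_utf8` is hypothetical, and the default source encoding `gbk` is an assumption (a leading byte 0xa3 is typical of GBK full-width punctuation, but the thread does not say what encoding the file actually uses).

```python
import codecs


def convert_to_utf8(src_path, dst_path, src_encoding="gbk"):
    """Re-encode a text file to UTF-8.

    src_encoding is an assumption; replace it with the real encoding
    of your data file if it is not GBK.
    """
    # Decode with the (assumed) original encoding.
    with codecs.open(src_path, "r", src_encoding) as src:
        text = src.read()
    # Write the same text back out as UTF-8.
    with codecs.open(dst_path, "w", "utf-8") as dst:
        dst.write(text)
    return text
```

If the assumed source encoding is wrong, the decode step will either fail or produce garbled text, so it is worth checking the converted file by eye.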

zhangjingyizhc commented 4 years ago

In conlleval.py, I changed utf8 to unicode_escape in the statement with codecs.open(input_file, "r", "utf8") as f:, and it then ran correctly.
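A caveat on this workaround: `unicode_escape` never fails on high bytes because it maps each of them to the Latin-1 character with the same code point, so non-ASCII text is read without an error but not necessarily as the original characters. The small sketch below illustrates this with the two GBK bytes of a full-width comma; the helper name `decode_variants` and the choice of GBK are assumptions for the demonstration, not something taken from the repository.

```python
def decode_variants(raw):
    """Decode the same bytes three ways and report what each codec yields."""
    results = {}
    try:
        results["utf-8"] = raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        # This is the failure reported in the issue.
        results["utf-8"] = "error: " + exc.reason
    # unicode_escape treats bytes >= 0x80 as Latin-1 code points.
    results["unicode_escape"] = raw.decode("unicode_escape")
    # Assumed real encoding of the data, for comparison.
    results["gbk"] = raw.decode("gbk")
    return results


if __name__ == "__main__":
    sample = u"\uff0c".encode("gbk")  # full-width comma, bytes 0xa3 0xac
    for codec, value in decode_variants(sample).items():
        print(codec, repr(value))
```

If the evaluation only compares ASCII tag columns, the garbled token text may not matter, which could explain why the workaround runs correctly; that is a guess rather than something confirmed in this thread.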