NUSTM / ACOS

The datasets and code of ACL 2021 paper "Aspect-Category-Opinion-Sentiment Quadruple Extraction with Implicit Aspects and Opinions".

Encoding problem in step 2 #9

Closed. betterwater closed this issue 2 years ago

betterwater commented 2 years ago

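I recently read the paper and tried to run the code, but step 2 keeps failing with a UTF-8 encoding error. I have tried most of the fixes suggested online and none of them worked. Is there a way to solve this? (sad)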

blhoy commented 2 years ago

Could you share the exact error message? Perhaps the encoding of the data files has changed.
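One way to check whether a data file is still valid UTF-8 is to decode its raw bytes and see where decoding fails. The following is only a minimal sketch: the helper name `first_invalid_utf8` is made up for illustration and is not part of the ACOS code.

```python
from pathlib import Path


def first_invalid_utf8(path):
    """Return (offset, byte) for the first byte that is not valid UTF-8.

    Returns None when the whole file decodes cleanly.
    Hypothetical helper for illustration, not part of the ACOS repository.
    """
    data = Path(path).read_bytes()
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        # err.start is the index of the first offending byte in `data`.
        return err.start, data[err.start]
    return None
```

If this returns None for every file under the data directory, the files themselves are probably fine and the problem is more likely in how they are opened.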

betterwater commented 2 years ago

"Could you share the exact error message? Perhaps the encoding of the data files has changed."

This is the error I get at runtime:

    Traceback (most recent call last):
      File "F:/acos/ACOS/Extract-Classify-ACOS/run_step2.py", line 351, in <module>
        main()
      File "F:/acos/ACOS/Extract-Classify-ACOS/run_step2.py", line 174, in main
        eval_examples = processor.get_dev_examples(args.data_dir, args.domain_type)
      File "F:\acos\ACOS\Extract-Classify-ACOS\run_classifier_dataset_utils.py", line 208, in get_dev_examples
        self._read_tsv(os.path.join(data_dir, "tokenized_data/"+string+"_test_pair_1st.tsv")), "test")
      File "F:\acos\ACOS\Extract-Classify-ACOS\run_classifier_dataset_utils.py", line 127, in _read_tsv
        for line in reader:
      File "F:\anaconda\envs\ACOS\lib\codecs.py", line 322, in decode
        (result, consumed) = self._buffer_decode(data, self.errors, final)
    UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa8 in position 2954: invalid start byte

blhoy commented 2 years ago

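I tested this part and could not reproduce the problem. Most likely the reader hit a character it cannot decode. Maybe try re-saving the input data file in a different encoding?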
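Re-saving can also be done with a short script instead of an editor. The sketch below tries a few candidate encodings in order and writes the first one that decodes cleanly back out as UTF-8. The helper name `resave_as_utf8` and the candidate list (`gbk`, `cp1252`) are assumptions for illustration. A clean decode does not prove the guess is right, so the output is worth a quick visual check.

```python
def resave_as_utf8(src, dst, candidates=("utf-8", "gbk", "cp1252")):
    """Decode `src` with the first candidate encoding that works and
    write the text to `dst` as UTF-8. Returns the encoding that was used.

    Hypothetical helper; the candidate encodings are guesses, not
    something confirmed by the ACOS authors.
    """
    with open(src, "rb") as fh:
        raw = fh.read()
    for enc in candidates:
        try:
            text = raw.decode(enc)
        except UnicodeDecodeError:
            continue  # wrong guess, try the next candidate
        # newline="" keeps the original line endings untouched.
        with open(dst, "w", encoding="utf-8", newline="") as out:
            out.write(text)
        return enc
    raise ValueError(f"none of {candidates} could decode {src}")
```

Writing to a separate `dst` path keeps the original file intact in case the guessed encoding turns out to be wrong.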

betterwater commented 2 years ago

OK, I'll give it a try. Thanks for your help.