Closed (Lyn-bia closed this issue 3 years ago)
……data_processor.py", line 118, in read_data
    for line in f:
UnicodeDecodeError: 'gbk' codec can't decode byte 0xab in position 16: illegal multibyte sequence
Could you tell me which encoding should be used when reading the dataset? Both UTF-8 and GBK raise errors.
You can use the data processed by this repository: https://github.com/gitabtion/BertBasedCorrectionModels. I will sync that repository's data-processing scripts into this one later.
Hello, I have run into the same encoding problem. Is there a way to fix it?
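For anyone hitting the same traceback, here is a minimal sketch of one way to find out which encoding a data file actually uses: read the raw bytes and try a short list of candidates in strict mode. The function names `decode_with_fallback` and `read_lines`, and the candidate list itself, are assumptions for illustration; they are not part of this repository's `data_processor.py`. Note that a decode that succeeds is not proof the encoding is right, so it is worth eyeballing a few lines of the result.

```python
from pathlib import Path

# Hypothetical candidate list (an assumption); adjust it to where the data came from.
CANDIDATE_ENCODINGS = ("utf-8-sig", "gb18030", "big5")


def decode_with_fallback(raw: bytes, encodings=CANDIDATE_ENCODINGS):
    """Return (text, encoding) for the first candidate that decodes `raw` without error."""
    for enc in encodings:
        try:
            # Strict decoding: any invalid byte sequence raises UnicodeDecodeError.
            return raw.decode(enc), enc
        except UnicodeDecodeError:
            continue
    raise ValueError(f"none of {encodings} could decode the data")


def read_lines(path, encodings=CANDIDATE_ENCODINGS):
    """Read a whole file as bytes, decode it, and return (lines, encoding_used)."""
    raw = Path(path).read_bytes()
    text, enc = decode_with_fallback(raw, encodings)
    return text.splitlines(), enc
```

Once the working encoding is known, it can be passed explicitly to `open(path, encoding=...)` so that Python does not fall back to the platform default, which is what produces the `'gbk' codec` message on a Chinese-locale Windows machine.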