(tf2) [root@nlp SoftMaskedBert-PyTorch-main]# python main.py --mode preproc
preprocessing...
Traceback (most recent call last):
  File "main.py", line 99, in <module>
    main()
  File "main.py", line 63, in main
    preproc()
  File "/root/sammy/ForceWord/SoftMaskedBert-PyTorch-main/src/data_processor.py", line 187, in preproc
    for item in read_data(get_abs_path('data')):
  File "/root/sammy/ForceWord/SoftMaskedBert-PyTorch-main/src/data_processor.py", line 117, in read_data
    for line in f:
  File "/root/anaconda3/envs/tf2/lib/python3.7/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 4867: invalid start byte
Bro, I hit this error while preprocessing the data. The same thing happens when I run it on Linux.
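
The traceback shows only that some file read in `read_data` contains a byte (0x80) that is not valid UTF-8. It does not show which file. Below is a minimal diagnostic sketch, assuming the files under the data directory are meant to be UTF-8 text. The helper name `find_undecodable_files` and the `data` path in the usage note are hypothetical and not part of the repository.

```python
import os


def find_undecodable_files(data_dir, encoding="utf-8"):
    """Return (path, byte_offset, reason) for every file that fails to decode.

    Hypothetical helper for diagnosis only; it is not part of the project.
    """
    failures = []
    for root, _dirs, files in os.walk(data_dir):
        for name in sorted(files):
            path = os.path.join(root, name)
            # Read raw bytes so the decode step can be checked explicitly.
            with open(path, "rb") as f:
                raw = f.read()
            try:
                raw.decode(encoding)
            except UnicodeDecodeError as exc:
                # exc.start is the offset of the first undecodable byte.
                failures.append((path, exc.start, exc.reason))
    return failures
```

Calling `find_undecodable_files("data")` (with the path adjusted to wherever the data actually lives) lists each offending file with the offset of the first bad byte. If a reported file turns out to be a non-text file that ended up in the directory, or text saved in another encoding, that would account for the error, but this is only a guess until the file is inspected.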