sj6077 closed this issue 4 years ago
Hi there, we are checking to see whether you still need help with this, as this seems to be a considerably old issue. Please update this issue with the latest information, a code snippet to reproduce your issue, and the error you are seeing. If we don't hear from you in the next 7 days, this issue will be closed automatically. If you don't need help with this issue any more, please consider closing it.
System information
Describe the problem
I got the error message below.
UnicodeDecodeError: 'utf8' codec can't decode byte 0xa8 in position 0: invalid start byte
The word in vocab.txt is ¨C. I wonder how I can fix it. I cannot download BookCorpus now, so I can't regenerate the preprocessed data either. Is there any way to handle this?
Source code / logs
UnicodeDecodeError: 'utf8' codec can't decode byte 0xa8 in position 0: invalid start byte
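One possible workaround, sketched here under assumptions rather than taken from the project's own loader: read vocab.txt as raw bytes and decode each line as UTF-8, falling back to another encoding only for lines that fail. The fallback of Latin-1 is a guess based on the fact that byte 0xa8 is ¨ in Latin-1, which matches the ¨C entry described above. The function name `read_vocab` and the one-token-per-line layout are also assumptions.

```python
import io


def read_vocab(path, fallback="latin-1"):
    """Read one token per line, decoding each line as UTF-8 when possible.

    Hypothetical helper: lines that are not valid UTF-8 are decoded with
    `fallback` instead of raising UnicodeDecodeError.
    """
    tokens = []
    with io.open(path, "rb") as f:  # binary mode: no implicit decoding
        for raw in f:
            raw = raw.rstrip(b"\r\n")  # drop the line terminator only
            try:
                tokens.append(raw.decode("utf-8"))
            except UnicodeDecodeError:
                # Assumption: the offending entries are Latin-1 encoded.
                tokens.append(raw.decode(fallback))
    return tokens
```

This keeps every line, so the vocabulary size and the word-to-index mapping stay the same as in the original file, which matters when the preprocessed data cannot be regenerated. Whether the decoded token matches what the preprocessing originally produced is not something this sketch can verify.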