I am facing the following UnicodeDecodeError for several models:
  File "/usr/src/app/server.py", line 188, in <module>
    application = make_app(args)
  File "/usr/src/app/server.py", line 166, in make_app
    worker_pool = initialize_workers(services)
  File "/usr/src/app/server.py", line 147, in initialize_workers
    worker_pool[lang_pair] = TranslatorInterface(
  File "/usr/src/app/server.py", line 17, in __init__
    self.contentprocessor = ContentProcessor(
  File "/usr/src/app/content_processor.py", line 18, in __init__
    self.bpe_source = BPE(BPEcodes)
  File "/usr/src/app/apply_bpe.py", line 37, in __init__
    firstline = codes.readline()
  File "/usr/local/lib/python3.9/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc0 in position 54: invalid start byte
For most of the affected models (but not "lv-en"), the error goes away when I switch to the BPE model. However, the SentencePiece models are the ones with better translation performance according to the shared metrics.
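In case it helps with diagnosis, here is a minimal sketch of a standalone check for whether a model file is valid UTF-8 text, which is what the `codes.readline()` call in the traceback appears to require. The helper name `first_invalid_utf8_offset` is hypothetical and not part of the server code, and the path is passed on the command line.

```python
import sys
from pathlib import Path
from typing import Optional


def first_invalid_utf8_offset(path: str) -> Optional[int]:
    """Return the byte offset of the first invalid UTF-8 sequence in the file.

    Returns None when the whole file decodes cleanly as UTF-8.
    Hypothetical helper for illustration; not part of server.py or apply_bpe.py.
    """
    data = Path(path).read_bytes()
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        # exc.start is the index of the first byte that could not be decoded.
        return exc.start
    return None


if __name__ == "__main__":
    # Usage: python check_utf8.py <path-to-model-or-codes-file>
    for file_path in sys.argv[1:]:
        offset = first_invalid_utf8_offset(file_path)
        if offset is None:
            print(f"{file_path}: decodes as UTF-8")
        else:
            print(f"{file_path}: invalid UTF-8 at byte offset {offset}")
```

If this reports an offset for the file that ends up being passed as `BPEcodes`, that file is probably not a plain-text BPE codes file (it could be a binary model file, for example), but I have not confirmed that this is the cause here.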
Please let me know if I am doing something wrong.