Closed Eugen2525 closed 4 years ago
Hi, I got the error below when using the tokenizer:
from parasol import Tokenizer
t2 = Tokenizer(decompose=True)
error:
Traceback (most recent call last):
  File "D:/test_korean_tokenizer.py", line 13, in <module>
    t2 = Tokenizer(decompose=True)
  File "C:\..\AppData\Local\Programs\Python\Python37\lib\site-packages\parasol\tokenize.py", line 31, in __init__
    self.spp.load(model.as_posix())
  File "C:\..\AppData\Local\Programs\Python\Python37\lib\site-packages\sentencepiece.py", line 214, in load
    return _sentencepiece.SentencePieceProcessor_load(self, filename)
OSError: Not found: "C:/../AppData/Local/Programs/Python/Python37/lib/site-packages/parasol/resources/decomposed/bpe.model": Illegal byte sequence Error #42
OK, resolved it. The problem was that the path contained Hangul characters.
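For anyone hitting the same error, here is a minimal sketch for checking whether an install path contains non-ASCII components such as Hangul. The helper name `non_ascii_parts` is made up for illustration and is not part of parasol or sentencepiece. It only inspects the path string and assumes Python 3.7 or later for `str.isascii()`.

```python
from pathlib import PurePath


def non_ascii_parts(path):
    """Return the components of `path` that contain non-ASCII characters.

    Hypothetical helper for illustration only; it does not touch the
    filesystem, it just inspects the path string.
    """
    return [part for part in PurePath(path).parts if not part.isascii()]


if __name__ == "__main__":
    # Example path with a Hangul folder name (made up for this sketch).
    print(non_ascii_parts("C:/Users/한글/site-packages/parasol"))
```

If the list is not empty for the directory the model file is loaded from, moving the Python install or virtual environment to an ASCII-only path is probably the simplest workaround, which matches the cause reported above.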