PyThaiNLP / attacut

A Fast and Accurate Neural Thai Word Segmenter
https://pythainlp.github.io/attacut/
MIT License

UnicodeDecodeError Problem #7

Closed: pitiphol closed this issue 5 years ago

pitiphol commented 5 years ago

Calling the tokenize function raises a UnicodeDecodeError from the load_dict function in utils.py.

bact commented 5 years ago

Should be fixed by this commit: https://github.com/PyThaiNLP/attacut/commit/c8669828c94ac7059dd56fb75e216755667e139b
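For readers hitting a similar error: the commit's contents are not reproduced in this thread, so the following is only a sketch of one common cause and remedy, not AttaCut's actual code. It assumes the dictionary is a UTF-8 text file that was being opened with a non-UTF-8 default encoding (as can happen on some Windows setups). The name load_dict_utf8 and the sample word list are hypothetical.

```python
import os
import tempfile


def load_dict_utf8(path):
    """Hypothetical loader for illustration; not AttaCut's actual load_dict."""
    # Passing encoding explicitly avoids relying on the platform default,
    # which is not always UTF-8.
    with open(path, "r", encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]


# Write a small UTF-8 word list to a temporary file for the demonstration.
tmp = tempfile.NamedTemporaryFile(
    "w", encoding="utf-8", suffix=".txt", delete=False
)
tmp.write("ภาษา\nไทย\n")
tmp.close()

# Simulate the failure: decoding UTF-8 Thai text with a narrower codec.
try:
    with open(tmp.name, "r", encoding="ascii") as f:
        f.read()
    wrong_encoding_failed = False
except UnicodeDecodeError:
    wrong_encoding_failed = True

# Reading with an explicit UTF-8 encoding succeeds.
words = load_dict_utf8(tmp.name)
os.remove(tmp.name)
```

If the error still appears with an explicit encoding, the file itself may not be UTF-8, which would need a different fix.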

p16i commented 5 years ago

@pitiphol I've just published a new AttaCut release with @bact's fix.

Would you mind testing it? https://pypi.org/project/attacut/1.0.2.dev0/

Feel free to reopen the issue if the problem persists.