Closed by buruzaemon 9 years ago
When decoding strings under Python 2.7, natto-py should use the "charset" (character encoding) that MeCab uses internally.
This means the user needs to know which charset MeCab is using. We might need to add a Wiki page on confirming the system dictionary charset from the command line.
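As a rough illustration of the decoding step described above, here is a minimal sketch. The helper name `decode_mecab_output` is hypothetical and is not part of natto-py; it assumes the caller already knows the dictionary charset and passes it in using a name Python's codec registry accepts (for example "utf-8", "euc-jp" or "shift-jis").

```python
# -*- coding: utf-8 -*-


def decode_mecab_output(raw, charset):
    """Decode raw bytes using the charset of the MeCab dictionary.

    Hypothetical helper for illustration only; not natto-py API.
    Values that are not byte strings are returned unchanged.
    """
    if isinstance(raw, bytes):  # on Python 2.7, bytes is an alias of str
        return raw.decode(charset)
    return raw


# Simulate bytes coming back from a dictionary compiled as EUC-JP.
raw = u"日本語".encode("euc-jp")
decoded = decode_mecab_output(raw, "euc-jp")
```

Decoding the same bytes with a charset that does not match the dictionary would either raise `UnicodeDecodeError` or produce garbled text, which is why the user has to know the dictionary charset up front.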
Done in 0.0.6 release.
Resolved 2014-11-21. This issue was ported from Bitbucket and is archived for historical reasons.
From Porting your code to NLTK 3.0: ...
NLTK3 requires all text input to be unicode and always return text as unicode
Enhance natto-py under Python 2.7 so that it consistently follows this rule: all text input must be unicode, and all text returned is unicode. Python 3 behavior should be consistent with the same approach.
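One way to picture the "unicode in, unicode out" rule is the sketch below. Everything in it is an assumption made for illustration: `unicode_in_unicode_out` and `fake_parse` are hypothetical names, and `fake_parse` merely stands in for a bytes-level backend call rather than showing how natto-py is actually implemented.

```python
# -*- coding: utf-8 -*-
import sys

# The text type is `unicode` on Python 2 and `str` on Python 3.
text_type = str if sys.version_info[0] >= 3 else unicode  # noqa: F821


def unicode_in_unicode_out(parse_bytes, charset):
    """Wrap a bytes-level parser so that callers only handle unicode text."""
    def wrapper(text):
        if not isinstance(text, text_type):
            raise TypeError("expected unicode text, got %r" % type(text))
        # Encode on the way in, decode on the way out, same charset both ways.
        raw_result = parse_bytes(text.encode(charset))
        return raw_result.decode(charset)
    return wrapper


def fake_parse(raw):
    """Hypothetical stand-in for a backend that consumes and returns bytes."""
    return raw + b"\nEOS"


parse = unicode_in_unicode_out(fake_parse, "utf-8")
result = parse(u"すもも")
```

Rejecting byte strings outright, rather than guessing their encoding, is a design choice of this sketch; it keeps behavior identical on Python 2.7 and Python 3.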
Originally opened 2014-11-12. This issue was ported from Bitbucket and is archived for historical reasons.