Hey Alex,
Another user here! I imagine this problem is long gone for you by now, but here
are my two cents.
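I've had a problem like this in the past. For me it was just a text encoding
issue: my locale was UTF-8 but the MeCab dictionary I had installed was EUC-JP,
so MeCab was being handed bytes in an encoding it did not expect.
My solution might work for you. Ask the tagger which charset its dictionary
uses, then encode the input and decode the output with that charset: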
$ python
Python 2.7.3 (default, Aug 1 2012, 05:14:39)
[GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import MeCab
>>> m = MeCab.Tagger()
>>> encoding = m.dictionary_info().charset
>>> input_bytes = u"今おはよう。".encode(encoding)
>>> result_bytes = m.parse(input_bytes)
>>> print result_bytes.decode(encoding)
今 名詞,副詞可能,*,*,*,*,今,イマ,イマ
おはよう 感動詞,*,*,*,*,*,おはよう,オハヨウ,オハヨー
。 記号,句点,*,*,*,*,。,。,。
EOS
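
If you want to see why the mismatch matters without touching MeCab at all, here is a minimal sketch using only the standard library. It is written for Python 3 (the session above is Python 2), and `round_trip` and `decodes_cleanly` are hypothetical helper names I made up for illustration, not part of the MeCab bindings. The `parse` argument is just a stand-in for a bytes-in, bytes-out call like `m.parse` in the session above.

```python
# Sketch only: Python 3, standard library, no MeCab required.
# round_trip and decodes_cleanly are illustrative helpers, not MeCab API.

TEXT = "今おはよう。"


def round_trip(text, charset, parse=lambda raw: raw):
    """Encode text to charset, hand the bytes to parse, decode the result.

    parse is a placeholder for a bytes-in/bytes-out tagger call; the
    default simply returns its input unchanged.
    """
    return parse(text.encode(charset)).decode(charset)


def decodes_cleanly(raw, charset):
    """Return True if raw is valid in charset, False otherwise."""
    try:
        raw.decode(charset)
    except UnicodeDecodeError:
        return False
    return True


# The same text is a different byte sequence in each encoding.
utf8_bytes = TEXT.encode("utf-8")
eucjp_bytes = TEXT.encode("euc-jp")
print(len(utf8_bytes), len(eucjp_bytes))

# EUC-JP bytes are fine as EUC-JP but are not valid UTF-8.
print(decodes_cleanly(eucjp_bytes, "euc-jp"))
print(decodes_cleanly(eucjp_bytes, "utf-8"))

# Using one charset consistently on both sides gets the text back intact.
print(round_trip(TEXT, "euc-jp") == TEXT)
```

The point is only that the charset used to encode the input and decode the output has to match the dictionary's, which is what reading `dictionary_info().charset` gives you in the session above.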
Original comment by richard....@gmail.com
on 18 Mar 2013 at 1:24
Original issue reported on code.google.com by
alexleav...@gmail.com
on 26 Aug 2012 at 7:09