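When output formatting is used in node parsing to capture the feature at index 8 (%f[8], pronunciation) for ipadic, natto-py does not capture the "given index is out of range" error message that mecab itself prints. Instead, parsing fails with an unrelated MeCabError and a RuntimeError on a NULL pointer.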
To reproduce:
from natto import MeCab
nm = MeCab('-F%m,%f[0],%f[1],%f[8]')
for n in nm.parse('私はアシャです', as_nodes=True):
    print(n.feature)
...
私,名詞,代名詞,ワタシ
は,助詞,係助詞,ワ
MECAB_NBEST request type is not set
Traceback (most recent call last):
  File "/usr/home/buruzaemon/dev/github/natto-py/natto/mecab.py", line 400, in __parse_tonodes
    rawf = self.__ffi.string(sp)
  File "/usr/home/buruzaemon/dev/github/natto-py/.py35env/lib/python3.5/site-packages/cffi/api.py", line 288, in string
    return self._backend.string(cdata, maxlen)
RuntimeError: cannot use string() on <cdata 'char *' NULL>
Compare with similar logic using mecab:
$ mecab -F'%m,%f[0],%f[1],%f[8]\n'
私はアシャです
given index is out of range
Discovered indirectly in issue "natto.api.MeCabError: MECAB_NBEST request type is not set" error under some circumstances.
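
Until the error is surfaced properly, one possible workaround is to leave %f[8] out of the format string and index the default comma-separated feature string defensively instead. The following is only a minimal sketch: feature_at is a hypothetical helper, not part of natto-py, and the two sample feature strings are illustrative. It assumes that the failing node is an unknown word whose ipadic feature string has fewer fields than a known word's, and that no field contains an embedded comma.

```python
def feature_at(feature, index, default="*"):
    """Return the comma-separated field at `index`, or `default` if it is absent.

    Hypothetical helper, not part of natto-py. Assumes fields never
    contain embedded commas.
    """
    fields = feature.split(",")
    return fields[index] if 0 <= index < len(fields) else default


# Illustrative feature strings (assumed shapes, not captured output):
# a known word with 9 fields and an unknown word with only 7.
known = "名詞,代名詞,一般,*,*,*,私,ワタシ,ワタシ"
unknown = "名詞,一般,*,*,*,*,*"

print(feature_at(known, 8))    # ワタシ
print(feature_at(unknown, 8))  # *
```

With natto-py this could be applied to n.feature for each node returned by MeCab().parse(text, as_nodes=True), so that a missing pronunciation field yields a placeholder rather than an out-of-range lookup inside mecab.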