Closed graph226 closed 5 years ago
To work around this, we could call the parse method before parseToNode, but I don't know whether it works or not 😇
Something like:
def parseToNode(self, *args):
    # Call parse first; self is already bound, so it is not passed again.
    self.parse(*args)
    return _MeCab.Tagger_parseToNode(self, *args)
Please give me your idea about this.
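The same idea can also be sketched as a standalone helper instead of a patch to the generated wrapper. This is only a sketch: it assumes the tagger object exposes parse and parseToNode the way MeCab.Tagger does, the helper name parse_to_node_safely is hypothetical, and the stub class exists only so the snippet runs without MeCab installed.

```python
def parse_to_node_safely(tagger, text):
    """Call tagger.parse(text) first, then return tagger.parseToNode(text)."""
    # The result of parse is discarded; the call is made only as the workaround.
    tagger.parse(text)
    return tagger.parseToNode(text)


class _StubTagger:
    """Stand-in for MeCab.Tagger (assumption) that records the order of calls."""

    def __init__(self):
        self.calls = []

    def parse(self, text):
        self.calls.append(("parse", text))
        return text

    def parseToNode(self, text):
        self.calls.append(("parseToNode", text))
        return "node"


stub = _StubTagger()
result = parse_to_node_safely(stub, "text")
```

With the real binding, you would pass a MeCab.Tagger() instance in place of the stub.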
I got the same error and fixed it with the above workaround.
I investigated the cause of this bug.
In the _wrap_Tagger_parseToNode method, this line deletes buf2 because alloc2 is SWIG_NEWOBJ.
https://github.com/SamuraiT/mecab-python3/blob/5ee7aa538c8408d61a42aedea9d2f000c86f1ca3/MeCab_wrap.cxx#L6527
In Python 2, buf2 is not deleted because alloc2 is SWIG_OLDOBJ.
(MeCab_wrap.cxx is identical to @taku910's original: https://github.com/taku910/mecab/blob/master/mecab/python/MeCab_wrap.cxx .)
So the cause of this bug is in the SWIG_AsCharPtrAndSize method.
I think something is wrong in this block.
https://github.com/SamuraiT/mecab-python3/blob/5ee7aa538c8408d61a42aedea9d2f000c86f1ca3/MeCab_wrap.cxx#L3461-L3470
But I don't have a patch to solve this bug at this time. 😕
I got the same problem and found that using the latest version of MeCab solves the problem.
My environment:
This problem seems to be the same as the one reported in https://github.com/taku910/mecab/issues/5, and it has been solved by https://github.com/taku910/mecab/pull/24 merged in Feb 2016.
Although this problem occurs only in Python 3, it is not a problem in mecab-python3; it seems to be a memory management problem in MeCab itself.
Unfortunately, major package managers such as Homebrew and APT currently offer an older version of MeCab based on the Feb 2013 source, which can be obtained from Google Drive.
To avoid this problem without using the workaround mentioned above, you need to build and install MeCab from the latest source on GitHub manually, and then reinstall mecab-python3.
@graph226 I believe this ought to be fixed by using the latest version of the package and the latest version of MeCab, but I cannot be sure because you did not provide a complete test case that I can run for myself. Could you please try your code again? Make sure to use mecab-python3 0.8.3, MeCab 0.996, and a current version of SWIG (I have 3.0.12).
It's been a long time since you reported this bug and perhaps you have moved on, so if I don't hear from you in a month I will close the bug (but feel free to reopen it if you don't get to this until after that, and it's still a problem).
Please see the spaCy issue linked above, which provides a Dockerfile and code to reproduce the issue. I think @orangain's explanation is exactly right.
@polm Thanks for the pointer. I think you're right. I am going to consider this bug a concrete reason why we need to ship binary wheels from PyPI with bundled libmecab, so it will be addressed by PR #18, which I will be reviewing and landing Real Soon Now. I'll leave the bug open till then.
Please try the release candidate available from https://test.pypi.org/project/mecab-python3/0.996.2rc2/ , this bug should be corrected. Thank you everyone for your patience. We plan to make a new official release in the next couple of weeks.
0.996.2 has been officially released and this issue should be corrected. Please file a new bug report if you are still having problems with parseToNode.
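For anyone unsure whether their installed mecab-python3 is new enough, here is a rough sketch of a version check. The 0.996.2 threshold comes from this thread; the helper names are hypothetical, and the comparison is deliberately simple: it looks only at the leading numeric components, so a release candidate such as 0.996.2rc2 is treated the same as 0.996.2.

```python
import re
from importlib import metadata

# First mecab-python3 release reported in this thread as containing the fix.
FIXED_RELEASE = (0, 996, 2)


def release_tuple(version):
    """Return the leading numeric components of a version string as ints."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.group().split("."))


def has_fix(version):
    """True if the given mecab-python3 version is at or after FIXED_RELEASE."""
    return release_tuple(version) >= FIXED_RELEASE


def installed_has_fix():
    """Check the installed mecab-python3; returns None if it is not installed."""
    try:
        return has_fix(metadata.version("mecab-python3"))
    except metadata.PackageNotFoundError:
        return None
```

Calling installed_has_fix() returning False suggests upgrading, or using the parse-before-parseToNode workaround in the meantime.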
When we use tagger.parseToNode(text) alone, we sometimes get an error.
To avoid this, call tagger.parse(text) before parseToNode.