Closed akorobko closed 4 years ago
Thanks for the fix! I notice that one of the build checks failed, the log is huge but this is the important part:
gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -fPIC -I/tmp/pip-req-build-e93f250n/build/libmecab/mecab/src -I/opt/_internal/cpython-3.8.0/include/python3.8 -c src/MeCab/MeCab_wrap.cpp -o build/temp.linux-x86_64-3.8/src/MeCab/MeCab_wrap.o -Wno-unused-variable
src/MeCab/MeCab_wrap.cpp: In function ‘void SwigPyStaticVar_dealloc(PyDescrObject*)’:
src/MeCab/MeCab_wrap.cpp:3120:3: error: ‘_PyObject_GC_UNTRACK’ was not declared in this scope
_PyObject_GC_UNTRACK(descr);
^~~~~~~~~~~~~~~~~~~~
src/MeCab/MeCab_wrap.cpp:3120:3: note: suggested alternative: ‘PyObject_GC_UnTrack’
_PyObject_GC_UNTRACK(descr);
^~~~~~~~~~~~~~~~~~~~
PyObject_GC_UnTrack
I don't entirely follow what's going on, but it looks like this might be due to a change in the Python API in 3.8. In that case I guess the cause here is SWIG not supporting Python 3.8. Based on the SWIG changelog it looks like it can be fixed with a more recent version of SWIG (the test seems to be using SWIG 3.0).
@polm Hi, Paul. I have upgraded SWIG to 4.0.1 and it seems to be fine now. Surprisingly, SWIG authors did not mark it as released, so, you cannot find it prebuilt anywhere. But for the travis build, it works fine. Let me know if you can merge. Thanks!
Thanks Alexey, unfortunately I am not actually a maintainer of this project and cannot merge the PR.
@SamuraiT, @zackw Zack, Tatsuro, could you help with this?
@polm Thanks. Do you know who would be the right person to reach?
Zack and Tatsuro are the only maintainers as far as I know.
Now build fails due to some travis internal issues. How do I re-trigger it?
Automatic download from sourceforge.net often fails. How about http://prdownloads.sourceforge.net/swig/swig-4.0.1.tar.gz
instead?
@KoichiYasuoka That is a good idea. I am not that familiar with sourceforge
. Actually was quite surprised it is still first choice platform for some projects. Making change now. Are you maintainer?
I'm not a maintainer here myself. I'm a user of mecab-python3, especially for my UD-Kanbun. I wish @SamuraiT and @zackw be aware of this PR.
I wish @SamuraiT and @zackw be aware of this PR.
I have tried to reach them over the email (from their web sites respectively), but no luck. Looks like this project is completely abandoned.
Well that's unfortunate but thanks for trying.
I would recommend using an alternative like my fugashi or natto-py, but for now I forked this repo and merged your PR so people can install from source if they want. Not sure if it makes sense to register a new name on pypi.
Based on the standards for abandonment this project can't be claimed on PyPI until a year after the last release, which would be April 2020.
@polm Thanks! Unfortunately, in my case, I need package which is installable from PyPI without requiring that MeCab itself being previously compiled and installed. That is the reason I do not switch to fugashi
myself.
I will give little bit more time for maintainers to notice this PR. They seem to be active on GitHub, but for some reason ignoring this project and all related correspondence. Maybe they are on vacation or something like that...
Thank you @polm for your fork with this PR. However, I think your fugashi
is more suitable if fugashi
can include build-bundled-libmecab.py for the environments without mecab-config
...
Good point, I opened an issue for bundling libmecab. Not sure when I'll get a chance to work on it but I'll keep it in mind at least.
I want to apologize for ignoring this PR for so long. I still don't have the time required to work on mecab-python3 properly. I also don't understand SWIG well enough to know whether this is the correct fix for the memory leak. And I'm also really hesitant to merge changes to the SWIG file because the SWIG file was originally copied verbatim from the MeCab library itself and so bugs in the SWIG file should ideally be addressed there ... but that project is pretty much dead too.
And on top of all that, I've been ignoring @polm's request to become a co-maintainer for almost a year now because I can't act on it. Only @SamuraiT has the ability to grant people privileges on either this repository or the PyPI package. I haven't heard from him in years myself. I can forward the request but I doubt it will do any good.
Put it all together and I'm inclined to say that you should wait until April, reclaim the PyPI package, and declare https://github.com/polm/mecab-python3 the new official repo for it. I'll cooperate with that to the extent I can.
I do not have any words to comment this.
Travis is failing again. This time it cannot fine flake8...
Travis is failing again. This time it cannot fine flake8...
It seems to be finding flake8 just fine; the problem is that flake8 is objecting to one of your changes:
$ /home/travis/build/SamuraiT/mecab-python3/.tox/py27/bin/flake8 .
./scripts/build_requirements.py:12:80: E501 line too long (202 > 79 characters)
ERROR: InvocationError: '/home/travis/build/SamuraiT/mecab-python3/.tox/py27/bin/flake8 .'
py27 finish: runtests after 0.41 seconds
The important line here is the second one: scripts/build_requirements.py
has a line that's too long.
Thanks! Trying to fight this now. Update: build is passing now.
Hey @zackw, thanks for responding! If you're able to merge things that puts us in a much better situation.
I think this is a good PR and should be merged.
The script in the first post makes it clear this resolves the memory use issues, and explanation of why it works is also good. Your point about this being an upstream issue makes sense but this repo only exists because upstream MeCab is abandoned. (The Sourceforge URL was a good catch though!)
Not sure it'll help but I tried mailing @SamuraiT myself. I'll see if maybe there's another way to get in touch with him. If not we can try asking PyPI about what to do, since the abandonment guidelines don't seem to deal with a half-abandoned state.
Anyway, thanks again @zackw for making time to deal with this. Hopefully things will be cleaned up before too long.
Merged. I can do a new PyPI release later today, but ... should I? Would that reset the clock on the PyPI abandonment process?
My guess is it would reset the clock on the abandonment process... :/
I'm inclined to say you should release it anyway, but maybe we can wait for a bit to see if SamuraiT replies and also to create an issue at pypa/pypi-support explaining the situation and asking what we should do.
@polm , @zackw , @akorobko I apologize that I didn't open an email and github for a while. I really didn't noticed the PR and the message. sorry for inconvenience guys, I set @zackw as the owner of mecab-python3 in pypi.
Since I'm inactive now, so it's great idea to transfer the owner. I want transfer to @zackw or create organization and transfer to it.
what would be the best ? I want to ask for the ideas. any ideas welcomed either create & transfer to organization or to @zackw.
@SamuraiT I think it would be better if you transfer ownership of this repo and the pypi package to @polm. I can only rarely find time to work on mecab. @polm is also much more knowledgeable about Japanese tokenization, so he will be a better maintainer than I can be.
@zackw thank you for advice. let me transfer to @polm then.
@polm, could you tell me your username of pypi ?
I'lll add you as an owner.
I found you on pypi so I added
@SamuraiT, @polm, @zackw Thank you!
@polm will you do a new PyPI release soon?
@SamuraiT Thank you very much! I'm glad we waited before asking about abandonment.
@akorobko I'll make a new release today or tomorrow with this fix. Thanks again for figuring out the SWIG issue!
@polm Thank you! Ping me if you ever encounter any other issues with this wrapper.
I just released 0.996.3 and confirmed the test script in the first post in this thread runs as expected, showing the memory issue is resolved. Thanks again to @akorobko for the fix and thanks to everyone involved in getting this out!
Thank you @polm and I've tested 0.996.3, however, in Debian without swig
the installation fails:
swigging src/MeCab/MeCab.i to src/MeCab/MeCab_wrap.cpp
swig -python -O -builtin -c++ -py3 -I/usr/include -o src/MeCab/MeCab_wrap.cpp src/MeCab/MeCab.i
unable to execute 'swig': No such file or directory
error: command 'swig' failed with exit status 1
Umm... automatic download of swig-4.0.1 does not work well...
Oops, mecab-python3-0.996.3 is now distributed by tar.gz (source distribution) only. Yes, now I understand why I need swig
separately installed before, since there is no binary wheel. But I think it's incompatible (in installing) from 0.996.2...
@KoichiYasuoka Thanks for the report, I opened a new issue at #39 to avoid cluttering this one, let's move discussion there.
This is to fix issue #34
When SWIG generates wrappers, currently it is instructed to not generate default constructors and destructors for the
Node
andPath
python objects. That causesSwigPyBuiltin_BadDealloc()
function to be called when garbage collector tries to remove object. That function does nothing, so,PyObject
leaks.Instead, what we need is to call
SwigPyBuiltin_destructor_closure()
, which deletesPyObject
, but keeps payload (data structure returned by MeCab itself) intact (since SWIG is instructed to not own it).To facilitate this, we should replace
with
Here is the test code to check that it works.
Before fix, the memory consumption grows from 7Mb to almost 200Mb. After fix, the consumption is stable.