Closed johnw3d closed 2 years ago
Hi, John. Thanks for reporting. It would be great if you could make a full minimal example that exhibits the behavior.
Hi Igor. I've been quite busy lately and am having a little trouble producing a minimal example without all my additional rigging, large grammars, and test source sets. I'll get you something as soon as I can, and check (for the 4th time!) that it is not my rigging that has the leak. I'll report back here again soon.
Hi, John. Do you still experience these problems with the new version? I've just released 0.15.0 and a lot has changed since 0.12 so you might give it a try.
@johnw3d Hi. I'm closing this as stale and non-reproducible. If the problems persist with the newest version, feel free to reopen.
Description
The GLR parser seems to have a memory leak on repeated uses.
What I Did
I'm using the GLR parser with a Korean grammar to do phrase-structure analysis on a large training corpus of Korean text sentences, which means calling GLRParser.parse() hundreds of thousands of times on sentences averaging 50-100 characters. Memory use climbs relatively quickly, roughly tens to hundreds of MB per 100 calls.
It may well be some form of impure (shared, mutable) structure, since this happens whether I reload the grammar and re-instantiate the GLRParser for every sentence or do that once and call parse() on the same instance repeatedly.
Also, the parser does not seem to be re-entrant, perhaps also pointing at some impure structures somewhere; I tried to run the thing in a thread-pool setup and it failed to work with odd errors (which I can detail if wanted), but it does work using process-pools.
I'll work up a reproducing rig, if that's needed.
Thanks, John.