ICRAR / ijson

Iterative JSON parser with Pythonic interfaces
http://pypi.python.org/pypi/ijson/
Other
830 stars 51 forks source link

Memory leak on exception handling with yajl2_c backend #97

Closed mhugo closed 1 year ago

mhugo commented 1 year ago

Describe the bug

I think I found a memory leak with the yajl2_c backend when ijson.items raises an exception, some memory does not seem to be correctly deallocated.

**How to reproduce***

Here is a small Python code to reproduce the issue:

import resource
import ijson
from ijson import JSONError

def memusage():
    """memory usage in MB. getrusage defaults to KB on Linux."""
    return str(resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1e3)

print("starting memory usage:", memusage(), "MB")

for _ in range(10000):
    try:
        next(ijson.items(b"XXX", "whatever"))
    except JSONError:
        pass

print("memory usage after ijson calls:", memusage(), "MB")

Example output:

$ IJSON_BACKEND=yajl2_c python test_memleak.py 
starting memory usage: 10.376 MB
memory usage after ijson calls: 651.016 MB

$ IJSON_BACKEND=yajl2_cffi python test_memleak.py 
starting memory usage: 19.068 MB
memory usage after ijson calls: 19.068 MB

Expected behavior

The same run, but with less memory consumed :)

Execution information:

rtobar commented 1 year ago

Thanks @mhugo for the concise and useful report. I just double-checked I can reproduce the issue on my side. I'll have a look now to see if I can quickly find out the culprit, otherwise it might take a bit longer.

rtobar commented 1 year ago

@mhugo: found it and potentially fixed it, please give it a try under the issue-97 branch.

It indeed was very specific to the yajl2_c backend, but it actually affected all generator functions (basic_parse, parse, kvitems and items), but not the async or push interfaces.

mhugo commented 1 year ago

Hi @rtobar Thank you very much for your quick answer !

I confirm the problem is solved on the issue-97 branch. I've tested it on my small test and on my application code and I can't see the memory leak anymore.

When do you think would be the next release that will include this fix ?

rtobar commented 1 year ago

@mhugo thanks for the confirmation :+1:

I've put the fix now in master and will release 3.2.1 after that with the fix (and support for Python 3.12, which was fixed recently as well). I'm closing this issue now though as the fix is already available.

rtobar commented 1 year ago

3.2.1 has now been published

mhugo commented 1 year ago

Excellent ! Thank you very much again for your reactivity