isagalaev / ijson

Iterative JSON parser with Pythonic interface
http://pypi.python.org/pypi/ijson/

Prevent memory leaks #57

Closed rtobar closed 4 years ago

rtobar commented 8 years ago

The builder.containers list creates a circular reference between itself and the ObjectBuilder object that contains it, so these self-referencing objects are freed only when the cyclic garbage collector runs. If the GC is disabled (e.g., when using timeit.timeit, or because it was intentionally turned off), these objects build up in memory, defeating the purpose of the items generator and those under it. Explicitly clearing the builder's list before the builder goes out of scope solves this.

To reproduce try:

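import gc
import io
from ijson.backends import yajl2_cffi as ijson

# Build a large JSON array of identical small objects in memory.
stream = io.BytesIO(b'[' + b','.join((b'{"a": "b", "c": 1, "d": [2,3,"4"]}',) * 5000000) + b']')
gc.disable()  # with the cyclic GC off, the builders' reference cycles are never collected
for _ in ijson.items(stream, "item"):
    pass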

and see your memory usage grow.
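
The cycle described above can be shown in isolation with a minimal sketch. TinyBuilder and builder_freed_without_gc are hypothetical names made up for illustration, not ijson's actual ObjectBuilder code; the sketch only assumes that the builder keeps a list holding a closure that captures the builder itself, and that it runs on CPython, where reference counting frees acyclic objects immediately.

```python
import gc
import weakref


class TinyBuilder:
    """Hypothetical stand-in for a builder; not ijson's actual ObjectBuilder."""

    def __init__(self):
        def set_value(value):
            self.value = value  # the closure captures self

        # self -> containers -> set_value -> self forms a reference cycle
        self.containers = [set_value]


def builder_freed_without_gc(clear_containers):
    """Return True if the builder is freed by reference counting alone."""
    was_enabled = gc.isenabled()
    gc.disable()
    try:
        builder = TinyBuilder()
        ref = weakref.ref(builder)
        if clear_containers:
            del builder.containers[:]  # break the cycle explicitly
        del builder
        return ref() is None
    finally:
        if was_enabled:
            gc.enable()
        gc.collect()  # clean up whatever the cycle left behind


print(builder_freed_without_gc(clear_containers=False))
print(builder_freed_without_gc(clear_containers=True))
```

On CPython the first call should report that the builder is still alive and the second that it was freed, which is the effect of clearing the list before the builder goes out of scope.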

craiglabenz commented 5 years ago

Whoa, this seems pretty important!

ltalirz commented 4 years ago

@rtobar @isagalaev As you can see from the comment above, people still end up on this repo from Google/Stack Overflow, assuming that this is the official source. The same happened to me, and it took me a while as well to realize that the active fork is https://github.com/ICRAR/ijson .

Would it perhaps be possible to

isagalaev commented 4 years ago

I don't know how to transfer open issues, but I could probably close them with a note that they can be reopened against the new fork. Will that work?

ltalirz commented 4 years ago

Sounds good, thanks! I realize now that transferring issues is anyhow only possible between repos of the same organisation account.

P.S. Here is the link on how to archive a repository.

isagalaev commented 4 years ago

Done closing the issues. Archiving the repo now. Thanks for prompting me to finally do it!