rtyler / py-yajl

py-yajl provides Python bindings for the Yajl JSON encoder/decoder library
http://rtyler.github.com/py-yajl

Memory leak when dumps()'ing large objects #32

Open brendano opened 13 years ago

brendano commented 13 years ago

Hi, I see a memory leak in the following code: the memory usage of the Python process grows without bound (Python 2.7). The "psutil" calls are just diagnostics and can be removed.

import sys, os
#import json
import yajl as json

# Diagnostics only; can be removed.
import psutil
myproc = psutil.Process(os.getpid())

data = json.loads(sys.stdin.read())
for i in xrange(1000000):
  d2 = json.dumps(data)
  # Report memory usage every 10,000 iterations.
  if i % int(1e4) == 0:
    print>>sys.stderr, myproc.get_memory_info()

It is data dependent. Here's a 7KB data file that causes it.

% curl http://brenocon.com/yajl_leak.json | python mem_test.py 
meminfo(rss=6778880, vms=89104384)
meminfo(rss=399265792, vms=481546240)
meminfo(rss=791748608, vms=874004480)
...
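For a rough sense of scale, the three RSS samples above can be turned into a per-call growth rate. The sketch below assumes the samples are exactly 10,000 dumps() calls apart, as the loop's reporting interval implies; the helper name growth_per_call is made up for illustration.

```python
# RSS values (bytes) copied from the meminfo lines above.
rss_samples = [6778880, 399265792, 791748608]
# The script prints once every 10,000 iterations.
CALLS_PER_SAMPLE = 10000


def growth_per_call(samples, calls_per_sample):
    """Return the average RSS growth in bytes per call between consecutive samples."""
    return [(later - earlier) / float(calls_per_sample)
            for earlier, later in zip(samples, samples[1:])]


rates = growth_per_call(rss_samples, CALLS_PER_SAMPLE)
for rate in rates:
    print("%.0f bytes per dumps() call" % rate)
```

Both intervals come out at roughly 39 KB per call, so the growth looks steady rather than a one-time allocation.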

However, certain other data files don't cause it. For example, if I do

data = {'a': 5}

I don't see any leak. Maybe it has something to do with having lots of different strings?
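One way to probe the "lots of different strings" guess is to run the same loop over two synthetic payloads, one with many distinct strings and one with a single repeated string, and compare how much peak RSS grows. This is only a sketch: make_payload, peak_rss and rss_growth are hypothetical helpers, it uses the standard library's resource module (Unix only) instead of psutil, and it falls back to the stdlib json module when yajl is not installed. It is written to run on both Python 2.7 and Python 3.

```python
import json
import resource


def make_payload(n, distinct):
    """Build a list of n strings, either all different or all identical."""
    return ["s%d" % i if distinct else "s" for i in range(n)]


def peak_rss():
    """Peak resident set size so far (kilobytes on Linux, bytes on macOS)."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


def rss_growth(dumps, data, iterations=2000):
    """Call dumps(data) repeatedly and return the increase in peak RSS."""
    before = peak_rss()
    for _ in range(iterations):
        dumps(data)
    return peak_rss() - before


if __name__ == "__main__":
    try:
        import yajl as encoder  # the library under suspicion
    except ImportError:
        encoder = json          # fallback so the sketch still runs
    for distinct in (False, True):
        growth = rss_growth(encoder.dumps, make_payload(500, distinct))
        print("distinct=%s peak RSS growth=%d" % (distinct, growth))
```

If the guess is right, the distinct=True run should show clearly larger growth under yajl and little or none under the stdlib encoder. Because ru_maxrss is a peak value, it can show growth but not memory being returned.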

blake-r commented 10 years ago

Yes, we have the same problem now.

blake-r commented 10 years ago

Strange, but my install still leaks. :(

The only difference between non-leak and leak runs is