lloyd / yajl

A fast streaming JSON parsing library in C.
http://lloyd.github.com/yajl
ISC License

A copy of indent_string from yajl_gen_config args should be stored #198

Open mkorvas opened 7 years ago

mkorvas commented 7 years ago

I am setting up an incremental JSON parser/dumper using yajl-py, the Python bindings for YAJL, and I have observed that YAJL behaves strangely when the indent string I configure is computed on the fly (code that reproduces the issue is below).

From looking at the YAJL code, I gather that it stores only a pointer to the indent string, that is, to the original string constructed by the library user, rather than a copy. This is a problem when the library is used from higher-level languages such as Python, where it is harder for the caller to guarantee that the memory initially holding the indent string is not freed and reused (e.g. on the next run of the garbage collector). YAJL itself should prevent such issues by making its own copy of the indent string.

This is a minimal script that reproduces the issue for me:

#!/usr/bin/env python2.7

import sys
from yajl import YajlGen
from yajl.yajl_gen import yajl_gen_indent_string

class ListDumper(object):
    def __init__(self, gen, indent):
        self._gen = gen
        self._gen._yajl_gen('yajl_gen_config', yajl_gen_indent_string,
                            ' ' * indent)
    def dump(self, obj, outfile):
        # Do a small string operation, this apparently reuses memory
        # earlier used to store (' ' * indent) in my environment.
        'subst ' + str(42)
        self._gen.yajl_gen_array_open()
        for elem in obj:
            self.dump(elem, outfile)
        self._gen.yajl_gen_array_close()
        outfile.write(self._gen.yajl_gen_get_buf())

gen = YajlGen(beautify=True)
dumper = ListDumper(gen, indent=4)
dumper.dump([[]], sys.stdout)

The expected output is:

[
    [

    ]
]

while this program actually outputs:

[
subst 42[

subst 42]
]
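
Until the library copies the string itself, a possible caller-side workaround is to keep the indent buffer alive for as long as the generator exists. The following is only a minimal sketch of that idea, assuming ctypes-based bindings: `IndentHolder` is a hypothetical helper, not part of yajl-py, and the sketch deliberately does not call YAJL, so how the pointer would be passed through the bindings remains an assumption.

```python
import ctypes


class IndentHolder(object):
    """Hypothetical helper (not part of yajl-py): owns the indent bytes."""

    def __init__(self, indent):
        # create_string_buffer allocates a NUL-terminated C buffer; its memory
        # stays valid for as long as this object keeps a reference to it.
        self._buf = ctypes.create_string_buffer(b' ' * indent)

    def pointer(self):
        # A char* view of the buffer, suitable for handing to a C function
        # that stores the pointer instead of copying the string.
        return ctypes.cast(self._buf, ctypes.c_char_p)


holder = IndentHolder(4)  # keep this object alive as long as the generator
ptr = holder.pointer()

# Unrelated string work, like the 'subst ' + str(42) line in the script above.
junk = ['subst ' + str(n) for n in range(1000)]

# The buffer is still intact because holder still references it.
print(repr(ptr.value))
```

In the reproduction script, the equivalent change would be to store the indent value on the `ListDumper` instance before configuring the generator, rather than passing the temporary `' ' * indent`. This only sidesteps the problem on the caller's side; the request that YAJL keep its own copy still stands.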