Note: I've aimed to keep the API backwards compatible, but I've made some additions. How these additions should actually work is open to discussion.
I think this patch addresses the following issues:
1 - Make more awesome
14 - Steal stream ideas from yajl-ruby
31 - Undefined symbols in python 3.2
32 - Memory leak when dumps()'ing large objects
The main improvement is that you can now write something like:
    import sys
    import yajl

    # Each iteration yields one decoded JSON value read from stdin.
    for i in yajl.Decoder(allow_multiple_values=True, stream=sys.stdin):
        print(i)
This lets you iterate over a stream of JSON objects read from the process's standard input. It's not too slow either: on my Core 2 Duo, a yajl-based producer/consumer pair connected via a Unix pipe and exchanging a very basic object can process about 100,000 JSON objects per second.
I don't think the iterator will handle non-blocking sockets very well at the moment. It may not handle them at all, as I've not tested that yet. If the file object is in blocking mode, the iterator handles it fine.
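For anyone who wants to try the pipe setup, here is a rough sketch of what the producer side could look like. It is not the script I benchmarked with: `write_objects` and the object layout are made up for illustration, and it uses the standard library `json` module so it runs without the patch installed.

```python
import json


def write_objects(out, count):
    """Write `count` small JSON objects to `out`, one per line.

    Hypothetical helper for illustration only; the object layout is
    an assumption, not the one used for the timing quoted above.
    """
    for n in range(count):
        out.write(json.dumps({"id": n, "name": "item"}))
        out.write("\n")
```

Calling `write_objects(sys.stdout, 100000)` at the end of a script and piping its output into the consumer loop above should reproduce the general arrangement.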
I've also fixed a number of bugs relating to memory management (including issue #32) along the way.
Summary of changes:
Maintain backwards compatibility
Upgrade to a newer version of yajl
Change the yajl decoder class to decode objects into an internal Python list (each read while iterating may decode zero, one, or several objects, yet each iteration must return exactly one)
Add an iterator method to the decoder
Add a len() method to the decoder (though I'm not really sure this is needed; it returns the size of the internal list)
The decoder now takes 3 optional arguments when being initialised:
    allow_multiple_values - true / false - allow yajl to continue decoding past the first value
    stream - a file-like object to read from when iterating
    bufsize - integer - the size of each read performed internally when iterating over a stream
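To make the internal-list idea concrete, here is a minimal pure-Python sketch of the same buffering scheme. It is not the C code in this patch: `BufferedStreamDecoder` is a hypothetical name, it uses the standard library `json` module instead of yajl, and it assumes every top-level value is an object or an array (a bare number split across two reads would be decoded too early).

```python
import json
from collections import deque


class BufferedStreamDecoder(object):
    """Hypothetical stand-in for the patched decoder, for illustration only."""

    def __init__(self, stream, bufsize=4096):
        self.stream = stream      # file-like object whose read() returns text
        self.bufsize = bufsize    # size of each read while iterating
        self._text = ""           # undecoded input carried between reads
        self._values = deque()    # decoded values not yet returned
        self._json = json.JSONDecoder()

    def __len__(self):
        # Number of values decoded but not yet handed to the caller.
        return len(self._values)

    def __iter__(self):
        return self

    def __next__(self):
        # A single read may complete zero, one, or several values, so keep
        # reading until at least one is queued, then return exactly one.
        while not self._values:
            chunk = self.stream.read(self.bufsize)
            if not chunk:
                if self._text.strip():
                    raise ValueError("stream ended in the middle of a value")
                raise StopIteration
            self._text += chunk
            self._drain()
        return self._values.popleft()

    next = __next__  # Python 2 spelling of the iterator method

    def _drain(self):
        """Queue every complete value currently held in the text buffer."""
        while True:
            text = self._text.lstrip()
            if not text:
                self._text = ""
                return
            try:
                value, end = self._json.raw_decode(text)
            except ValueError:
                # Treated as incomplete input; wait for the next read.
                self._text = text
                return
            self._values.append(value)
            self._text = text[end:]
```

With a large enough `bufsize`, one read can queue several values at once, which is the case len() reports on. One difference worth noting: this sketch cannot tell malformed input from incomplete input until the stream ends, whereas a real streaming parser such as yajl can.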
The unit tests still pass, and the current version also supports Python 3, though I really should write more unit tests for the new features and some documentation to accompany them.