rtyler / py-yajl

py-yajl provides Python bindings for the Yajl JSON encoder/decoder library
http://rtyler.github.com/py-yajl

problem with decoding unicode #27

Open lloyd opened 13 years ago

lloyd commented 13 years ago

(moved over here from yajl proper)

https://github.com/lloyd/yajl/issues/20

jerzyk commented 13 years ago

For completeness, here is the original ticket content:

When given unicode data, the library throws an exception:

In [1]: a = u'[{"data":"Podstawow\u0105 opiek\u0119 zdrowotn\u0105"}]'
In [2]: import yajl
In [3]: yajl.loads(a)
---------------------------------------------------------------------------
UnicodeEncodeError                        Traceback (most recent call last)
/home/cms/sobre-cms/<ipython console> in <module>()
UnicodeEncodeError: 'ascii' codec can't encode character u'\u0105' in position 19: ordinal not in range(128)

In [4]: import simplejson
In [5]: simplejson.loads(a)
Out[5]: [{u'data': u'Podstawow\u0105 opiek\u0119 zdrowotn\u0105'}]

The same input decodes fine with other libraries (here, simplejson).

teepark commented 13 years ago

Fixed, at least for Decoder.decode(), in cafdd07ee4c32239c8043b508d3f2fc842db2cae

The issue is the use of the z# format unit for argument parsing (see http://docs.python.org/c-api/arg.html). The docs there don't say how it encodes a unicode object into a char buffer, but it's clearly not "use utf8", so the change just gets explicit about that.
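To illustrate the idea from the Python side, here is a minimal sketch, not the actual C change in the commit. It assumes the failure comes from an implicit ASCII conversion of the unicode input, and that encoding explicitly to UTF-8 before parsing avoids it. The helper name `to_parser_bytes` is hypothetical, and the standard-library `json` module stands in for yajl so the snippet runs without the extension installed.

```python
import json

# Same payload as in the original report (contains non-ASCII Polish letters).
text = u'[{"data":"Podstawow\u0105 opiek\u0119 zdrowotn\u0105"}]'


def to_parser_bytes(value):
    """Hypothetical helper: build the byte buffer a C parser would read.

    Text is encoded explicitly as UTF-8 instead of relying on any
    implicit default encoding; byte strings pass through unchanged.
    """
    if isinstance(value, bytes):
        return value
    return value.encode("utf-8")


# An ASCII conversion of the same text fails, which is the assumed
# cause of the UnicodeEncodeError in the report.
ascii_error = None
try:
    text.encode("ascii")
except UnicodeEncodeError as exc:
    ascii_error = exc

# Explicit UTF-8 encoding succeeds and yields a plain byte buffer.
buf = to_parser_bytes(text)

# json is only a stand-in for yajl here; it parses the UTF-8 data back
# into the same structure simplejson returned in the report.
decoded = json.loads(buf.decode("utf-8"))
```

With the explicit encoding step, the parser only ever sees UTF-8 bytes, so the result no longer depends on the interpreter's default encoding.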

teepark commented 13 years ago

aaaand that change isn't decrefing the pybuffer in the success case. wait for a formal pull request pls