samipshah / simpleubjson

Automatically exported from code.google.com/p/simpleubjson
BSD 2-Clause "Simplified" License

[S] marker used for names in Object's name/value pairings #9

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
Steps to reproduce:

>>> import simpleubjson
>>> ubjdata = simpleubjson.encode({'hello' : 'world'})
>>> ubjdata
'{Si\x05helloSi\x05world}'
>>> simpleubjson.pprint(ubjdata)
[{]
    [S] [i] [5] [hello]
    [S] [i] [5] [world]
[}]

The output should be:

'{i\x05helloSi\x05world}'

[{]
   [i][5][hello][S][i][5][world]
[}]

Tested on:

Python 2.7.3 (default, Feb 27 2014, 19:58:35) 

DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=12.04
DISTRIB_CODENAME=precise
DISTRIB_DESCRIPTION="Ubuntu 12.04.4 LTS"

>>> simpleubjson.version
<module 'simpleubjson.version' from '/usr/local/lib/python2.7/dist-packages/simpleubjson-0.7.0-py2.7.egg/simpleubjson/version.pyc'>

Additional information:

According to the Draft 9 specification (Type Reference - Container 
Types - Object Type):
"NOTE: The [S] (string) marker is omitted from each of the names
in the name/value pairings inside the object. The JSON specification
does not allow non-string name values, therefore the [S] marker is 
redundant and must not be used."

Also, decoding UBJSON binary data that conforms to the above rule fails with 
simpleubjson.
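
To make the expected byte layout concrete, here is a minimal, standalone sketch of an encoder that follows the quoted rule. It does not use simpleubjson at all; the helper names `encode_short_string` and `encode_flat_object` are hypothetical, and it only handles flat objects whose names and values are strings of at most 127 UTF-8 bytes.

```python
import struct


def encode_short_string(text, with_marker=True):
    """Encode a short string as [S][i][length][bytes]; [S] is optional."""
    data = text.encode("utf-8")
    if len(data) > 127:
        raise ValueError("sketch only handles strings up to 127 bytes")
    # 'i' is the int8 marker, followed by the length as one signed byte.
    body = b"i" + struct.pack(">b", len(data)) + data
    return (b"S" + body) if with_marker else body


def encode_flat_object(mapping):
    """Encode a flat {str: str} mapping, omitting [S] on the names."""
    out = [b"{"]
    for name, value in mapping.items():
        # Names are always strings, so the [S] marker is left out here.
        out.append(encode_short_string(name, with_marker=False))
        # Values keep their [S] marker.
        out.append(encode_short_string(value))
    out.append(b"}")
    return b"".join(out)


print(encode_flat_object({"hello": "world"}))
```

For `{'hello': 'world'}` this yields the same bytes as the expected output shown above.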

Original issue reported on code.google.com by mihon...@gmail.com on 13 Nov 2014 at 11:02

GoogleCodeExporter commented 9 years ago
Thanks for the report. Simpleubjson uses Draft-8 by default. This needs to be fixed, 
but Draft-9 is also implemented and you can use it now:
>>> simpleubjson.encode({'hello' : 'world'}, spec='draft-9')

Original comment by kxepal on 13 Nov 2014 at 11:13

GoogleCodeExporter commented 9 years ago
Sorry, Draft-10 is the current draft now; Draft-9 is the one implemented. Anyway, 
I will fix this soon.

Original comment by kxepal on 13 Nov 2014 at 11:16

GoogleCodeExporter commented 9 years ago
Explicitly specifying the draft-9 spec does not seem to help:

>>> import simpleubjson
>>> simpleubjson.encode({'hello' : 'world'}, spec='draft-9')
'{Si\x05helloSi\x05world}'

The extra 'S' marker is still present for the name string 'hello'.

I'm implementing a C++ library against the draft-9 specification. My goal is to 
test the C++ library against simpleubjson.

Original comment by mihon...@gmail.com on 19 Nov 2014 at 7:44

GoogleCodeExporter commented 9 years ago
Yes, that's a bug. Most likely my implementation dates back to the days when 
Draft-9 had not yet changed strings. Thanks!

Original comment by kxepal on 19 Nov 2014 at 9:27

GoogleCodeExporter commented 9 years ago
After re-checking the draft versions and discussions, it seems that the 'S' 
marker was removed from name strings only in draft-10. So this was 
apparently my mistake. Since I'm developing a draft-9 compatible parser, it will 
use the 'S' markers. Thanks!

Original comment by mihon...@gmail.com on 3 Dec 2014 at 6:59

GoogleCodeExporter commented 9 years ago
Is there any estimate of when a simpleubjson release supporting the latest draft 
(draft-10), currently published on ubjson.org, will become available?

Original comment by mihon...@gmail.com on 10 Dec 2014 at 6:06

GoogleCodeExporter commented 9 years ago
I'll handle it over the weekend, when I have enough time to do it in a single 
shot. Draft 10 contains significant changes to the Array type which need to be 
handled carefully and correctly. Stay tuned!

Original comment by kxepal on 10 Dec 2014 at 6:20

GoogleCodeExporter commented 9 years ago
Cool! I had to add typed arrays to my C++ implementation. Originally I 
planned to stick with draft-9 and use the String type for raw binary data, but the 
requirement for UTF-8 encoding makes that impossible. Anyway, I will verify my 
own implementation's compliance with draft-10 against simpleubjson.

Original comment by mihon...@gmail.com on 10 Dec 2014 at 11:36

GoogleCodeExporter commented 9 years ago
I know that feeling. Hopefully, with draft 10 you can use a typed array of uint8 
to handle any binary data with just 8±2 bytes of overhead, depending on its size.
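
As a rough illustration of where that overhead comes from, here is a standalone sketch that does not use simpleubjson. It assumes the optimized container layout published on ubjson.org, where `$` fixes the element type, `#` gives the element count, and no closing `]` is written when both are present. The function name `encode_uint8_array` is hypothetical.

```python
import struct


def encode_uint8_array(payload):
    """Wrap raw bytes as an optimized UBJSON array of uint8 values."""
    n = len(payload)
    # Pick the smallest integer type that can hold the element count
    # (UBJSON integers are big-endian).
    if n <= 0xFF:
        count = b"U" + struct.pack(">B", n)   # uint8 count
    elif n <= 0x7FFF:
        count = b"I" + struct.pack(">h", n)   # int16 count
    elif n <= 0x7FFFFFFF:
        count = b"l" + struct.pack(">i", n)   # int32 count
    else:
        count = b"L" + struct.pack(">q", n)   # int64 count
    # '[' opens the array, '$U' declares uint8 elements, '#' introduces
    # the count; the payload follows with no per-element markers.
    return b"[$U#" + count + bytes(payload)


for size in (2, 300, 70000):
    encoded = encode_uint8_array(b"\x00" * size)
    print(size, len(encoded) - size)
```

Under these assumptions the fixed prefix is 4 bytes plus 2, 3, or 5 bytes for the count, so payloads whose length fits in an int32 cost 6 to 9 bytes of overhead.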

Original comment by kxepal on 10 Dec 2014 at 11:42