dmeranda / demjson

Python module for JSON data encoding, including jsonlint. See the project Wiki here on Github. Also read the README at the bottom of this page, or the project homepage at
http://deron.meranda.us/python/demjson/

Option to mandate unique key names in objects #1

Closed behnam closed 10 years ago

behnam commented 11 years ago

RFC 4627 (Section 2.2. Objects) allows duplicate key names in objects, but many applications cannot handle them. So let's add an option to make key name uniqueness mandatory.

I'm going to work on this. Just let me know if you have a good short option name for this. I'm thinking of "uniq-keys" right now.
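For illustration only, here is a minimal sketch of what a "uniq-keys" check could look like. It uses Python's standard `json` module and its `object_pairs_hook` parameter rather than demjson itself; `DuplicateKeyError`, `reject_duplicates`, and `loads_unique` are hypothetical names, not part of any library.

```python
import json


class DuplicateKeyError(ValueError):
    """Raised when a JSON object repeats a key name (hypothetical helper)."""


def reject_duplicates(pairs):
    # pairs holds the (key, value) tuples of one JSON object, in document order.
    seen = {}
    for key, value in pairs:
        if key in seen:
            raise DuplicateKeyError(f"duplicate key name: {key!r}")
        seen[key] = value
    return seen


def loads_unique(text):
    # The hook runs for every object in the document, nested ones included.
    return json.loads(text, object_pairs_hook=reject_duplicates)
```

With this sketch, `loads_unique('{"a": 1, "a": 2}')` raises `DuplicateKeyError`, while a document with unique key names decodes to an ordinary dict.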

dmeranda commented 11 years ago

Hi Behnam,

Thanks for the suggestion. I should mention that I'm in the last phases of developing a pretty massive update to demjson, so unless you've already done most of the work or need it immediately, you may want to hold off.

Most of the changes in the next release involve a much more sophisticated error handling and reporting system, mainly because most users of demjson seem to use it for its lint-like checking features. The new version in the works will already raise a warning for duplicate keys. I can certainly make it controllable whether duplicates raise a warning or an error.

I will try to see if I can get the new version pushed up to github before the end of this year.

Thanks again for your feedback. Deron

On Fri, Dec 14, 2012 at 11:51 PM, Behnam Esfahbod notifications@github.com wrote:

RFC 4627 (Section 2.2. Objects) allows duplicate key names in objects, but this is not the case for many of the applications. So let's add an option to make key name uniqueness mandatory.

I'm going to work on this. Just let me know if you have a good short option name for this. I'm thinking of "uniq-keys" right now.


Deron Meranda http://deron.meranda.us/

behnam commented 11 years ago

Thanks for the note, Deron. Sounds like a pretty good plan. There's no rush here, so I'm gonna wait to get your update first.

Let me also add another point. The RFC uses "SHOULD" for key name uniqueness, so maybe it's even better to make it the default behavior, eventually. I think the best way to get there would be:

  1. Add two options (let's call them "uniq-keys" and "duplicate-keys", for now);
  2. Have "duplicate-keys" implicitly enabled, BUT warn the user (on stderr) that this is deprecated and suggest using one of the options explicitly;
  3. Change the default behavior after a year or so, with a major-version bump.
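The transition in step 2 could be sketched roughly as follows. This again uses the standard `json` module, not demjson; the `decode` function and its `unique_keys` parameter are hypothetical stand-ins for the two proposed options (`True` for "uniq-keys", `False` for "duplicate-keys", `None` when neither is given). Warning only when a duplicate is actually encountered is an assumption of this sketch.

```python
import json
import warnings


def _make_hook(unique_keys):
    def hook(pairs):
        result = {}
        for key, value in pairs:
            if key in result:
                if unique_keys is True:
                    # "uniq-keys": duplicates are rejected outright.
                    raise ValueError(f"duplicate key name: {key!r}")
                if unique_keys is None:
                    # Neither option given: keep the old behavior, but warn.
                    warnings.warn(
                        "implicitly allowing duplicate keys is deprecated; "
                        "pass unique_keys=True or unique_keys=False",
                        DeprecationWarning,
                        stacklevel=2,
                    )
            result[key] = value  # the last value for a key wins
        return result

    return hook


def decode(text, unique_keys=None):
    return json.loads(text, object_pairs_hook=_make_hook(unique_keys))
```

Step 3 would then amount to changing the default from `None` to `True` in a later major release.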

Thanks, -Behnam

dmeranda commented 11 years ago

A fundamental principle for demjson is that in its default operation (no options, etc.) it will adhere as strictly as possible to the JSON specification. So it will always allow duplicate keys (on decoding) unless an option is explicitly given to treat these cases differently. Of course, what it does when it gets duplicate keys is left unspecified by JSON; I could perhaps add an option to create a multidict in those cases as well. BTW see RFC 2119 for exactly what words like "SHOULD" mean.
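To make the multidict idea concrete, here is a small sketch using Python's standard `json` module (not demjson, whose behavior may differ). By default `json.loads` keeps only the last value for a repeated name; the hypothetical `collect_values` hook instead keeps every value in a list.

```python
import json


def collect_values(pairs):
    # Group the values of one JSON object by key, preserving document order.
    grouped = {}
    for key, value in pairs:
        grouped.setdefault(key, []).append(value)
    return grouped


def loads_multidict(text):
    # Every key maps to a list, even when it appears only once.
    return json.loads(text, object_pairs_hook=collect_values)


text = '{"a": 1, "a": 2, "b": 3}'
print(json.loads(text))       # {'a': 2, 'b': 3}
print(loads_multidict(text))  # {'a': [1, 2], 'b': [3]}
```

Mapping every key to a list keeps the result shape uniform, at the cost of an extra level of nesting for the common case of unique keys.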

I should also note that in my new upcoming version demjson makes a distinction between "errors" and "warnings". Anything that is not strictly permitted by the JSON spec (or overridden by an option) will result in an error, and anything that is allowed but problematic (such as duplicate keys) will result in a warning.
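The error/warning split can be illustrated with a short sketch. It is built on the standard `json` module rather than demjson's own reporting system, and `decode_with_warnings` is a hypothetical name: invalid JSON raises an exception (an "error"), while duplicate keys are only recorded and decoding continues (a "warning").

```python
import json


def decode_with_warnings(text):
    warnings_found = []

    def hook(pairs):
        result = {}
        for key, value in pairs:
            if key in result:
                # Allowed by the spec but problematic: record it, keep going.
                warnings_found.append(f"duplicate key name: {key!r}")
            result[key] = value
        return result

    # Anything that is not valid JSON raises json.JSONDecodeError here.
    value = json.loads(text, object_pairs_hook=hook)
    return value, warnings_found
```

A caller that wants duplicates treated as errors can simply check whether the returned list is non-empty and raise.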

Also, I only consider this an issue with JSON decoding. demjson does not provide any way to encode a JSON document that would contain duplicate keys, nor do I envision providing that ability unless some persuasive use-case comes up.

Deron Meranda http://deron.meranda.us/

behnam commented 11 years ago

Right, it's a matter of opinion how "strict" is defined based on the definition of "SHOULD". IMHO in a "strict" environment, the user is not asking for a "loose" condition, which in our case is "existence of valid reasons to have duplicate key names".

AFAIK, duplicate key names would result in either a parse error or data loss in most applications, and I think rejecting them sounds "stricter" than what we have here now.

Anyway, it's up to you. Will stay in touch. :)

dmeranda commented 10 years ago

After a long delay, I've finally released version 2.0. It can now either warn or raise an error when it detects duplicate keys.

Check out http://deron.meranda.us/python/demjson/ for changes and documentation.