ubjson / universal-binary-json

Community workspace for the Universal Binary JSON Specification.
115 stars 12 forks source link

Degradation: using C for object keys #46

Closed kxepal closed 10 years ago

kxepal commented 10 years ago

In Draft 9 C marker was introduced as special optimization case for strings and actually was allowed to be used in objects for keys. Since Draft 10 objects received "optimized" format which implicitly assume that all keys are marked with S. So the data:

[{][C][a][U][1][}]

is invalid for Draft 10 since it have to be

[{][U][1][a][U][1][}]

This is known issue?

meisme commented 10 years ago

I believe so. If it was up to me I'd drop the C marker entirely.

ghost commented 10 years ago

@kxepal Can you clarify where in the spec I state that [C] is allowed as names in name-value pairs inside of Objects?

I don't explicitly state it is NOT allowed, but I do clarify in the spec that all "names" (or keys) are string values hence the [S] marker is not required (but the size is) -- it was not intended to allow [C] as a name value because it makes parsing suck and very inconsistent (either find 'i' or treat the value as a CHAR, etc.)

As for the motivation of [C], it is not unlike #43 -- in certain (semi-common) cases, the space savings is 50% over a standard [S]. It was a simple, relatively unobtrusive addition which is why it was made.

This is the motivation behind the addition of most of the numeric types; in service to space-savings.

kxepal commented 10 years ago

@thebuzzmedia that's a problem that this moment is missed in Draft 10 spec. Here is your comment about Draft 9: https://github.com/thebuzzmedia/universal-binary-json/issues/21#issuecomment-16361320

ghost commented 10 years ago

@kxepal Ahhh, ok - I made the [C] change (and that comment chain) before I made the "all key names are Strings; therefore we will remove the 'S' marker" decision.

What pushed me over the edge with "all key names are Strings" is because I specifically had people asking to make it ANY time (key == [i][8] would have been valid) -- this was a cool idea for UBJSON in a vacuum, but horrible for UBJSON -> JSON compatibility, so once I saw the direction people wanted to go, I firmed up the spec that the key names HAD to be Strings. And since they were Strings, we could get rid of the [S] marker.

At that point, you are right, that invalidated the use of [C] or anything else as a key type because it would make parsing shoddy.

I need to add clarification to the spec that key types are ONLY Strings (unless someone wants to submit a proposal to the contrary).

Would that close this issue for you?

kxepal commented 10 years ago

+1 for clarification of object keys and C marker definition (additionally, to not leave confusion about it nature after reading related section)

Not a big loss, but brings more order to specification. Yes, it will.

ghost commented 10 years ago

Your wish is my command! :)

I'll add the clarification then close this bug out.

ghost commented 10 years ago

I went back through and read the definition of both the OBJECT construct and the CHAR construct and I think it is pretty clearly implied that you cannot use [C] as a name marker in the object (Especially since the name construction itself is incomplete and implies a STRING in every case).

I made a few edits and then backed them out when I realized it felt forced and unnecessary.

If you truly feel clarification is needed around a certain part of the spec re-open this and let me know where and I'll add it.