ubjson / universal-binary-json

Community workspace for the Universal Binary JSON Specification.

Specify framing of UBJSON in streams #83

Closed Elizafox closed 7 years ago

Elizafox commented 7 years ago

For some background about what I mean, see the Wikipedia article on JSON streaming.

Right now, as far as I can tell, the assumption is that you have a streaming parser and that concatenated UBJSON is the norm. You may not always have a streaming parser, and even if you do, streaming parsers come with their own complexity (suspending and resuming parse state) and raise the bar for parser implementation authors. It's also useful to know in advance (especially in languages like C) how big a buffer you need to allocate for a frame.
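To illustrate the buffer-size point, here is a minimal sketch of length-prefixed framing, one common way to frame messages in a stream. It is not part of the UBJSON specification, and the names `write_frame` and `read_frame` as well as the 4-byte big-endian length header are assumptions made purely for this example. The payload is treated as opaque bytes.

```python
import io
import struct


def write_frame(stream, payload: bytes) -> None:
    """Write one frame: a 4-byte big-endian length, then the payload bytes."""
    stream.write(struct.pack(">I", len(payload)))
    stream.write(payload)


def read_frame(stream):
    """Read one frame and return its payload, or None at a clean end of stream.

    Assumes a blocking, buffered stream whose read(n) returns n bytes unless
    the stream ends; a raw socket would need a loop around recv().
    """
    header = stream.read(4)
    if not header:
        return None
    if len(header) < 4:
        raise EOFError("truncated frame header")
    (size,) = struct.unpack(">I", header)
    # The size is known before the payload is read, so a fixed-size buffer
    # could be allocated here (the point made above about C).
    payload = stream.read(size)
    if len(payload) < size:
        raise EOFError("truncated frame payload")
    return payload


if __name__ == "__main__":
    buf = io.BytesIO()
    write_frame(buf, b"[i\x01i\x02]")  # UBJSON bytes for the array [1, 2]
    write_frame(buf, b"Z")             # UBJSON null
    buf.seek(0)
    while (frame := read_frame(buf)) is not None:
        print(len(frame), frame)
```

With a header like this, the receiver can hand each complete payload to a non-streaming parser, at the cost of the sender needing to know the encoded size up front.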

I think this case would be much easier to handle if UBJSON had some standard ways of framing messages; at the very least, it would be ideal to make current practise explicit.

Here's what I propose for standardising (more could be added):

rkalla commented 7 years ago

Eliza, I'm closing this because I feel it has been addressed, but I'll give you some backstory.

When UBJSON was originally specced in 2011, there were NO length indicators; absolutely everything was streaming (except for Strings).

Performance-minded discussions, motivated by exactly the points you make above, are what drove the inclusion of length specifications from 2012 to 2015, both on the types and on containers.

Further optimizations made around 2014-15 allowed type markers to be omitted from containers when lengths were specified, allowing for some really efficient data representation for high-performance applications (this is what I mean when I say it has been addressed).
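As a rough illustration of that optimization, the sketch below encodes the same list of small integers twice: once as a plain array with a type marker on every element, and once using the optimized container header (`$` for the element type, `#` for the count) as I understand it from Draft 12 of the spec. The function names are made up for this example, and it handles only int8 values and counts up to 127 to keep it short.

```python
import struct


def encode_int8_array_plain(values):
    """Plain array: '[' then an 'i' marker before every element, then ']'."""
    out = bytearray(b"[")
    for v in values:
        out += b"i" + struct.pack(">b", v)
    out += b"]"
    return bytes(out)


def encode_int8_array_typed(values):
    """Optimized array: '[', '$' + type, '#' + count, then bare payload bytes.

    No per-element markers and no closing ']' are written, because the
    reader already knows the element type and how many elements follow.
    """
    count = len(values)
    if count > 127:
        raise ValueError("this sketch only writes an int8 count")
    out = bytearray(b"[$i#i")
    out += struct.pack(">b", count)
    out += struct.pack(f">{count}b", *values)
    return bytes(out)


if __name__ == "__main__":
    data = list(range(100))
    print(len(encode_int8_array_plain(data)))  # two bytes per element plus brackets
    print(len(encode_int8_array_typed(data)))  # one byte per element plus a fixed header
```

The fixed header costs a few bytes, so the typed form only pays off once the array has more than a handful of elements; it also tells the reader the payload size before any element is parsed.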

The spec will continue to support both use cases and not drop one in favor of the other.

The only enhancement we could never settle on (for performance) was the type-definition system... it increased the complexity of the specification by an order of magnitude, but shrank data representation by the same amount... discussing these trade-offs takes years.