cognitect / transit-format

A data interchange format.

Processing algorithm should be more explicit in the docs #17

Open jsl opened 9 years ago

jsl commented 9 years ago

As I've been reading the docs I've noticed a couple of things that I'd like to see more explicitly stated regarding encoding/decoding order:

First, the compression/decompression algorithm is described as recording a value, or looking one up, in a particular order. That order depends on the tree traversal strategy, so it would make sense to state the strategy explicitly. I assumed that cache keys are stored and looked up in depth-first order, and this turned out to be the case in the implementations I checked. Still, it should be explicit in the docs: otherwise someone could plausibly write a client that uses another strategy, whether for deliberately different semantics or for efficiency.
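To illustrate why the traversal strategy matters, here is a minimal Python sketch that lists the order in which cacheable strings would be recorded under two strategies. It is not taken from any Transit implementation: `is_cacheable`, `depth_first_order`, and `breadth_first_order` are hypothetical names, the data is plain Python dicts and lists rather than encoded Transit, and the cacheability rule is a simplified reading of the spec (strings longer than three characters that are map keys or start with `~:`, `~$`, or `~#`).

```python
from collections import deque


def is_cacheable(s, as_map_key):
    # Assumption: simplified version of the cacheability rule in the spec.
    return len(s) > 3 and (as_map_key or s[:2] in ("~:", "~$", "~#"))


def depth_first_order(node, as_map_key=False, out=None):
    """Return cacheable strings in the order a depth-first walk meets them."""
    if out is None:
        out = []
    if isinstance(node, dict):
        for key, value in node.items():
            depth_first_order(key, True, out)
            depth_first_order(value, False, out)
    elif isinstance(node, list):
        for item in node:
            depth_first_order(item, False, out)
    elif isinstance(node, str) and is_cacheable(node, as_map_key):
        if node not in out:  # a repeat would be a cache hit, not a new entry
            out.append(node)
    return out


def breadth_first_order(root):
    """Same bookkeeping, but visiting the tree level by level."""
    out = []
    queue = deque([(root, False)])
    while queue:
        node, as_map_key = queue.popleft()
        if isinstance(node, dict):
            for key, value in node.items():
                queue.append((key, True))
                queue.append((value, False))
        elif isinstance(node, list):
            queue.extend((item, False) for item in node)
        elif isinstance(node, str) and is_cacheable(node, as_map_key):
            if node not in out:
                out.append(node)
    return out


doc = [{"outer": {"inner": 1}}, {"other": 2}]
dfs = depth_first_order(doc)    # ['outer', 'inner', 'other']
bfs = breadth_first_order(doc)  # ['outer', 'other', 'inner']
```

With this input the two strategies give `inner` and `other` different positions, so a writer and a reader that disagreed on the traversal would resolve the same cache code to different strings.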

Second, in response to a recent question that I posted to the mailing list, Rich mentioned that writers should encode in either verbose or concise mode, but readers should be able to process constructs from verbose or concise mode at any time. This implies that the reader must always start recording cacheable values that it encounters from the moment it starts reading any kind of structure (via depth-first traversal), since there isn't anything in the syntax for indicating a 'mode switch' from verbose to concise mode.

Again, this latter point worked as I expected when playing with the Ruby implementation, but I'd like to see this explicitly mentioned.

Please let me know what you think. Thanks!

jlouis commented 9 years ago

I certainly agree on this one. The current processing order is exactly the same as in, say, YAJL, but it is nice to make this explicit.

timewald commented 9 years ago

You are correct on both points. The tree is always processed via a depth-first traversal, both in reading and writing. And, yes, JSON readers must assume that caching may be used, so they must record cacheable values from the start, even if a writer never emits cache codes. I will update the spec to make these aspects more explicit.
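The reader behaviour confirmed above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not code from any Transit library: `ReadCache`, `code_to_index`, and `decode` are hypothetical names, the input is already-parsed JSON, tagged values such as `~:keyword` are left as plain strings, and the cache-code layout (`^` followed by one or two base-44 digits starting at ASCII 48, with a reset at 44 * 44 entries) is my reading of the spec and should be checked against it.

```python
BASE = 44        # assumption: cache codes use base-44 digits
FIRST_ORD = 48   # assumption: digit 0 is ASCII 48 ('0')
MAP_AS_ARRAY = "^ "  # marker that opens a map written as a JSON array


def is_cacheable(s, as_map_key):
    # Assumption: simplified version of the cacheability rule in the spec.
    return len(s) > 3 and (as_map_key or s[:2] in ("~:", "~$", "~#"))


def code_to_index(code):
    digits = code[1:]
    if len(digits) == 1:
        return ord(digits) - FIRST_ORD
    return (ord(digits[0]) - FIRST_ORD) * BASE + (ord(digits[1]) - FIRST_ORD)


class ReadCache:
    def __init__(self):
        self.entries = []

    def read(self, s, as_map_key):
        if s.startswith("^") and s != MAP_AS_ARRAY:
            return self.entries[code_to_index(s)]  # cache code: look it up
        if is_cacheable(s, as_map_key):
            if len(self.entries) == BASE * BASE:
                self.entries = []                  # cache full: start over
            self.entries.append(s)                 # always record, in any mode
        return s


def decode(node, cache, as_map_key=False):
    """Depth-first walk that accepts verbose and concise constructs alike."""
    if isinstance(node, dict):                     # verbose-style map
        result = {}
        for key, value in node.items():
            decoded_key = decode(key, cache, True)
            result[decoded_key] = decode(value, cache)
        return result
    if isinstance(node, list):
        if node and node[0] == MAP_AS_ARRAY:       # concise-style map
            result = {}
            for i in range(1, len(node), 2):
                decoded_key = decode(node[i], cache, True)
                result[decoded_key] = decode(node[i + 1], cache)
            return result
        return [decode(item, cache) for item in node]
    if isinstance(node, str):
        return cache.read(node, as_map_key)
    return node


# The first map spells out its key; the second refers to it by cache code.
mixed = [{"name": "Ann"}, ["^ ", "^0", "Bob"]]
decoded = decode(mixed, ReadCache())
```

Because the reader records `name` while reading the first map, the code `^0` in the second map resolves correctly even though nothing in the input signals a switch between modes.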