Closed ghost closed 11 years ago
This is about the STC containers from issue #13. With the STC syntax, this array may be represented as: [<] [i] [0] [1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [>] in the same 14 bytes, but with special markers to separate its processing from that of a regular array.
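To make the 14-byte count concrete, here is a minimal sketch of the STC layout described above. It assumes the draft markers from this comment (`<` to open, a single type marker `i`, then raw one-byte payloads, `>` to close); the function name `encode_stc_int8` is hypothetical and this is not a finalized UBJSON encoding.

```python
def encode_stc_int8(values):
    """Encode small ints as a strongly typed container (draft syntax, assumed).

    Layout assumed from the comment above: '<', one type marker 'i',
    then each value as a single signed byte with no per-value marker, then '>'.
    """
    payload = b"".join(v.to_bytes(1, "big", signed=True) for v in values)
    return b"<" + b"i" + payload + b">"


encoded = encode_stc_int8(list(range(11)))
print(len(encoded))  # 2 header bytes + 11 payload bytes + 1 closing byte
```

The type marker is paid for once, so the overhead stays constant however many values follow, but every element must share that one type.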
@syount I like the idea, more optimal representation of commonly used values is absolutely along the right lines of thinking.
That said, @kxepal pointed to our current thinking along these lines (STC) which is a bit more flexible and gives us the same wins.
I am going to close this request (for that reason) but would be very interested in knowing what you thought of the discussion over on Issue #13 if you had the time.
The current status is that:
I admit that it is entirely possible the importance I am putting around Point 3 is misplaced and my own invention, but I don't make the decision lightly. This is why we are still discussing it/sitting on it.
Would be nice to know what you thought over on #13 though if you had the time.
These digit literal representations are more versatile than the STC representation since they can provide their compact representation in a mixed type array.
I agree that @syount's proposal is different from STC.
The digit literals are useful as the length for small strings. For instance, [s][0] is the empty string rather than [s][B][0], and [s][1][:] is a colon rather than [s][B][1][:].
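A rough sketch of the string case, assuming the notation used in this comment: `s` is the string marker, `B` introduces a one-byte length, and a digit literal is a single ASCII character `0` to `9` standing in for the length. Both function names are hypothetical.

```python
def encode_string_current(text):
    """[s][B][length][bytes...]: length carried as a typed one-byte value (assumed)."""
    data = text.encode("utf-8")
    # to_bytes(1, ...) raises OverflowError for lengths above 255;
    # longer strings would need a wider length type, omitted in this sketch.
    return b"s" + b"B" + len(data).to_bytes(1, "big") + data


def encode_string_digit_literal(text):
    """[s][digit][bytes...] when the length is 0..9, else fall back (assumed)."""
    data = text.encode("utf-8")
    if len(data) <= 9:
        return b"s" + str(len(data)).encode("ascii") + data
    return encode_string_current(text)


for sample in ("", ":"):
    print(repr(sample),
          len(encode_string_current(sample)),
          len(encode_string_digit_literal(sample)))
```

Under these assumptions the saving is exactly one byte per string, and only for strings of at most nine bytes.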
This proposal gets my vote, even though it has been closed.
The first example is STC. @breese, your example is a one-char (ASCII) addition.
What if I want strings with lengths of 23, 14, 18...? It is only OK when the length fits into one byte. Anyway, with a large number of literals we also lose speed and simplicity of analysis.
Just to clarify, I closed this not because it was a poor idea, but because it increases the complexity of generating (and especially parsing) UBJSON. We can absolutely consider this again in the future; I just want to get the high points of the spec done first before getting into micro-optimizations like this.
Wanted to know what you guys thought...
Given the JSON Array: [ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 ]
The current UBJSON is: [[] [i][0] [i][1] [i][2] [i][3] [i][4] [i][5] [i][6] [i][7] [i][8] [i][9] [i][10] []]
Whereas with digit literal types the UBJSON could be: [[] [0] [1] [2] [3] [4] [5] [6] [7] [8] [9] [i][10] []]
The space savings in this example are significant: 24 bytes vs 14 bytes
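The byte counts can be checked with a small sketch. It assumes `[` and `]` as one-byte array markers, `i` followed by one signed byte for an int8, and the proposed digit literals as single ASCII characters `0` to `9`; the function names are hypothetical and nothing here is a finalized encoding.

```python
def encode_int8_array(values):
    """Current form: every element is [i] plus one value byte."""
    out = bytearray(b"[")
    for v in values:
        out += b"i" + v.to_bytes(1, "big", signed=True)
    out += b"]"
    return bytes(out)


def encode_with_digit_literals(values):
    """Proposed form: 0..9 become a single marker byte; others stay [i][value]."""
    out = bytearray(b"[")
    for v in values:
        if 0 <= v <= 9:
            out += str(v).encode("ascii")  # one byte, no separate payload
        else:
            out += b"i" + v.to_bytes(1, "big", signed=True)
    out += b"]"
    return bytes(out)


values = list(range(11))  # [0, 1, ..., 10]
print(len(encode_int8_array(values)))           # 1 + 11*2 + 1
print(len(encode_with_digit_literals(values)))  # 1 + 10*1 + 2 + 1
```

Each literal decides per element, so the saving also applies inside a mixed-type array, at the cost of ten extra markers a parser has to recognize.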
Thoughts?