ssbc / bipf

Binary json codec optimized for in-place access
MIT License

Figure out proper encoding for small numbers #2

Open staltz opened 4 years ago

staltz commented 4 years ago

From https://github.com/dominictarr/bipf/pull/1#issuecomment-685182454

btw, i didn't quite consider this production ready yet. it currently makes small numbers bigger, because a double is 8 bytes. was considering making it into two varints as in scientific notation 10e1000 or something like that, but need to figure out a performant way to encode/decode them

dominictarr commented 4 years ago

integers are easy, just use varints. all the lengths use varints too, so use the same varint. reading UInt32LE is fast, but since all the fields have a length, a number already uses a varint. considering that, I don't know why I didn't just use varint in the first place.
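To make the size argument concrete, here is a minimal sketch of an unsigned varint (7 bits per byte, least significant group first, high bit set when more bytes follow). The names `encodeVarint` and `decodeVarint` are hypothetical helpers for illustration and are not bipf's API; negative integers would need an extra step such as zigzag mapping, which is left out here.

```javascript
// Hypothetical helpers for illustration only; not part of bipf.
// Encodes a non-negative safe integer as a little-endian base-128 varint.
function encodeVarint(n) {
  if (!Number.isSafeInteger(n) || n < 0) {
    throw new RangeError('expected a non-negative safe integer')
  }
  const bytes = []
  while (n >= 0x80) {
    bytes.push((n % 0x80) | 0x80) // low 7 bits, continuation bit set
    n = Math.floor(n / 0x80)      // arithmetic instead of >> so values above 2^31 work
  }
  bytes.push(n)                   // last byte, continuation bit clear
  return Buffer.from(bytes)
}

// Decodes a varint starting at `offset`; returns the value and bytes consumed.
function decodeVarint(buf, offset = 0) {
  let value = 0
  let scale = 1
  let i = offset
  while (true) {
    if (i >= buf.length) throw new RangeError('truncated varint')
    const byte = buf[i++]
    value += (byte & 0x7f) * scale
    if (byte < 0x80) break
    scale *= 0x80
  }
  return { value, bytesRead: i - offset }
}

encodeVarint(1)   // 1 byte, versus 8 bytes for a double
encodeVarint(300) // <Buffer ac 02>
```

With this scheme any integer below 128 takes a single byte, and anything below 2^14 takes two.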

decimals are more complicated. could do it as two varints, XeY: varint(x) * Math.pow(10, y), but it would probably be better if it was a power of 2.

that would be something like the internal double representation, surely.
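For comparison, the base-10 variant (XeY) could be sketched roughly like this. It leans on `Number.prototype.toExponential()`, which with no argument emits as many digits as are needed to represent the value, and then moves the decimal point out of the mantissa. `toDecimalParts` and `fromDecimalParts` are hypothetical names, the mantissa is a BigInt because 17 significant digits can exceed the safe integer range, and going through strings is probably not the performant approach the thread is looking for. The sign of `-0` is lost.

```javascript
// Hypothetical sketch of the "two integers, base 10" idea; not part of bipf.
// Splits x into an integer mantissa and a base-10 exponent so that
// x === mantissa * 10^exponent.
function toDecimalParts(x) {
  if (!Number.isFinite(x)) throw new RangeError('expected a finite number')
  const [digits, exp] = x.toExponential().split('e') // e.g. "2.5" and "-1"
  const [intPart, fracPart = ''] = digits.split('.')
  return {
    mantissa: BigInt(intPart + fracPart),       // "2" + "5" -> 25n
    exponent: Number(exp) - fracPart.length,    // -1 - 1 -> -2
  }
}

// Rebuilds the number by letting the JS number parser do the rounding.
function fromDecimalParts({ mantissa, exponent }) {
  return Number(`${mantissa}e${exponent}`)
}

toDecimalParts(0.25) // { mantissa: 25n, exponent: -2 }
toDecimalParts(1000) // { mantissa: 1n, exponent: 3 }
```

Both parts are small integers for typical "human" decimals, so each would fit in a one-byte varint once negative values are mapped to unsigned ones.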

dominictarr commented 4 years ago

so, in javascript land, probably the easiest way to get that would be to write to a buffer with writeDoubleLE and then read out the bits that make up the parts of the float.

https://en.wikipedia.org/wiki/Double-precision_floating-point_format

first bit is the sign, then the next 11 bits are the exponent, then the next 52 bits are the fraction. so I think just take the sign and fraction and put that in one varint, then the exponent in the other. in javascript I think it would be easiest to do via the buffer, but when implementing it in other languages it should be possible to reinterpret the bits of the double as a 64 bit integer.
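A rough sketch of that buffer round trip, assuming Node's `Buffer` with BigInt support (`readBigUInt64LE`, Node 12 or later). The field widths are 1 + 11 + 52 = 64 bits. `splitDouble` and `joinDouble` are hypothetical names, and the sign, exponent, and fraction are returned separately rather than packed into varints, since the thread has not settled on a packing.

```javascript
// Hypothetical sketch; not part of bipf.
// Splits a double into its IEEE 754 fields via an 8-byte buffer.
function splitDouble(x) {
  const buf = Buffer.alloc(8)
  buf.writeDoubleLE(x, 0)
  const bits = buf.readBigUInt64LE(0)          // the same 64 bits as an integer
  return {
    sign: Number(bits >> 63n),                 // 1 bit
    exponent: Number((bits >> 52n) & 0x7ffn),  // 11 bits, biased by 1023
    fraction: bits & 0xfffffffffffffn,         // 52 bits, kept as a BigInt
  }
}

// Reassembles the fields into the original double.
function joinDouble({ sign, exponent, fraction }) {
  const bits = (BigInt(sign) << 63n) | (BigInt(exponent) << 52n) | fraction
  const buf = Buffer.alloc(8)
  buf.writeBigUInt64LE(bits, 0)
  return buf.readDoubleLE(0)
}

splitDouble(1)   // { sign: 0, exponent: 1023, fraction: 0n }
splitDouble(1.5) // { sign: 0, exponent: 1023, fraction: 2251799813685248n } (2^51)
```

One thing the sketch makes visible: the fraction is left-aligned, so a short value like 1.5 sets the top fraction bit and produces a large integer. Stripping trailing zero bits before writing the varint would likely be needed for this to actually save space.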

dominictarr commented 4 years ago

btw, definitely use Little Endian and not Big Endian. LE has won on modern hardware, and also in WebAssembly.

dominictarr commented 4 years ago

that's the most obvious way, but there are probably more compact or faster ways