Closed olt closed 11 years ago
Well, the first one will need three because you need one byte to give the new size. But the coordinates after that uses only 2 bytes until next change.
That link was very interesting about zig zag encoding. But if I understand right you sacrify one bit to tell there is a next byte. Is that correct? Then one byte will only give you the range from -64 to 63. Or do I miss the point?
Ok, I missed that the size is for all following coordinates.
For signed integers it is right that on byte can store from -63 to 64 (IIRC). But you always store numbers in the smallest possible way and do not need to change the sizes back and forth.
Yes, you are right. Very interesting. Here I think it comes down to statistics about real world data. What I have noticed is that it is quite big differences in overview maps and more detailed maps. More detailed maps often have only a few meters between the points and hen almost all vertex points is stored as 1 byte even with a precision of 1 decimal. With 1 decimal it can hold values from -12.7 to + 12.7.
Yeah, only a comparison with real world data will show which method is better. But varints could be used for all numbers (IDs and also first coordinates) and should benefit for small geometries.
Yes, that is an important point about ID and first coordinate.
Another question is what is fastest to parse in javascript, since that is another bottle neck.
hmm, it just struck me. With this varint, you will sacrifice one bit per byte, not only for the whole value. so 16 bits will only give -8191 to +8191 instead of -32767 to +32767 with INT16
I have added varInt as mthod 1 in the spec and very soon in the PostGIS and javascript implementations.
It looks very promising :-)
VarInt is now the one and only method in TWKB spec and PostGIS implementation
The 0.01 draft needs at least three bytes for all values above 128. Varint encoding as used by protobuf only uses two bytes for values between 128 and 16384.
https://developers.google.com/protocol-buffers/docs/encoding