drom / LEB128

Little Endian Base 128 converters
MIT License

Length of 64 bits values #7

Open piranna opened 7 years ago

piranna commented 7 years ago

I have been doing some math, and it seems to me that 64-bit numbers will need 10 bytes to represent them instead of the 9 we were talking about...

bytes = ceil(bits / 7)

  7 bits / 7 =  1    byte
  8 bits / 7 =  1.14 bytes -> 2 (+1 byte)
 16 bits / 7 =  2.29 bytes -> 3
 24 bits / 7 =  3.43 bytes -> 4
 32 bits / 7 =  4.57 bytes -> 5
 40 bits / 7 =  5.71 bytes -> 6
 48 bits / 7 =  6.86 bytes -> 7
 56 bits / 7 =  8    bytes
 64 bits / 7 =  9.14 bytes -> 10 bytes (+2)
...
112 bits (14 bytes) / 7 = 16    bytes
120 bits (15 bytes) / 7 = 17.14 bytes -> 18 bytes (+3)
128 bits (16 bytes) / 7 = 18.29 bytes -> 19 bytes

And so on... I think the 64-bit version is wrong because of that. What do you think?
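The table above can be reproduced with a few lines of Python. This is only a sketch of the `ceil(bits / 7)` rule from this comment; the function name `leb128_max_bytes` is made up for illustration and is not part of this repository.

```python
def leb128_max_bytes(bits: int) -> int:
    """Upper bound on the LEB128 length of a `bits`-wide value.

    Every LEB128 byte carries 7 payload bits, so the bound is
    ceil(bits / 7), computed here with integer arithmetic.
    """
    return (bits + 6) // 7


# Print the same rows as the table: raw width in bits, LEB128 length in bytes.
for bits in (7, 8, 16, 24, 32, 40, 48, 56, 64, 112, 120, 128):
    print(f"{bits:3d} bits -> {leb128_max_bytes(bits):2d} bytes")
```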

drom commented 7 years ago

@piranna yes, you're right. i64 and u64 may occupy 8 bytes + 1 bit (an extra byte). It also means that the 7 MSBs in that 9th byte should be 0, or it is an error.

piranna commented 7 years ago

It also means that the 7 MSBs in that 9th byte should be 0, or it is an error.

What would be better, an error flag or a "need more data" one? I think they would be the same in that case...

drom commented 7 years ago

for packed u64 or i64:

abcdefg0 1000000 1000000 1000000 1000000 1000000 1000000 1000000 1000000

or for u32 and i32:

abc0000 1000000 1000000 1000000 1000000

for i8 or u8

abcdefg0 1000000

any non-0 bit in [a-g] position is an error. I think.
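One way to phrase that check in code, as a rough sketch only: it assumes the `ceil(bits / 7)` byte count worked out earlier in the thread (so a 64-bit field ends in a 10th byte holding a single payload bit), and the helper name `final_byte_is_valid` is hypothetical.

```python
def final_byte_is_valid(last_byte: int, bits: int) -> bool:
    """Check the last permitted byte of an unsigned LEB128 field.

    `bits` is the width of the expected type (8, 32, 64, ...).
    The byte is valid when its continuation bit is clear and every
    payload bit beyond the type's width is zero.
    """
    max_bytes = (bits + 6) // 7           # ceil(bits / 7)
    used = bits - 7 * (max_bytes - 1)     # payload bits still belonging to the value
    allowed = (1 << used) - 1             # mask of those payload bits
    continuation_clear = (last_byte & 0x80) == 0
    padding_clear = (last_byte & 0x7F & ~allowed) == 0
    return continuation_clear and padding_clear
```

For a u32 field this leaves 4 usable bits in the 5th byte, so `0x0F` passes and `0x10` does not.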

piranna commented 7 years ago

What are you talking about? According to the spec, every byte must have its highest bit set to 1 except the most significant byte (the last one in the stream), where it must be 0 to indicate that no more bytes follow, so LEB128 can encode arbitrarily long numbers.
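The continuation-bit rule described here looks roughly like this for unsigned values. It is a minimal sketch, not this repository's implementation, and `encode_uleb128` is a name chosen for the example.

```python
def encode_uleb128(value: int) -> bytes:
    """Encode a non-negative integer of any size as unsigned LEB128."""
    if value < 0:
        raise ValueError("unsigned LEB128 cannot encode negative values")
    out = bytearray()
    while True:
        byte = value & 0x7F          # low 7 payload bits go out first
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow: set the high bit
        else:
            out.append(byte)         # last byte: high bit stays 0
            return bytes(out)
```

Because Python integers are unbounded, the same loop handles values wider than 64 bits, which matches the point that the format itself has no length limit.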

drom commented 7 years ago

yes, but when you are unpacking a number, and you know that at a certain position in the stream you expect a certain type of field, it is reasonable to expect that particular part of the stream to obey additional rules.

at least the [a] position in the last possible byte HAS to be 0

piranna commented 7 years ago

I think we could do the check and set the error and need-more-data status on a status line, but then counting the byte length would be more complicated because we would not know how many bytes long the input data should be, so it would probably be a good idea to make that code an independent module. How do you see it?
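To make the two statuses concrete, here is a software sketch of a bounded unsigned decoder that reports them separately along with the number of bytes consumed. The `Status` enum, the `decode_uleb128` name and the tuple return shape are all assumptions for illustration, not an existing interface of this project.

```python
from enum import Enum


class Status(Enum):
    OK = "ok"
    NEED_MORE_DATA = "need_more_data"
    ERROR = "error"


def decode_uleb128(data: bytes, bits: int = 64):
    """Return (status, value, bytes_consumed) for an unsigned field of width `bits`."""
    max_bytes = (bits + 6) // 7               # ceil(bits / 7)
    value = 0
    for i, byte in enumerate(data[:max_bytes]):
        value |= (byte & 0x7F) << (7 * i)
        if (byte & 0x80) == 0:                # final byte of this number
            if value >> bits:                 # payload bits beyond the type width
                return Status.ERROR, None, i + 1
            return Status.OK, value, i + 1
    if len(data) >= max_bytes:
        # The longest legal encoding was read and it still asks for more bytes.
        return Status.ERROR, None, max_bytes
    # Input ended while the continuation bit was still set.
    return Status.NEED_MORE_DATA, None, len(data)
```

In this sketch the two conditions stay distinct: a short buffer can be retried once more bytes arrive, while an over-long or over-wide encoding can never become valid.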

piranna commented 7 years ago

I see. OK, according to the spec, those bits are sign extension, so zeros for positive and unsigned values, and ones for negative values.

On 16/3/2017 1:34 AM, "Aliaksei Chapyzhenka" notifications@github.com wrote:

yes, but when you are unpacking a number, and you know that at a certain position in the stream you expect a certain type of field, it is reasonable to expect that particular part of the stream to obey additional rules.

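The sign-extension rule from the last comment can be checked in a bounded signed decoder. The following is a sketch under the same `ceil(bits / 7)` assumption as earlier in the thread; `decode_sleb128` is a hypothetical name, and raising `ValueError` is just one way to report the error.

```python
def decode_sleb128(data: bytes, bits: int = 64) -> int:
    """Decode one signed LEB128 value that must fit in a `bits`-wide integer."""
    max_bytes = (bits + 6) // 7                # ceil(bits / 7)
    result = 0
    shift = 0
    for i, byte in enumerate(data):
        if i >= max_bytes:
            raise ValueError("encoding is longer than the type allows")
        result |= (byte & 0x7F) << shift
        shift += 7
        if (byte & 0x80) == 0:                 # final byte of this number
            if byte & 0x40:                    # top payload bit is the sign
                result -= 1 << shift           # sign-extend
            # If the padding bits of the last byte are not a sign extension,
            # the decoded value falls outside the signed range of the type.
            if not -(1 << (bits - 1)) <= result < (1 << (bits - 1)):
                raise ValueError("padding bits are not a sign extension")
            return result
    raise ValueError("truncated input")
```

For i64 this means the 10th byte can only be `0x00` (non-negative values) or `0x7F` (negative values); anything else is rejected by the range check.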