prototype normalized value encoding for exponent length zero

background

the VF128 header byte supports exponent lengths of 0-15 bytes and mantissa lengths of 0-3 bytes. this property has been used to optimize the storage of specific number forms. powers of two with large exponents can be encoded without a mantissa, because a mantissa of one is implied when no mantissa is present. the prototype also allows succinct encoding of integers by calculating the exponent when it is not present: the exponent is set to the number of bits following the leading one. the downside of the latter choice is that normal values with a zero exponent need an additional byte to encode the zero exponent explicitly.

integers are currently allowed to omit their exponent.
values with zero exponent but full entropy single-precision mantissa take 5 bytes of encoding space.
values with zero exponent but full entropy double-precision mantissa take 9 or 10 bytes of encoding space.
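
the integer rule described in the background can be illustrated with a minimal sketch. this is not the prototype's code; the function names `implicit_integer_exponent` and `split_integer` are hypothetical, and the sketch only shows the arithmetic of "exponent equals the number of bits following the leading one" for a positive integer magnitude.

```python
def implicit_integer_exponent(n: int) -> int:
    """Exponent implied for a positive integer: the number of bits after the leading one."""
    if n <= 0:
        raise ValueError("expects a positive integer magnitude")
    return n.bit_length() - 1


def split_integer(n: int) -> tuple[int, int]:
    """Return (exponent, trailing_bits) so that n == (1 << exponent) + trailing_bits."""
    e = implicit_integer_exponent(n)
    # Drop the leading one; what remains are the bits a mantissa would have to carry.
    return e, n - (1 << e)


if __name__ == "__main__":
    for n in (1, 5, 8, 1000):
        print(n, split_integer(n))
```

under this rule the value 1 has an implied exponent of zero, while a fractional value such as 1.5 also has a zero exponent but cannot use the integer shortcut, which is why it currently needs the explicit exponent byte.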

proposal

after some reflection, it seems a better behavior would be to make a zero-length exponent, aka implicit exponent, simply encode a zero exponent. this by itself would save one byte of encoding space for numbers of the form -1.xxxxx...e0 and 1.xxxxx...e0. with this revised behavior, many single-precision values would encode in 4 bytes instead of 5, and many double-precision values in 8 bytes instead of 9. given VF128 is a floating-point format, it seems this would be a better choice. with a tiny tweak to these rules, we could also omit the implicit leading one, allowing us to efficiently encode -0.99999... through 0.99999... this would mean normalized device coordinates would receive the encoding space efficiency boost, instead of integers.

change zero-length exponent to mean zero exponent and implicit leading zero for the mantissa.
single-precision normalized device coordinates would encode in 4 bytes.
double-precision normalized device coordinates would encode in 8 bytes.
integers will take an additional byte.

given VF128 is a floating-point format, we think giving the savings to normalized device coordinates is a better choice.

reasoning about this change further: because of the implicit leading one, a zero-length exponent would not correspond to a zero exponent within the IEEE 754 domain. a transformation similar to the integer transform, i.e. realignment, could be applied. instead of checking that there is no fraction, the rules could check for fractions between 0 and 1 exclusive that fit in the 23 significant bit range, and renormalize them on expansion. the range -0.99999... through 0.99999..., excluding +/-0, is preferable to +/-1.xxxxxx because the goal is to compress normalized device coordinates, which would be more frequent. the logic would be similar to subnormal handling, only with a zero exponent when normalized to the range of a fixed-point 23-bit fraction, as opposed to the minimum exponent used for IEEE 754 subnormals.

michaeljclark / vf128

prototype normalized value encoding for exponent length zero #2