Bug in `pack_float16()`

oyvindronningstad commented 1 year ago

I did some cursory testing of your float16 functions (out of curiosity, since I recently wrote my own conversion functions float16 <-> float32).

It seems your pack_float16() doesn't give the same answer as mine for some values. For small numbers, it could maybe be attributed to different rounding rules, but for high numbers it does seem wrong, for example for the float32 number represented in raw hex as 0x47000000, i.e. 32768.0, which has an exact representation in float16: 0x7800, but is rounded to infinity by pack_float16().

I did read the document you referenced in half_float_tables.py, and it does not specify any rounding mode, so I don't know how it handles that. My code is written to do "Round to nearest, ties to even", which I tested against numpy.

I looked at the code and realized that pack_float16() being wrong is not critical, I just wanted to let you know.

If you're curious, my code and tests can be found here: https://github.com/NordicSemiconductor/zcbor/pull/285

dolfandringa commented 1 year ago

Debugging here. Python's struct.(un)pack can handle the value correctly:

>> struct.unpack('e', bytearray([0x00, 0x78]))
(32768.0,)

>>> struct.pack('e', 32768.0).hex()
'0078'

It is definitely an issue with the c implementation. Encoding with the pure python encoder (cbor2.encoder.dumps) vs the c encoder (when supported) (dumps) and canonical=True on both (aka, the smallest possible size) results in different representations.

from cbor2 import dump, encoder
>>> dumps(32768.0, canonical=True).hex()
'fa47000000'
>>> encoder.dumps(32768.0, canonical=True).hex()
'f97800'
>>>

So it's probably somewhere here: https://github.com/agronholm/cbor2/blob/cabfde513beceeb8ee16db6e93c3393ee1932233/source/encoder.c#L1689

dolfandringa commented 1 year ago

Just commenting for future me (or anyone else interested in debugging):

I started the debugger to figure out where its going wrong, and I'll try to find time to continue that. But for anyone that is interested, this is basically how to get started with debugging: https://llllllllll.github.io/c-extension-tutorial/gdb.html Using this snippet will bring you to the line mentioned above.

import _cbor2
_cbor2.dumps(32768.0, canonical=True)

Bringing it down to unpack_float16, which encodes 124 as infinite:

p pack_float16(32768.0)
$1 = 124
p unpack_float16(124)
$2 = inf

Check: https://evanw.github.io/float-toy/ (thanks @Sekenre ). Bit 11 is wrong in the unpack_float16

agronholm / cbor2

Bug in `pack_float16()` #157