Open Parakleta opened 8 years ago
I'm not so good at writing Rust yet, but here's another option from Lua which doesn't use ldexp
.
local function read_half_float(u16val)
local neg = u16val & 0x8000
local val = u16val & 0x7fff
if val == 0 then return neg == 0 and 0.0 or -0.0 end -- Shortcut common case
if val < 0x0400 then -- SubNormal
return (neg == 0 and 0x1p-24 or -0x1p-24) * val
end
local single = (neg == 0 and 0x38000000 or 0xb8000000) + (val << 13)
if val >= 0x7C00 then
single += 0x38000000 -- Inf and NaN
end
return ("f"):unpack(("I4"):pack(single))
end
The unpack
and pack
step at the end is basically just a transmute
. The 0x1p-24
I'm not sure how to translate into Rust, but is equivalent to 2.0^-24
.
Here's where the decoding happens: https://github.com/BurntSushi/rust-cbor/blob/master/src/decoder.rs#L151-L157
According to comments, my past self seems to have thought it was wrong too. :-)
I had a go at putting something together, I don't know enough about Rust design patterns to know if it's good or not, but here it is...
let f: f32 = match (n&0x8000 != 0,((n&0x7fff) as u32) << 13) {
(false,0) => 0.0,
(true, 0) => -0.0,
(false,x) if x < 0x00800000 => 7.27595761e-12f32 * (x as f32),
(true, x) if x < 0x00800000 => -7.27595761e-12f32 * (x as f32),
(false,x) if x < 0x0f800000 => unsafe { transmute (x+0x38000000u32) },
(true, x) if x < 0x0f800000 => unsafe { transmute (x+0xb8000000u32) },
(false,x) => unsafe { transmute (x+0x70000000u32) },
(true, x) => unsafe { transmute (x+0xf0000000u32) },
};
I wonder whether transmute is the right idea from a Rust safety point of view. The code I provided is fine so long as you are certain that the floating point is IEEE, but if it is not then probably better to use the example C code and swapping the ldexp(x,y)
calls with (x as f32) * (2.0f32).powi(y)
or similar so that everything is safe and stable. I wonder if the compiler/optimiser is smart enough to turn (x as f32) * (2.0f32).powi(y)
into a bit-twiddle + transmute anyway.
How about optional https://crates.io/crates/half ? Here hoping to see it in CBOR one day.
I have a CBOR encoder/decoder for Lua which uses the canonicalization recommendations from the RFC, specifically that floating point values should be stored in the smallest representation that does not result in loss of information. This provides a decent saving on structures that have a lot of floating point values many of which are just small integers. Unfortunately this fails spectacularly for every value except
0.0
with the f16 decoding behaviour in rust-cbor.An example correct decoding routine is available in Appendix D of the RFC. The C version focuses on breaking down the f16 and building a new f32 float with
ldexp
. The Python version uses bit manipulation to convert the f16 directly into an f32.