snek_is_float returns false for e.g. -1.701412e+38

fanoush commented 5 years ago

not sure if it is a problem but this method https://github.com/keith-packard/snek/blob/a23f6dda65bd79b74edbed6272ab32406ee5f810/snek-poly.c#L44 does not handle correctly some valid float numbers

snek_poly_t v;
v.f=-1.701412e+38;
printf("%08x=%e %s\n",v.u,v.f,snek_is_float(v)?"true":"false");

prints ff000001=-1.701412e+38 false

I guess the mask should be 0x7f800000 as exponent+sign is 9 bits?

however when I tried assign and evaluate this in snek

x=-1.701412e+38
y=x

it worked fine

fanoush commented 5 years ago

also the value 7f800000 is printed by printf as 'inf' yet it returns false (even with the mask fixed as (v.u & 0x7f800000) != 0x7f800000 )

v.u=0x7F800000; // infinity
printf("%08x=%e %s\n",v.u,v.f,snek_is_float(v)?"true":"false");

7f800000=inf false is infinity valid number and should return true ? while 7f800001 is not valid number (printf prints 'nan') and this unlike infinity should return false?

fanoush commented 5 years ago

Anyway as the code is now this

    v.f=-1.701412e+38;
    printf("%08x=%e %s\n",v.u,v.f,snek_is_float(v)?"true":"false");
    v.u=0x7F800000;
    printf("%08x=%e %s\n",v.u,v.f,snek_is_float(v)?"true":"false");
    v.u=0x7F800001;
    printf("%08x=%e %s\n",v.u,v.f,snek_is_float(v)?"true":"false");
    v.f=sqrtf(-1);
    printf("%08x=%e %s\n",v.u,v.f,snek_is_float(v)?"true":"false");
    v.f=-sqrtf(-1);
    printf("%08x=%e %s\n",v.u,v.f,snek_is_float(v)?"true":"false");

prints

ff000001=-1.701412e+38 false 7f800000=inf true 7f800001=nan true ffc00000=-nan false 7fc00000=nan true

while it should perhaps be ff000001=-1.701412e+38 true 7f800000=inf true 7f800001=nan false ffc00000=-nan false 7fc00000=nan false

keith-packard commented 5 years ago

Oh, I just mis-understood IEEE 754. An IEEE float has three parts -- sign bit, exponent and mantissa. The sign bit is first, then the exponent then the mantissa. For a 32-bit value, there's 1 sign bit, 8 exponent bits and 23 mantissa bits. I thought the sign bit was between the exponent and the mantissa.

Infinities have all exponent bits set to 1 and the mantissa set to zero:

+inf = 0x7f800000 -inf = 0xff800000

NaNs have all exponent bits set to 1 and the mantissa set to non-zero, so anything other than +/- inf. The current snek code checks the sign bit and the top 7 exponent bits because I didn't understand the format. I think the right fix is to set both the sign bit and all 8 exponent bits to 1 when constructing a snek_poly_t from an offset. That reduces the space of snek offsets from 24 to 23 bits, which should be fine. If we want to get that 24th bit back, we can use the sign bit too. This has the added advantage that the only value with all top 9 bits set which is not an offset is -inf, which saves a compare.

I think this patch fixes the problem?

diff --git a/snek-poly.c b/snek-poly.c
index 81adb41..c1e083e 100644
--- a/snek-poly.c
+++ b/snek-poly.c
@@ -41,7 +41,7 @@ snek_float_to_poly(float f)
 static bool
 snek_is_float(snek_poly_t v)
 {
-       if ((v.u & 0xff000000) != 0xff000000 || v.u == SNEK_NAN_U || v.u == SNEK_NINF_U)
+       if ((v.u & 0xff800000) != 0xff800000 || v.u == SNEK_NINF_U)
                return true;
        return false;
 }
diff --git a/snek.h b/snek.h
index 95db108..6a0273f 100644
--- a/snek.h
+++ b/snek.h
@@ -752,7 +752,7 @@ snek_slice_identity(snek_slice_t *slice)
 static inline snek_poly_t
 snek_offset_to_poly(snek_offset_t offset, snek_type_t type)
 {
-       return (snek_poly_t) { .u = 0xff000000 | offset | type };
+       return (snek_poly_t) { .u = 0xff800000 | offset | type };
 }

 static inline snek_offset_t

keith-packard commented 5 years ago

I've cleaned up the patch to add comments, add a test and use symbolic constants instead of bare values. Thanks much for your tests; I've borrowed the values you used in the test case.

fanoush commented 5 years ago

Well, I am mostly concerned about this usage https://github.com/keith-packard/snek/blob/master/snek-poly.c#L50 From that code I assume that it is important for snek_is_float to always correctlly determine if the value is number or other type. So any floating point operation should not cause result that could be suddenly mismatched for different type. That's why I am not sure what it should return e.g. for infinity. If it should be true then "if ((v.u & 0xff800000) != 0xff800000" is not enough to distinct between 7f800000=inf true 7f800001=nan false because currently with infinity the snek_poly_type would mismatch it for list https://github.com/keith-packard/snek/blob/master/snek.h#L164

However even if infinity is fixed by returning true it can still happen with NaN values like the sqrtf(-1) giving 7fc00000 unless you fix each FP operation resulting in 'random' NaN value to some known NaN value that will not confuse snek_poly_type.

So I am really not sure what snek_is_float is supposed to return in such cases (+/-inf, +/-NaN).

fanoush commented 5 years ago

Oh, disregard this. I guess you handle both +-inf corectly and also the issue with NaN is already handled here https://github.com/keith-packard/snek/blob/master/snek-poly.c#L36

keith-packard commented 5 years ago

Thanks for verifying that it should work as expected. I'll leave this issue closed unless you find more problems.

fanoush commented 5 years ago

That reduces the space of snek offsets from 24 to 23 bits, which should be fine.

Just and idea, is zero offset a valid one? What happens when you store list type (lowest 2 bits =0) as first element at offset zero? Wouldn't this be encoded as -inf?

If we want to get that 24th bit back, we can use the sign bit too.

This belong more to the issue #5 So I will continue there, sorry for the noise.

keith-packard / snek

snek_is_float returns false for e.g. -1.701412e+38 #7