riscvarchive / riscv-qemu

QEMU with RISC-V (RV64G, RV32G) Emulation Support
385 stars 154 forks source link

Floating-point exception flags not set correctly? #64

Closed aswaterman closed 6 years ago

aswaterman commented 7 years ago

This gfortran test in the GCC test suite fails on QEMU but passes on Spike: https://github.com/riscv/riscv-gcc/blob/riscv-gcc-7/gcc/testsuite/gfortran.dg/ieee/ieee_1.F90#L58

It seems likely that it's a problem with setting the floating-point exception flags in fflags/fcsr, because the test is expecting the Overflow/Inexact to be set, but the program reports that no flags are set.

bkoppelmann commented 7 years ago

Do you know the offending instruction from the top of your head?

aswaterman commented 7 years ago

Sorry, I didn't have time to look into it to that extent. Looking at the code, it appears to be multiplying two large numbers to generate the overflow flag, so probably fmul.d?

jim-wilson commented 6 years ago

The problem here is that more than one exception flag can be set at a time, but softfloat_flags_to_riscv() in target-riscv/fpu-helper.c works correctly only when one flag is set at a time. If I add case float_flag_overflow | float_flag_inexact: return 5; I can make some gcc testsuite failures with FP testcases go away. This may not be the best approach though, as then we have to handle every possible combination of exception flags, and it might take some time to discover them all. We could start by making the default case call abort, and then start adding missing case values one at a time as we find qemu aborts. An alternative might be to just test each float_flag_X value one at a time, and accumulate the result in a local variable. This would be slower than a switch statement though, which may be undesirable.

michaeljclark commented 6 years ago

Do we want to do something like this?

diff --git a/target-riscv/fpu_helper.c b/target-riscv/fpu_helper.c
index b4eff27..04d889f 100644
--- a/target-riscv/fpu_helper.c
+++ b/target-riscv/fpu_helper.c
@@ -53,22 +53,15 @@ ieee_rm[rm]; })
 #endif

 /* convert softfloat library flag numbers to RISC-V */
-unsigned int softfloat_flags_to_riscv(unsigned int flag)
-{
-    switch (flag) {
-    case float_flag_inexact:
-        return 1;
-    case float_flag_underflow:
-        return 2;
-    case float_flag_overflow:
-        return 4;
-    case float_flag_divbyzero:
-        return 8;
-    case float_flag_invalid:
-        return 16;
-    default:
-        return 0;
-    }
+unsigned int softfloat_flags_to_riscv(unsigned int flags)
+{
+    int rv_flags = 0;
+    if ( flags & float_flag_inexact   ) rv_flags |= 1;
+    if ( flags & float_flag_underflow ) rv_flags |= 2;
+    if ( flags & float_flag_overflow  ) rv_flags |= 4;
+    if ( flags & float_flag_divbyzero ) rv_flags |= 8;
+    if ( flags & float_flag_invalid   ) rv_flags |= 16;
+    return rv_flags;
 }

 /* adapted from Spike's decode.h:set_fp_exceptions */
michaeljclark commented 6 years ago

@jim-wilson I think accuracy is more important than performance. It's likely noise compared to the speed of the softfloat routines.

Alternatively, we could make a 256 entry table, which would require one indexed load. i.e. loop through from 0 to 255 and emit an array. Here's the softfloat flags:

enum {
    float_flag_invalid   =  1,
    float_flag_divbyzero =  4,
    float_flag_overflow  =  8,
    float_flag_underflow = 16,
    float_flag_inexact   = 32,
    float_flag_input_denormal = 64,
    float_flag_output_denormal = 128
};

If you try the patch, let me know and I'll make a PR.

jim-wilson commented 6 years ago

Only 5 of the flags are being used, so there are only 31 non-zero combinations. We could just list them all in the case statement, but that is not very elegant, and not very maintainable. I'm not sure that it is safe to make assumptions about the values of the float_flag_X enums. A 256 entry table is safe only if the values will never increase. I'm fine with having 5 if statements. I will try a gcc testsuite run with that patch.

michaeljclark commented 6 years ago

Yep. A switch statement would probably end up as a 64 entry jump table (given bits 0,2,3,4,5 are used) so would probably be a bit faster. The 5 statement version is easiest to read and maintain.

This version generates slightly nicer code than the 5 if statements (no branches or cmovs): https://cx.rv8.io/g/s8QrXx

unsigned int softfloat_flags_to_riscv(unsigned int flags)
{
    int rv_flags = 0;
    rv_flags |= ( flags & float_flag_inexact   ) ? 1 : 0;
    rv_flags |= ( flags & float_flag_underflow ) ? 2 : 0;
    rv_flags |= ( flags & float_flag_overflow  ) ? 4 : 0;
    rv_flags |= ( flags & float_flag_divbyzero ) ? 8 : 0;
    rv_flags |= ( flags & float_flag_invalid   ) ? 16 : 0;
    return rv_flags;
}

BTW we could use the constants in the table using the array index syntax. This syntax is used quite heavily inside of QEMU. This compiles and we also don't specify the array size i.e. it is deduced, and i've masked the table lookup (which will constant fold) to only use our 5 flags:

static int softfloat_exception_map[] = {
    [float_flag_inexact] = 1,
    [float_flag_underflow] = 2,
    [float_flag_inexact|float_flag_underflow] = 3,
    [float_flag_overflow] = 4,
    [float_flag_inexact|float_flag_overflow] = 5,
    [float_flag_underflow|float_flag_overflow] = 6,
    [float_flag_inexact|float_flag_underflow|float_flag_overflow] = 7,
    [float_flag_divbyzero] = 8,
    /* on... */
    [float_flag_invalid] = 16,
    /* and on... */
};

/* convert softfloat library flag numbers to RISC-V */
unsigned int softfloat_flags_to_riscv(unsigned int flags)
{
    return softfloat_exception_map[flags & (float_flag_inexact|float_flag_underflow|
        float_flag_overflow|float_flag_inexact|float_flag_divbyzero|float_flag_invalid)];
}
michaeljclark commented 6 years ago

ends up as this:

        and     edi, 61
        lea     rax, [rip + _softfloat_exception_map]
        mov     eax, dword ptr [rax + 4*rdi]
jim-wilson commented 6 years ago

With the 5 if-statment version of the patch, running the gcc testsuites for a linux rv64gc/lp64d target, the gcc unexpected failures drop from 14 to 3, and the gfortran unexpected failures drop from 32 to 20. There is no change to the g++ results.

The other suggested solutions are equivalent, and I would expect the same results.

michaeljclark commented 6 years ago

I've gone with the 5 statement version.

It's on the riscv-next branch: https://github.com/riscv/riscv-qemu/tree/riscv-next