Closed aswaterman closed 6 years ago
Do you know the offending instruction from the top of your head?
Sorry, I didn't have time to look into it to that extent. Looking at the code, it appears to be multiplying two large numbers to generate the overflow flag, so probably fmul.d?
The problem here is that more than one exception flag can be set at a time, but softfloat_flags_to_riscv() in target-riscv/fpu-helper.c works correctly only when one flag is set at a time. If I add case float_flag_overflow | float_flag_inexact: return 5; I can make some gcc testsuite failures with FP testcases go away. This may not be the best approach though, as then we have to handle every possible combination of exception flags, and it might take some time to discover them all. We could start by making the default case call abort, and then start adding missing case values one at a time as we find qemu aborts. An alternative might be to just test each float_flag_X value one at a time, and accumulate the result in a local variable. This would be slower than a switch statement though, which may be undesirable.
Do we want to do something like this?
diff --git a/target-riscv/fpu_helper.c b/target-riscv/fpu_helper.c
index b4eff27..04d889f 100644
--- a/target-riscv/fpu_helper.c
+++ b/target-riscv/fpu_helper.c
@@ -53,22 +53,15 @@ ieee_rm[rm]; })
#endif
/* convert softfloat library flag numbers to RISC-V */
-unsigned int softfloat_flags_to_riscv(unsigned int flag)
-{
- switch (flag) {
- case float_flag_inexact:
- return 1;
- case float_flag_underflow:
- return 2;
- case float_flag_overflow:
- return 4;
- case float_flag_divbyzero:
- return 8;
- case float_flag_invalid:
- return 16;
- default:
- return 0;
- }
+unsigned int softfloat_flags_to_riscv(unsigned int flags)
+{
+ int rv_flags = 0;
+ if ( flags & float_flag_inexact ) rv_flags |= 1;
+ if ( flags & float_flag_underflow ) rv_flags |= 2;
+ if ( flags & float_flag_overflow ) rv_flags |= 4;
+ if ( flags & float_flag_divbyzero ) rv_flags |= 8;
+ if ( flags & float_flag_invalid ) rv_flags |= 16;
+ return rv_flags;
}
/* adapted from Spike's decode.h:set_fp_exceptions */
@jim-wilson I think accuracy is more important than performance. It's likely noise compared to the speed of the softfloat routines.
Alternatively, we could make a 256 entry table, which would require one indexed load. i.e. loop through from 0 to 255 and emit an array. Here's the softfloat flags:
enum {
float_flag_invalid = 1,
float_flag_divbyzero = 4,
float_flag_overflow = 8,
float_flag_underflow = 16,
float_flag_inexact = 32,
float_flag_input_denormal = 64,
float_flag_output_denormal = 128
};
If you try the patch, let me know and I'll make a PR.
Only 5 of the flags are being used, so there are only 31 non-zero combinations. We could just list them all in the case statement, but that is not very elegant, and not very maintainable. I'm not sure that it is safe to make assumptions about the values of the float_flag_X enums. A 256 entry table is safe only if the values will never increase. I'm fine with having 5 if statements. I will try a gcc testsuite run with that patch.
Yep. A switch statement would probably end up as a 64 entry jump table (given bits 0,2,3,4,5 are used) so would probably be a bit faster. The 5 statement version is easiest to read and maintain.
This version generates slightly nicer code than the 5 if statements (no branches or cmovs): https://cx.rv8.io/g/s8QrXx
unsigned int softfloat_flags_to_riscv(unsigned int flags)
{
int rv_flags = 0;
rv_flags |= ( flags & float_flag_inexact ) ? 1 : 0;
rv_flags |= ( flags & float_flag_underflow ) ? 2 : 0;
rv_flags |= ( flags & float_flag_overflow ) ? 4 : 0;
rv_flags |= ( flags & float_flag_divbyzero ) ? 8 : 0;
rv_flags |= ( flags & float_flag_invalid ) ? 16 : 0;
return rv_flags;
}
BTW we could use the constants in the table using the array index syntax. This syntax is used quite heavily inside of QEMU. This compiles and we also don't specify the array size i.e. it is deduced, and i've masked the table lookup (which will constant fold) to only use our 5 flags:
static int softfloat_exception_map[] = {
[float_flag_inexact] = 1,
[float_flag_underflow] = 2,
[float_flag_inexact|float_flag_underflow] = 3,
[float_flag_overflow] = 4,
[float_flag_inexact|float_flag_overflow] = 5,
[float_flag_underflow|float_flag_overflow] = 6,
[float_flag_inexact|float_flag_underflow|float_flag_overflow] = 7,
[float_flag_divbyzero] = 8,
/* on... */
[float_flag_invalid] = 16,
/* and on... */
};
/* convert softfloat library flag numbers to RISC-V */
unsigned int softfloat_flags_to_riscv(unsigned int flags)
{
return softfloat_exception_map[flags & (float_flag_inexact|float_flag_underflow|
float_flag_overflow|float_flag_inexact|float_flag_divbyzero|float_flag_invalid)];
}
ends up as this:
and edi, 61
lea rax, [rip + _softfloat_exception_map]
mov eax, dword ptr [rax + 4*rdi]
With the 5 if-statment version of the patch, running the gcc testsuites for a linux rv64gc/lp64d target, the gcc unexpected failures drop from 14 to 3, and the gfortran unexpected failures drop from 32 to 20. There is no change to the g++ results.
The other suggested solutions are equivalent, and I would expect the same results.
I've gone with the 5 statement version.
It's on the riscv-next
branch: https://github.com/riscv/riscv-qemu/tree/riscv-next
This gfortran test in the GCC test suite fails on QEMU but passes on Spike: https://github.com/riscv/riscv-gcc/blob/riscv-gcc-7/gcc/testsuite/gfortran.dg/ieee/ieee_1.F90#L58
It seems likely that it's a problem with setting the floating-point exception flags in
fflags
/fcsr
, because the test is expecting the Overflow/Inexact to be set, but the program reports that no flags are set.