Closed GoogleCodeExporter closed 8 years ago
__udivmoddi4 takes the incoming UDWtype and creates UWtype values from the data.
After some further steps it calls __udiv_qrnnd_c, which contains the following
code:
UWtype __d1, __d0, __q1, __q0;
__r1 = (n1) % __d1;
It seems that this mod generates a umoddi3 call, although a 32-bit call (modsi3,
or rather its unsigned counterpart umodsi3) would be the proper one.
UNITS_PER_WORD and MIN_UNITS_PER_WORD, defined in sx.h, play a role in this.
The type definitions documented in longlong.h are important here:
UWtype -- An unsigned type, default type for operations (typically a "word")
UHWtype -- An unsigned type, at least half the size of UWtype.
UDWtype -- An unsigned type, at least twice as large a UWtype
W_TYPE_SIZE -- size in bits of UWtype
UQItype -- Unsigned 8 bit type.
SItype, USItype -- Signed and unsigned 32 bit types.
DItype, UDItype -- Signed and unsigned 64 bit types.
On a 32 bit machine UWtype should typically be USItype;
on a 64 bit machine, UWtype should typically be UDItype.
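To make the relationship between these types concrete, here is a minimal C sketch. It assumes a 32-bit word (UWtype = 32 bits, UDWtype = 64 bits), which is only one of the configurations the comment allows; the typedefs use <stdint.h> instead of GCC mode attributes, and the helper names (udw_high, udw_low, word_bits, word_mod) are made up for illustration and are not part of longlong.h.

```c
#include <limits.h>
#include <stdint.h>

/* Assumption: 32-bit words. These typedefs only stand in for the
   mode-based types that users of longlong.h normally provide. */
typedef uint32_t UWtype;   /* "word" */
typedef uint64_t UDWtype;  /* double word, twice as large as UWtype */
#define W_TYPE_SIZE 32     /* size in bits of UWtype */

/* Hypothetical helper: size of UWtype in bits, to compare with W_TYPE_SIZE. */
unsigned word_bits(void) { return (unsigned)(sizeof(UWtype) * CHAR_BIT); }

/* Hypothetical helpers: split a double word into its two words,
   which is how a double-word divide starts out. */
UWtype udw_high(UDWtype x) { return (UWtype)(x >> W_TYPE_SIZE); }
UWtype udw_low(UDWtype x)  { return (UWtype)x; }

/* Both operands are UWtype, so a word-sized (32-bit) modulo is all
   that is needed here. d1 must be nonzero. */
UWtype word_mod(UWtype n1, UWtype d1) { return n1 % d1; }
```

With these definitions, udw_high(0x0000000100000002) is 1 and udw_low of the same value is 2; word_mod never needs more than 32 bits of precision.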
Original comment by fred.tre...@googlemail.com
on 12 Nov 2008 at 10:40
__udiv_qrnnd_c is a #define macro, and inside it there is a mod on operands of type UWtype.
printf("mod 1 %d %d %d\n",sizeof(__r1),sizeof((n1)),sizeof(__d1));
__r1 = (n1) % __d1;
printf("mod 1 - Done\n");
generates this output :
udiv_qrnnd call
mod 1 4 4 4
umoddi3 ... <- umoddi called instead of umodsi
The expansion of the routine looks like this :
;; D.5259 = n1 % __d1
(insn 203 202 204 (set (reg:DI 341)
(zero_extend:DI (mem/c/i:SI (plus:DI (reg/f:DI 129 virtual-stack-vars)
(const_int 12 [0xc])) [0 n1+0 S4 A32]))) -1 (nil)
(nil))
(insn 204 203 207 (set (reg:DI 342)
(zero_extend:DI (mem/c/i:SI (plus:DI (reg/f:DI 129 virtual-stack-vars)
(const_int 44 [0x2c])) [0 __d1+0 S4 A32]))) -1 (nil)
(nil))
(insn 207 204 205 (set (reg/f:DI 347)
(mem/u/c/i:DI (symbol_ref/u:DI (".LC13") [flags 0x2]) [0 S8 A64])) -1 (nil)
(expr_list:REG_EQUAL (symbol_ref:DI ("__umoddi3") [flags 0x41])
(nil)))
.....
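For reference, the RTL above can be modelled in plain C. This is only a sketch of what the expansion does, not of the libgcc routine itself: both 32-bit operands are zero-extended to 64 bits and the remainder is taken at that width. The function names are made up for illustration. For unsigned operands the result is the same as a native 32-bit modulo, so the detour through __umoddi3 costs time (and, inside libgcc, risks recursion) rather than correctness.

```c
#include <stdint.h>

/* Model of the dump above: zero_extend:DI both SImode operands,
   then take the remainder at 64 bits (the job __umoddi3 does).
   d1 must be nonzero. */
uint32_t mod_via_64bit(uint32_t n1, uint32_t d1)
{
    uint64_t wide_n = n1;              /* zero_extend:DI of n1 */
    uint64_t wide_d = d1;              /* zero_extend:DI of __d1 */
    return (uint32_t)(wide_n % wide_d);
}

/* What a native SImode unsigned modulo would compute. */
uint32_t mod_via_32bit(uint32_t n1, uint32_t d1)
{
    return n1 % d1;
}
```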
I tried implementing umodsi3 in sx.md as:
(define_insn "umodsi3"
[(set (match_operand:SI 0 "register_operand" "=r")
(umod:SI (match_operand:SI 1 "register_operand" "r")
(match_operand:SI 2 "register_operand" "r")))
(clobber (match_scratch:DF 3 "=&r"))
(clobber (match_scratch:DF 4 "=&r"))
(clobber (match_scratch:SI 5 "=&r"))]
""
"flt\\t%3,%1\\n\\
\\tflt\\t%4,%2\\n\\
\\tfdv\\t%3,%3,%4\\n\\
\\tfix\\t%5,%3,1\\n\\
\\tmps\\t%5,%5,%2\\n\\
\\tsbs\\t%0,%1,%5"
[(set_attr "type" "fp")
(set_attr "mode" "SI")
(set_attr "length" "6")])
But that pattern does not get selected when expanding SItype = SItype % SItype.
Still investigating why.
Original comment by fred.tre...@googlemail.com
on 12 Nov 2008 at 1:51
Added divsi3 to sx.md via define_expand, along with the previously missing
modsi3 and the unsigned versions.
They are not mathematically correct at the moment but should work well enough.
gen_fix_truncdfdi2 seems to create problems when compiling some applications,
though not gcc itself. We need to find out which construct fails.
The failure says:
internal compiler error: in trunc_int_for_mode, at explow.c:55
It fails in an assert:
/* You want to truncate to a _what_? */
gcc_assert (SCALAR_INT_MODE_P (mode));
The testcase still does not pass; it fails with an illegal instruction.
But the insertion must have worked somehow, as there is no longer a recursion.
More investigation is needed.
The current change to the md file is: remove divdi3 and add:
(define_expand "divsi3"
[(set (match_operand:SI 0 "register_operand" "=r")
(div:SI (match_operand:SI 1 "register_operand" "r")
(match_operand:SI 2 "register_operand" "r")))]
""
{
rtx op1_df, op2_df, op0_df, op0_di;
op0_df = gen_reg_rtx (DFmode);
op0_di = gen_reg_rtx (DImode);
op1_df = gen_reg_rtx (DFmode);
expand_float (op1_df, operands[1], 0);
op2_df = gen_reg_rtx (DFmode);
expand_float (op2_df, operands[2], 0);
emit_insn (gen_divdf3 (op0_df, op1_df, op2_df));
emit_insn (gen_fix_truncdfdi2 (op0_di, op0_df));
emit_move_insn (operands[0], gen_lowpart (SImode, op0_di));
DONE;
})
(define_expand "udivsi3"
[(set (match_operand:SI 0 "register_operand" "=r")
(udiv:SI (match_operand:SI 1 "register_operand" "r")
(match_operand:SI 2 "register_operand" "r")))]
""
{
emit_insn (gen_divsi3 (operands[0], operands[1], operands[2]));
DONE;
})
(define_expand "modsi3"
[(set (match_operand:SI 0 "register_operand" "=r")
(mod:SI (match_operand:SI 1 "register_operand" "r")
(match_operand:SI 2 "register_operand" "r")))]
""
{
rtx div_floor, prod;
div_floor = gen_reg_rtx (SImode);
prod = gen_reg_rtx (SImode);
emit_insn (gen_divsi3 (div_floor, operands[1], operands[2]));
emit_insn (gen_mulsi3 (prod, operands[2], div_floor));
emit_insn (gen_subsi3 (operands[0], operands[1], prod));
DONE;
})
(define_expand "umodsi3"
[(set (match_operand:SI 0 "register_operand" "=r")
(umod:SI (match_operand:SI 1 "register_operand" "r")
(match_operand:SI 2 "register_operand" "r")))]
""
{
emit_insn (gen_modsi3 (operands[0], operands[1], operands[2]));
DONE;
})
The code probably breaks the build for many tests, so it will not be checked in
until it has been properly adapted and tested.
Ideas are welcome. The divsi3 example comes from ia64.md, which has other
constructs as well. More define_insn patterns might be needed.
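One place where the patterns above are not mathematically correct is udivsi3, which simply forwards to divsi3. As far as I can tell, the final 0 passed to expand_float selects a signed conversion, so unsigned operands of 2^31 or more are treated as negative numbers. The C sketch below models that behaviour under this assumption; the function names are made up for illustration, and b must be nonzero.

```c
#include <stdint.h>

/* Interpret a 32-bit pattern as a signed value, without relying on
   implementation-defined conversions. */
static int64_t as_signed32(uint32_t x)
{
    return x >= 0x80000000u ? (int64_t)x - 4294967296LL : (int64_t)x;
}

/* Model of udivsi3 as written: signed int -> double, divide,
   truncate to a 64-bit integer, keep the low 32 bits. */
uint32_t udiv_via_signed_float(uint32_t a, uint32_t b)
{
    double q = (double)as_signed32(a) / (double)as_signed32(b);
    int64_t t = (int64_t)q;          /* truncates toward zero */
    return (uint32_t)(uint64_t)t;    /* low 32 bits of the result */
}

/* Reference: exact unsigned 32-bit division. */
uint32_t udiv_exact(uint32_t a, uint32_t b)
{
    return a / b;
}
```

For small operands both functions agree (100 / 7 gives 14 either way), but for a = 0x80000000 and b = 3 the signed path yields 3579139414 while the exact quotient is 715827882.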
Original comment by fred.tre...@googlemail.com
on 18 Nov 2008 at 5:56
It should be correct, but we need to check the calculations in more detail.
The fixed-point conversion should actually be a floor, to make sure negative
numbers work as well.
The current md generates the following insns, as opposed to the ones above that
were generated before (in divdi3.o):
;; D.3698 = n1 % __d1
(insn 172 171 173 (set (reg:SI 324)
(mem/c/i:SI (plus:DI (reg/f:DI 129 virtual-stack-vars)
(const_int 12 [0xc])) [0 n1+0 S4 A32])) -1 (nil)
(nil))
(insn 173 172 174 (set (reg:SI 325)
(mem/c/i:SI (plus:DI (reg/f:DI 129 virtual-stack-vars)
(const_int 44 [0x2c])) [0 __d1+0 S4 A32])) -1 (nil)
(nil))
(insn 174 173 175 (set (reg:DF 330)
(float:DF (reg:SI 324))) -1 (nil)
(nil))
(insn 175 174 176 (set (reg:DF 331)
(float:DF (reg:SI 325))) -1 (nil)
(nil))
(insn 176 175 177 (set (reg:DF 328)
(div:DF (reg:DF 330)
(reg:DF 331))) -1 (nil)
(nil))
(insn 177 176 178 (set (reg:DI 329)
(fix:DI (reg:DF 328))) -1 (nil)
(nil))
(insn 178 177 179 (set (reg:SI 326)
(subreg:SI (reg:DI 329) 4)) -1 (nil)
(nil))
(insn 179 178 180 (set (reg:SI 327)
(mult:SI (reg:SI 325)
(reg:SI 326))) -1 (nil)
(nil))
(insn 180 179 0 (set (reg:SI 157 [ D.3698 ])
(minus:SI (reg:SI 324)
(reg:SI 327))) -1 (nil)
(expr_list:REG_EQUAL (umod:SI (reg:SI 324)
(reg:SI 325))
(nil)))
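To check the calculation, the insn sequence above can be written out step by step in C. This is a sketch under two assumptions: that fix:DI truncates toward zero (the pattern used is fix_truncdfdi2), and that the operands are treated as signed, as float:DF suggests. The function name mod_model is made up for illustration, and d1 must be nonzero.

```c
#include <stdint.h>

/* Step-by-step model of insns 174..180. */
int32_t mod_model(int32_t n1, int32_t d1)
{
    double fn = (double)n1;             /* insn 174: float:DF */
    double fd = (double)d1;             /* insn 175: float:DF */
    double fq = fn / fd;                /* insn 176: div:DF */
    int64_t q = (int64_t)fq;            /* insn 177: fix:DI, truncating */
    /* insns 178..179: the hardware multiplies in SImode; the model
       multiplies in 64 bits to stay clear of signed overflow in C. */
    int64_t prod = (int64_t)d1 * q;
    return (int32_t)((int64_t)n1 - prod); /* insn 180: minus:SI */
}
```

For example, mod_model(17, 5) gives 2, and mod_model(-7, 2) gives -1, which matches C's % operator for those inputs.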
Original comment by fred.tre...@googlemail.com
on 18 Nov 2008 at 8:50
With the added sx.md routines for divsi3 and the fixes in FP rounding, the
testcase now passes.
RUNTESTFLAGS="--target_board=sx6i execute.exp=20000402-1.c" make check
=== gcc Summary ===
# of expected passes 12
PASS: gcc.c-torture/execute/20000402-1.c compilation, -O0
PASS: gcc.c-torture/execute/20000402-1.c execution, -O0
PASS: gcc.c-torture/execute/20000402-1.c compilation, -O1
PASS: gcc.c-torture/execute/20000402-1.c execution, -O1
PASS: gcc.c-torture/execute/20000402-1.c compilation, -O2
PASS: gcc.c-torture/execute/20000402-1.c execution, -O2
PASS: gcc.c-torture/execute/20000402-1.c compilation, -O3 -fomit-frame-pointer
PASS: gcc.c-torture/execute/20000402-1.c execution, -O3 -fomit-frame-pointer
PASS: gcc.c-torture/execute/20000402-1.c compilation, -O3 -g
PASS: gcc.c-torture/execute/20000402-1.c execution, -O3 -g
PASS: gcc.c-torture/execute/20000402-1.c compilation, -Os
PASS: gcc.c-torture/execute/20000402-1.c execution, -Os
Will add a FIXME label as a reminder to verify that divsi and modsi behave
mathematically correctly.
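As a note for that FIXME: which rounding is "correct" depends on what the patterns are supposed to compute. C (since C99) defines / as truncating toward zero and % as the matching remainder, whereas a floor-based quotient gives a different remainder for operands of opposite sign. The sketch below only illustrates the two conventions; the function names are made up for illustration, b must be nonzero, and the INT_MIN / -1 case is left out.

```c
#include <stdint.h>

/* Quotient rounded toward zero, as C's / operator does. */
int32_t div_trunc(int32_t a, int32_t b)
{
    return a / b;
}

/* Quotient rounded toward negative infinity. */
int32_t div_floor(int32_t a, int32_t b)
{
    int32_t q = a / b;
    /* Step down by one when the division is inexact and the
       operands have opposite signs. */
    if ((a % b != 0) && ((a < 0) != (b < 0)))
        q -= 1;
    return q;
}

/* Remainder that belongs to a given quotient: a - b * q. */
int32_t mod_from(int32_t a, int32_t b, int32_t q)
{
    return a - b * q;
}
```

For a = -7 and b = 2 the truncating quotient is -3 with remainder -1 (what C's % returns), while the floor quotient is -4 with remainder 1.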
Original comment by fred.tre...@googlemail.com
on 19 Nov 2008 at 12:17
Original issue reported on code.google.com by
fred.tre...@googlemail.com
on 29 Oct 2008 at 4:11