jkmcnk / sx-gcc

The GNU Compiler Collection port to NEC SX CPU architecture.
GNU General Public License v2.0
0 stars 2 forks source link

problem with inlining conditional code #96

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
testcase gcc.c-torture/execute/990326-1.c fails execution with -O3.

call to e4() fails.

from a brief inspection, it seems that the code for e4() is OK, however,
e4() code is inlined and additionally optimized in main(), and in there,
computation of the return value (comparisons + logical operators) is plain
wrong.

Original issue reported on code.google.com by jmoc...@gmail.com on 29 Jan 2009 at 9:34

GoogleCodeExporter commented 8 years ago
this is the minimal test case, makes it easier to inspect rtl and assembly:

----
struct e {
    int c:4, b:14, a:14;
} __attribute__ ((aligned));

int
e4()
{
    static struct e x = { -1, -1, 0 };

    return x.a == 0 && x.b & 0x2000;
}

int
main()
{
  if (!e4 ())
    abort ();
  exit (0);
}
----

Original comment by jmoc...@gmail.com on 30 Jan 2009 at 2:30

GoogleCodeExporter commented 8 years ago
it's the combiner that annoys us; the problematic change occurs when going from
125r.cse2 to 127r.combine, with the original sequence

----
(insn 20 17 21 3 (set (reg:DI 142)
        (ashift:DI (reg:DI 137)
            (const_int 4 [0x4]))) 27 {ashldi3} (nil)
    (nil))

(insn 21 20 22 3 (set (reg:DI 143)
        (ashiftrt:DI (reg:DI 142)
            (const_int 50 [0x32]))) 29 {ashrdi3} (nil)
    (nil))

(insn 22 21 23 3 (set (reg:SI 144)
        (sign_extend:SI (subreg:HI (reg:DI 143) 6))) 60 {extendhisi2} (nil)
    (nil))

(insn 23 22 24 3 (set (reg:SI 146)
        (const_int 8192 [0x2000])) 68 {movsi_general} (nil)
    (nil))

(insn 24 23 25 3 (set (reg:SI 145)
        (and:SI (reg:SI 144)
            (reg:SI 146))) 12 {andsi3} (nil)
    (expr_list:REG_EQUAL (and:SI (reg:SI 144)
            (const_int 8192 [0x2000]))
        (nil)))

(insn 25 24 26 3 (set (reg:CCS 147)
        (compare:CCS (reg:SI 145)
            (const_int 0 [0x0]))) 92 {comp_si} (nil)
    (nil))
----

transformed into

----
(insn 20 17 21 3 (set (reg:DI 142)
        (ashift:DI (reg:DI 137)
            (const_int 4 [0x4]))) 27 {ashldi3} (nil)
    (expr_list:REG_DEAD (reg:DI 137)
        (nil)))

(note 21 20 22 3 NOTE_INSN_DELETED)

(note 22 21 23 3 NOTE_INSN_DELETED)

(note 23 22 24 3 NOTE_INSN_DELETED)

(note 24 23 25 3 NOTE_INSN_DELETED)

(insn 25 24 26 3 (set (reg:CCS 147)
        (compare:CCS (reg:DI 142)
            (const_int 0 [0x0]))) 93 {comp_di} (insn_list:REG_DEP_TRUE 20 (nil))
    (expr_list:REG_DEAD (reg:DI 142)
        (nil)))
----

with semantics clearly not equivalent. now let's see what makes our little 
compiler
think it can erase insns all over ...

Original comment by jmoc...@gmail.com on 30 Jan 2009 at 2:47

GoogleCodeExporter commented 8 years ago
actually this happens after life analysis, from 126r.life1 to 127r.combine,
126r-life1 still contains later erased insns:

----
(insn 20 17 21 3 (set (reg:DI 142)
        (ashift:DI (reg:DI 137)
            (const_int 4 [0x4]))) 27 {ashldi3} (nil)
    (expr_list:REG_DEAD (reg:DI 137)
        (nil)))

(insn 21 20 22 3 (set (reg:DI 143)
        (ashiftrt:DI (reg:DI 142)
            (const_int 50 [0x32]))) 29 {ashrdi3} (insn_list:REG_DEP_TRUE 20 (nil))
    (expr_list:REG_DEAD (reg:DI 142)
        (nil)))

(insn 22 21 23 3 (set (reg:SI 144)
        (sign_extend:SI (subreg:HI (reg:DI 143) 6))) 60 {extendhisi2}
(insn_list:REG_DEP_TRUE 21 (nil))
    (expr_list:REG_DEAD (reg:DI 143)
        (nil)))

(insn 23 22 24 3 (set (reg:SI 146)
        (const_int 8192 [0x2000])) 68 {movsi_general} (nil)
    (nil))

(insn 24 23 25 3 (set (reg:SI 145)
        (and:SI (reg:SI 144)
            (reg:SI 146))) 12 {andsi3} (insn_list:REG_DEP_TRUE 22
(insn_list:REG_DEP_TRUE 23 (nil)))
    (expr_list:REG_DEAD (reg:SI 144)
        (expr_list:REG_DEAD (reg:SI 146)
            (expr_list:REG_EQUAL (and:SI (reg:SI 144)
                    (const_int 8192 [0x2000]))
                (nil)))))

(insn 25 24 26 3 (set (reg:CCS 147)
        (compare:CCS (reg:SI 145)
            (const_int 0 [0x0]))) 92 {comp_si} (insn_list:REG_DEP_TRUE 24 (nil))
    (expr_list:REG_DEAD (reg:SI 145)
        (nil)))
----

Original comment by jmoc...@gmail.com on 30 Jan 2009 at 3:15

GoogleCodeExporter commented 8 years ago
ok, the cause of this seems to be gcc ignoring the fact that comp_branch_si 
can't
operate on results of comp_di compares

namely, the final insn sequence:

----
#(insn 20 17 21 (set (reg:DI 35 s35 [142])
#        (ashift:DI (reg:DI 36 s36 [137])
#            (const_int 4 [0x4]))) 27 {ashldi3} (nil)
#    (expr_list:REG_DEAD (reg:DI 36 s36 [137])
#        (nil)))
    slax    $s35,$s36,4 # 20    ashldi3/2   [length = 1]
#(insn 25 24 26 (set (reg:CCS 35 s35 [147])
#        (compare:CCS (reg:DI 35 s35 [142])
#            (const_int 0 [0x0]))) 93 {comp_di} (insn_list:REG_DEP_TRUE 20 
(nil))
#    (nil))
    cpx $s35,$s35,(0)1  # comp_DI, =r,r,M   # 25    comp_di/3   [length = 1]
#(jump_insn 26 25 35 (parallel [
#            (set (pc)
#                (if_then_else (lt:SI (reg:CCS 35 s35 [147])
#                        (const_int 0 [0x0]))
#                    (label_ref:DI 70)
#                    (pc)))
#            (use (reg:DI 4 s4))
#        ]) 97 {cond_branch_si} (insn_list:REG_DEP_TRUE 25 (nil))
#    (expr_list:REG_DEAD (reg:CCS 35 s35 [147])
#        (expr_list:REG_BR_PROB (const_int 100 [0x64])
#            (nil))))
    bls $s35,.L15   # 26    cond_branch_si  [length = 1]
----

would work just fine (and is in fact a very clever choice), if cond_branch_di 
were
used instead of original cond_branch_si, stemming from the initial expand ...

the way this rtl sequence emerges makes me think there is no other way to fix 
this
but to add distinct CC modes for SI and DI (and probably distinct ones for SF 
and DF
would be smart as well) operands. and once again we end up back at the
compare-and-branch nightmare ... ;)

suits me just right for being lazy the last time ... sigh ...

Original comment by jmoc...@gmail.com on 3 Feb 2009 at 2:18

GoogleCodeExporter commented 8 years ago
indeed, distinct SI/DI CC modes (introduced in r217) fix this.

Original comment by jmoc...@gmail.com on 3 Feb 2009 at 2:42