regehr commented 15 years ago


Bugzilla Link	3149
Resolution	FIXED
Resolved on	Dec 19, 2008 15:00
Version	unspecified
OS	Linux
CC	@asl,@lattner,@isanbard

Extended Description

This one is interesting because llvm-gcc's output is wrong at all optimization levels (-O0 through -O3).

This is a regression since 2.3.

Seen on r60325 on Ubuntu Hardy on x86.

regehr@john-home:~/volatile/tmp71$ llvm-gcc -O small.c -o small ; ./small -1 regehr@john-home:~/volatile/tmp71$ llvm-gcc-2.4 -O small.c -o small ; ./small -1 regehr@john-home:~/volatile/tmp71$ llvm-gcc-2.3 -O small.c -o small ; ./small -10347 regehr@john-home:~/volatile/tmp71$ llvm-gcc-2.2 -O small.c -o small ; ./small -10347 regehr@john-home:~/volatile/tmp71$ icc -O small.c -o small ; ./small -10347 regehr@john-home:~/volatile/tmp71$ gcc -O small.c -o small ; ./small -10347 regehr@john-home:~/volatile/tmp71$ cat small.c

include

static inline int16_t safe_mul_int16_t_s_s (int16_t si1, int16_t si2) {
if (si1 > 0){ / si1 is positive /
if (si2 > 0) { / si1 and si2 are positive / if (si1 > (INT16_MAX / si2)) {
return si1;
}
} / end if si1 and si2 are positive /
else { / si1 positive, si2 non-positive / if (si2 < (INT16_MIN / si1)) {
return si1;
}
} / si1 positive, si2 non-positive /
} / end if si1 is positive /
else { / si1 is non-positive /
if (si2 > 0) { / si1 is non-positive, si2 is positive /
if (si1 < (INT16_MIN / si2)) {
return si1;
}
} / end if si1 is non-positive, si2 is positive / else { / si1 and si2 are non-positive /
if ( (si1 != 0) && (si2 < (INT16_MAX / si1))) { return si1;
}
} / end if si1 and si2 are non-positive / } / end if si1 is non-positive /
return si1 * si2;
}

static inline uint64_t
safe_mod_uint64_t_u_u (uint64_t ui1, uint64_t ui2)
{
if (ui2 == 0) return ui1;
return ui1 % ui2;
}

int32_t g_355 = 0xC5E0286AL; int32_t g_379 = -1L; uint32_t g_450;

int32_t func_3 (uint32_t p_4); int32_t func_3 (uint32_t p_4) { g_450 = safe_mul_int16_t_s_s (p_4, 1); return 1; }

int32_t func_1 (void); int32_t func_1 (void) { func_3 (safe_mod_uint64_t_u_u (g_379, g_355)); return 1; }

int main (void) { func_1 (); printf ("%d\n", g_450); return 0; }

llvmbot commented 15 years ago

Fixed. http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20081215/071372.html

llvmbot commented 15 years ago

Interesting problem:

172     %ECX<def> = MOV32rr %reg1039<kill>
180     INLINEASM <es:subl $5,$1
        sbbl $3,$0>, 10, %EAX<def>, 14, %ECX<earlyclobber,def>, 9, %EAX<kill>, 36, <fi#0>, 1, %reg0, 0, 9, %ECX<kill>, 36, <fi#1>, 1, %reg0, 0
188     %EAX<def> = MOV32rr %EAX<kill>
196     %ECX<def> = MOV32rr %ECX<kill>
204     %ECX<def> = MOV32rr %ECX<kill>
212     %EAX<def> = MOV32rr %EAX<kill>
220     %EAX<def> = MOV32rr %EAX
228     %reg1039<def> = MOV32rr %ECX<kill>

The early clobber operand ties ECX input to the ECX def.

The live interval of ECX is represented as this: %reg20,inf = [46,47:1)[174,230:0) 0@174-(230) 1@46-(47)

It does not specify there was a definition at 181.

SimpleRegisterCoalescing::AdjustCopiesBackFrom() thinks it's perfectly ok to eliminate the copy at 228 since it's defined by a copy from reg1039 at 172.

llvmbot commented 15 years ago

This is a coalescer bug.

llvmbot commented 15 years ago

I don't think this is a codegen issue. This is a bug in the processor. It should have understood what codegen is really trying to do. :-)

I'll deal with this.

lattner commented 15 years ago

testcase for the above Repro the above with llc t.bc -o -

lattner commented 15 years ago

Here's my last testcase:

#include <stdio.h>
int umoddi3 (unsigned long long u, unsigned long long v) __attribute__((noinline));
int umoddi3 (unsigned long long u, unsigned long long v) {
  unsigned int d0, d1, n0, n1, n2;

  d0 = v;
  d1 = v >> 32;
  n0 = u;
  n1 = u >> 32;
  printf("n0=%d\n", n0);

  if (n0 >= d0)
  {
    __asm__ ("subl %5,%1\n\tsbbl %3,%0"
             : "=r" (n1), "=&r" (n0)
             : "0" (n1), "g" (d1), "1" (n0), "g" (d0));
  }

  return n0+n1;
}

volatile unsigned long long A = -1, B = -975165334; 

int main (void) {
  printf ("result = %d\n", umoddi3(A,B));
  return 0;
}

At -O1, it is producing this output:

    ## InlineAsm Start
    subl -12(%ebp),%ecx
    sbbl -8(%ebp),%eax
    ## InlineAsm End
LBB1_2: ## bb2
    addl    %esi, %eax
    addl    $20, %esp
    popl    %esi
    popl    %ebp
    ret

Note that the result of the asm is in ecx/eax, but that the epilog block adds esi/eax. The code coming out of isel looks fine:

        INLINEASM <es:subl $5,$1
        sbbl $3,$0>, 10, %EAX<def>, 14, %ECX<earlyclobber,def>, 9, %EAX, 36, <fi#0>, 1, %reg0, 0, 9, %ECX, 36, <fi#1>, 1, %reg0, 0
        %reg1035<def> = MOV32rr %EAX
        %reg1036<def> = MOV32rr %ECX
        %reg1030<def> = MOV32rr %reg1036
        %reg1029<def> = MOV32rr %reg1035
    Successors according to CFG: 0x1818008 (#3)

bb2: 0x1818008, LLVM BB @&#8203;0x15028a0, ID#3:
    Predecessors according to CFG: 0x1817f28 (#1) 0x1817f98 (#2)
        %reg1031<def> = PHI %reg1028, mbb<entry.bb2_crit_edge,0x1817f28>, %reg1029, mbb<bb,0x1817f98>
        %reg1032<def> = PHI %reg1027, mbb<entry.bb2_crit_edge,0x1817f28>, %reg1030, mbb<bb,0x1817f98>
        %reg1037<def> = ADD32rr %reg1032, %reg1031, %EFLAGS<imp-def>

so this is a phi elim or coalescer issue. I'm going to let someone else take this from here.

lattner commented 15 years ago

Here's my currently reduced testcase (works at -O0 fails at -O1+):

include

unsigned long long umoddi3 (unsigned long long u, unsigned long long v) attribute((noinline));

unsigned long long umoddi3 (unsigned long long u, unsigned long long v) { unsigned int d0, d1, n0, n1, n2;

d0 = v; d1 = v >> 32; n0 = u; n1 = u >> 32; printf("d0=%d d1=%d n0=%d n1=%d\n", d0, d1, n0, n1);

if (n0 >= d0) { asm ("subl %5,%1\n\tsbbl %3,%0" : "=r" (n1), "=&r" (n0) : "0" (n1), "g" (d1), "1" (n0), "g" (d0)); }

return n0 | ((long long)n1 << 32); }

volatile unsigned long long A = -1, B = -975165334;

int main (void) { printf ("%d\n", (int)(umoddi3(A,B))); return 0; }

llvmbot commented 15 years ago

Did I say that it was a bug in mem2reg? I was merely pointing out that what I saw. Draw your own conclusions as to where the bug is.

lattner commented 15 years ago

btw, the mem2reg'd code works with llc -fast but fails with llc. I'm positive it is not mem2reg's fault.

lattner commented 15 years ago

for the zillionth time, this is almost certainly not a bug in mem2reg or sroa. This is a bug in llc that is exposed by them. I'm actively reducing a testcase.

llvmbot commented 15 years ago

I pulled hte code for _umoddi3 into the .c file to test (so the test itself passes with llvm-gcc -O0 and fails with -O3) and bugpoint reduced it. It blames scalarrepl, but I am assuming that this is a llc bug. Continuing investigation.

Then it's probably the SROA problem that I mentioned above. I forced SROA to stop after performing a call to promote "mem2reg" and it crashed.

lattner commented 15 years ago

I pulled hte code for _umoddi3 into the .c file to test (so the test itself passes with llvm-gcc -O0 and fails with -O3) and bugpoint reduced it. It blames scalarrepl, but I am assuming that this is a llc bug. Continuing investigation.

lattner commented 15 years ago

Reduced driver:

include

volatile unsigned long long A = -1, B = -975165334; int main (void) { printf ("%d\n", (int)(A%B)); return 0; }

lattner commented 15 years ago

I can repro this, it is definitely _umoddi3

llvmbot commented 15 years ago

It seems to be an asm problem. This testcase gives the right result if compiled with llc, but when optimized with -inline and then compiled with llc the result changes. To get this testcase I extracted each asm into it's own function then optimized what was left with -std-compile-opts. Using -inline puts the asm statements back where they were originally - and triggers the bug.

If you add "alwaysinline" to the @subl function and "noinline" to the other inline asm functions, then this is triggered. All other combinations pass.

I mean that if @subl isn't inlined, then the test passes no matter what function attribute is added to the other two functions.

llvmbot commented 15 years ago

It seems to be an asm problem. This testcase gives the right result if compiled with llc, but when optimized with -inline and then compiled with llc the result changes. To get this testcase I extracted each asm into it's own function then optimized what was left with -std-compile-opts. Using -inline puts the asm statements back where they were originally - and triggers the bug.

If you add "alwaysinline" to the @subl function and "noinline" to the other inline asm functions, then this is triggered. All other combinations pass.

llvmbot commented 15 years ago

This might be due to the recent bootstrap breakage that has been plaguing us this week. I tracked it down to libgcc being mis-compiled. I'll try compiling libgcc with -O1 (and with load PRE turned on) and seeing if that works.

And by "might be due to" I mean "might be the cause of", of course.

llvmbot commented 15 years ago

This might be due to the recent bootstrap breakage that has been plaguing us this week. I tracked it down to libgcc being mis-compiled. I'll try compiling libgcc with -O1 (and with load PRE turned on) and seeing if that works.

llvmbot commented 15 years ago

Adding Chris to the CC list because it seems to be an asm problem.

llvmbot commented 15 years ago

Much reduced testcase It seems to be an asm problem. This testcase gives the right result if compiled with llc, but when optimized with -inline and then compiled with llc the result changes. To get this testcase I extracted each asm into it's own function then optimized what was left with -std-compile-opts. Using -inline puts the asm statements back where they were originally - and triggers the bug.

isanbard commented 15 years ago

It might be a codegen bug that requires variables to be in registers to fire. This would explain why both mem2reg and sroa seem to trigger it.

Possible. SROA is calling a mem2reg pass on the code. I narrowed it down to that call causing the failure. Something in __umoddi3 is getting miscompiled.

llvmbot commented 15 years ago

It might be a codegen bug that requires variables to be in registers to fire. This would explain why both mem2reg and sroa seem to trigger it.

isanbard commented 15 years ago

First bad .ll that executes incorrectly

isanbard commented 15 years ago

.ll file that executes correctly.

isanbard commented 15 years ago

Okay. It's the SROA pass. If I remove that from the list, then it passes. If you run it alone on the file, it fails. Though it still looks as if the mem2reg causes problems (if I remove SROA, but keep mem2reg, then it fails).

isanbard commented 15 years ago

I was able to reproduce it with this:

$ opt -mem2reg < testcase.bc | llc -o t.s -f -mtriple=i386-apple-darwin9.5 $ gcc -o t t.s $ ./t -1

Mem2Reg is causing the problem?!

llvmbot commented 15 years ago

Bill, do you have sometime to look at this? Thanks.

llvmbot commented 15 years ago

Testcase .ll Here is a testcase .ll that combines the original code and the code for muldi3. It shows the bug: if compiled as is with llc, the output is -10347 (correct). If compiled with opt -std-compile-opts followed by llc then the output is -1.

regehr commented 15 years ago

Ouch... talk about motivation for end-to-end testing.

llvmbot commented 15 years ago

It looks like llvm-gcc has a not working __umoddi3 in its libgcc.

llvmbot commented 15 years ago

If I compile to assembler using llvm-gcc, then assemble and link using gcc, then the problem goes away. So llvm-gcc is producing correct assembler. Investigating...

llvmbot commented 15 years ago

I can reproduce this on x86-32 linux but not on x86-64 linux.

llvmbot commented 15 years ago

Can you attach .s file generated directly from llvm-gcc and that produced by llc?

regehr commented 15 years ago

Unfortunately I still get this:

regehr@john-home:~/volatile/tmp71$ llvm-gcc small.c -o small ; ./small -1

regehr@john-home:~/volatile/tmp71$ llvm-gcc -S --emit-llvm small.c -o - | llvm-as | llc -relocation-model=pic -disable-fp-elim -f -o small.s ; gcc small.s -o small ; ./small -10347

llvmbot commented 15 years ago

Can you try the following llc options? -relocation-model=pic -disable-fp-elim. Thanks.

regehr commented 15 years ago

Also I verified that r60392 still has the bug.

regehr commented 15 years ago

lli gives me the correct answer, as does an executable obtained by running llc on an emitted bytecode file.

Is there something else I can try?

llvmbot commented 15 years ago

I can't reproduce on x86 / Mac OS X. Can you (or someone on Linux) bugpoint it on a Linux system?

llvm / llvm-project

possible integer miscompilation #3521

Extended Description

include

include

include

include