long double return miscompiled on Solaris/sparcv9 #46698

Open Quuxplusone opened 3 years ago

Quuxplusone commented 3 years ago
Bugzilla Link PR47729
Status NEW
Importance P normal
Reported by Rainer Orth (ro@gcc.gnu.org)
Reported on 2020-10-05 04:43:22 -0700
Last modified on 2020-11-09 06:38:49 -0800
Version trunk
Hardware Sun Solaris
CC efriedma@quicinc.com, jrtc27@jrtc27.com, jyknight@google.com, llvm-bugs@lists.llvm.org, llvm-bugzilla@jfbastien.com, venkatra@cs.wisc.edu
Several tests FAIL on Solaris/sparcv9 where long double is 128 bits:

  Builtins-sparcv9-sunos :: addtf3_test.c
  Builtins-sparcv9-sunos :: divtf3_test.c
  Builtins-sparcv9-sunos :: extenddftf2_test.c
  Builtins-sparcv9-sunos :: extendsftf2_test.c
  Builtins-sparcv9-sunos :: floatditf_test.c
  Builtins-sparcv9-sunos :: floatsitf_test.c
  Builtins-sparcv9-sunos :: floattitf_test.c
  Builtins-sparcv9-sunos :: floatunditf_test.c
  Builtins-sparcv9-sunos :: floatunsitf_test.c
  Builtins-sparcv9-sunos :: floatuntitf_test.c
  Builtins-sparcv9-sunos :: multf3_test.c
  Builtins-sparcv9-sunos :: subtf3_test.c

E.g. addtf3_test.c FAILs with

error in test__addtf3(36.40888825164657541977, 0.96444431369742592240) =
37.37333256534401470898, expected 37.37333256534400134216

The error doesn't happen in a 1-stage build with gcc or in a Debug build.

Via side-by-side debugging with addtf3.c.o compiled with clang -O vs. gcc -O
(everything else from a regular 2-stage clang build), it turned out that both
compilers produce the same result until the very end of __addtf3.  The only
difference is in the final fromRep call, which can be seen with this testcase:

$ cat fr.c
typedef long double fp_t;
typedef __uint128_t rep_t;

fp_t fromRep(rep_t x) {
  const union {
    fp_t f;
    rep_t i;
  } rep = {.i = x};
  return rep.f;
}

gcc -m64 -O produces

fromRep:
    add %sp, -144, %sp
    stx %o0, [%sp+2175]
    stx %o1, [%sp+2183]
    ldd [%sp+2175], %f0
    ldd [%sp+2183], %f2
    jmp %o7+8
     add    %sp, 144, %sp

while clang yields

fromRep:                                ! @fromRep
! %bb.0:                                ! %entry
    save %sp, -144, %sp
    add %fp, 2031, %i2
    or %i2, 8, %i2
    stx %i0, [%fp+2031]
    ldd [%fp+2031], %f0
    ldd [%i2], %f2
    stx %i1, [%i2]
    ret
    restore

The long double return value is supposed to be in %f0 and %f2.  gcc handles
this just fine, and clang gets it right for %f0, too.  However, clang loads the
contents of a still-uninitialized stack slot into %f2 and only afterwards
stores the second half of the argument (%i1) into that slot.
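
The effect can also be checked at run time without reading assembly. The
following is only a minimal sketch, assuming a compiler that provides
__uint128_t: toRep and roundtrips are hypothetical helper names added for
illustration and are not part of fr.c, and the noinline attribute is an
assumption meant to keep fromRep out of line, as it is in fr.c.

```c
typedef long double fp_t;
typedef __uint128_t rep_t;

/* Same body as fromRep in fr.c; noinline is an assumption so that the
   function is still emitted on its own at -O. */
__attribute__((noinline)) fp_t fromRep(rep_t x) {
  const union {
    fp_t f;
    rep_t i;
  } rep = {.i = x};
  return rep.f;
}

/* Hypothetical inverse helper: view the bits of a long double as rep_t. */
static rep_t toRep(fp_t x) {
  const union {
    fp_t f;
    rep_t i;
  } rep = {.f = x};
  return rep.i;
}

/* Returns 1 if x survives toRep followed by fromRep unchanged, else 0. */
int roundtrips(fp_t x) {
  return fromRep(toRep(x)) == x;
}
```

With a correct compiler roundtrips(x) returns 1 for any non-NaN x. With the
code sequence shown above, the second half of the result comes from a stale
stack slot, so the check would be expected to fail unless that slot happens
to hold the right bits already.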

I don't have the slightest idea how to fix this codegen bug, but I have a
workaround patch (to be posted for reference shortly) that wraps the affected
functions in #pragma clang optimize off/on (nothing more than a hack to show
that this fixes all the failures above).
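
For illustration only, the shape of such a workaround applied to the fr.c
testcase might look like the sketch below. This is not the patch itself; the
__clang__ guard is an assumption added here so that other compilers do not
warn about an unknown pragma.

```c
typedef long double fp_t;
typedef __uint128_t rep_t;

/* Assumption: guard the pragma so that only clang sees it. */
#if defined(__clang__)
#pragma clang optimize off
#endif

/* Unchanged from fr.c; compiled without optimization by clang. */
fp_t fromRep(rep_t x) {
  const union {
    fp_t f;
    rep_t i;
  } rep = {.i = x};
  return rep.f;
}

#if defined(__clang__)
#pragma clang optimize on
#endif
```

Functions defined between the two pragmas are not optimized by clang, which
matches the observation that the wrong code only shows up at -O and above.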

Quuxplusone commented 3 years ago

FWIW this is not a case of clang itself being miscompiled: I've tried all of

* stage-1 clang from a Release build with gcc
* stage-2 clang from the same build
* stage-2 clang from a Debug build

and they generate the same wrong code at -O and above.

Quuxplusone commented 3 years ago

FWIW a bisect identified

BISECT: running pass (64) Machine Instruction Scheduler on function (fromRep)

as the culprit (on a minimal testcase, not yet confirmed on the real code).