Quuxplusone / LLVMBugzillaTest


incompatible return of small struct in 32-bit PowerPC BSD #39708

Closed Quuxplusone closed 4 years ago

Quuxplusone commented 5 years ago
Bugzilla Link PR40736
Status RESOLVED FIXED
Importance P normal
Reported by George Koehler (kernigh@gmail.com)
Reported on 2019-02-14 18:38:10 -0800
Last modified on 2020-05-28 16:54:45 -0700
Version 7.0
Hardware Macintosh OpenBSD
CC chmeeedalf@gmail.com, hfinkel@anl.gov, llvm-bugs@lists.llvm.org, marklmi26-fbsd@yahoo.com, nemanja.i.ibm@gmail.com
Fixed by commit(s)
Attachments llvm-9-sret-1.diff (10741 bytes, text/plain)
llvm-9-sret-2.diff (10237 bytes, text/plain)
llvm-9-sret-3.diff (11792 bytes, text/plain)
llvm-9-sret-4.diff (12467 bytes, text/plain)
Blocks
Blocked by
See also
clang 7.0.1 and gcc use incompatible conventions to return a small struct (of
up to 8 bytes) in my PowerBook G4 running OpenBSD/macppc, where gcc is the main
compiler and clang is a recent arrival.  This causes my qt5 built with clang to
crash when trying to call my libxcb built with gcc.  Functions like
xcb_intern_atom() return a cookie as a 4-byte struct containing an unsigned int.

llvm/lib/Target/PowerPC provides the RetCC_PPC convention, but I don't see
where it returns structs.  llvm and clang are returning structs in memory, with
the caller passing in r3 a pointer to the return area.  gcc in OpenBSD returns
smaller structs in registers r3 and r4.  gcc in NetBSD/macppc seems like
OpenBSD.  I don't know what happens in FreeBSD.

There are 2 versions of the ELF ABI (called "SVR4" in llvm):

- System V ABI: PowerPC Processor Supplement (1995)
- Power Architecture 32-bit ABI Supplement 1.0 (2011)
  It added features like secure PLT and thread-local storage.
  Search for Power-Arch-32-bit-ABI-supp-1.0-Unified.pdf

The later ABI from 2011, in section 3.2.5 Return Values, said,

> ATR-LINUX: Aggregates or unions of any length will be returned in a storage
> buffer allocated by the caller. The caller will pass the address of this
> buffer as a hidden first argument in r3, causing the first explicit argument
> to be passed in r4. This hidden argument is treated as a normal formal
> parameter, and corresponds to the first doubleword of the parameter save
> area.
>
> ATR-EABI: Aggregates or unions whose size is less than or equal to eight
> bytes shall be returned in r3 and r4, as if they were first stored in memory
> area and then the low-addressed word were loaded in r3 and the
> high-addressed word were loaded into r4. Bits beyond the last member of the
> structure or union are not defined.

Larger structs in ATR-EABI get returned like in ATR-LINUX.  llvm and clang
follow ATR-LINUX.  gcc in OpenBSD is almost like ATR-EABI, but (in my big-
endian PowerPC) takes the undefined bytes from before the first member, not
"beyond the last member of the structure or union".

I compiled this code with -S in clang 7.0.1 and in gcc:

struct s1 { char c; };
struct s4 { char c[4]; };
struct s7 { char c[7]; };
struct s8 { char c[8]; };
struct s9 { char c[9]; };
struct sd { double d; };
struct s1 ret1(struct s1 *s) { return *s; }
struct s4 ret4(struct s4 *s) { return *s; }
struct s7 ret7(struct s7 *s) { return *s; }
struct s8 ret8(struct s8 *s) { return *s; }
struct s9 ret9(struct s9 *s) { return *s; }
struct sd retd(struct sd *s) { return *s; }

In clang, all these functions get the return area in r3, and the parameter s in
r4, so they copy the correct number of bytes from where r4 points to where r3
points.  In gcc, all but ret9() get the parameter in r3, and copy bytes from
where r3 points to r3 itself, or to r3 and r4.

I now show instructions from
$ egcc -O2 -fno-stack-protector -S retxam.c

This is gcc 8.2.0 with OpenBSD's patches.  Older versions of gcc in NetBSD and
OpenBSD write different instructions but seem to get the same result.  I add my
own comments and change 3 to %r3.

ret1 in gcc:
lbz %r3, 0(%r3)  # r3 = 0.0.0.c

ret4 in gcc:
lwz %r3, 0(%r3)  # r3 = c0.c1.c2.c3

ret7 in gcc:
lhz %r8, 4(%r3)             # r8 = 0.0.c4.c5
lwz %r7, 0(%r3)             # r7 = c0.c1.c2.c3
rlwinm %r6, %r8, 8, 8, 15   # r6 = 0.c4.0.0
lbz %r4, 6(%r3)             # r4 = 0.0.0.c6
slwi %r10, %r7, 24          # r10 = c3.0.0.0
rlwinm %r8, %r8, 8, 16, 23  # r8 = 0.0.c5.0
or %r9, %r10, %r6           # r9 = c3.c4.0.0
srwi %r3, %r7, 8            # r3 = 0.c0.c1.c2
or %r9, %r9, %r8            # r9 = c3.c4.c5.0
or %r4, %r9, %r4            # r4 = c3.c4.c5.c6

ret7 in gcc looks overly long.  I would try to avoid the rotations by using
misaligned loads: lwz %r4, 3(%r3) and lhz *, 1(%r3)

ret8 in gcc:
lwz %r4, 4(%r3)  # r4 = c4.c5.c6.c7
lwz %r3, 0(%r3)  # r3 = c0.c1.c2.c3

ret9 in gcc:
lwz %r7, 0(%r4)   # r7 = c0.c1.c2.c3
lwz %r8, 4(%r4)   # r8 = c4.c5.c6.c7
lbz %r10, 8(%r4)  # r10 = 0.0.0.c8
stw %r7, 0(%r3)   # r3[0..3] = c0.c1.c2.c3
stw %r8, 4(%r3)   # r3[4..7] = c4.c5.c6.c7
stb %r10, 8(%r3)  # r3[8] = c8

retd in gcc is exactly like ret8: it puts the struct's 8-byte double in r3 and
r4, not in a floating-point register.
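
The register contents annotated in the listings above can be reproduced with a small host-side model. The sketch below is only an illustration. The names `struct regpair`, `be_word`, `pack_eabi` and `pack_gcc` are invented here, and the zero fill stands in for bytes that are really undefined. The gcc rule (right-justify in r3 for up to 4 bytes, otherwise in the r3:r4 pair) is inferred from ret1, ret4, ret7 and ret8 alone, so it is an assumption rather than a statement of what gcc guarantees.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of the two return registers. */
struct regpair { uint32_t r3, r4; };

/* Assemble a big-endian 32-bit word, independent of the host's byte order. */
static uint32_t be_word(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* ATR-EABI as quoted: the struct starts at the low address, so the
 * undefined bytes (zero here) fall after the last member. */
static struct regpair pack_eabi(const void *obj, size_t size)
{
    unsigned char buf[8] = {0};
    struct regpair rp;

    assert(size >= 1 && size <= 8);
    memcpy(buf, obj, size);
    rp.r3 = be_word(buf);
    rp.r4 = be_word(buf + 4);
    return rp;
}

/* gcc as observed in the listings: the struct is right-justified, so
 * the undefined bytes (zero here) come before the first member. */
static struct regpair pack_gcc(const void *obj, size_t size)
{
    unsigned char buf[8] = {0};
    struct regpair rp = {0, 0};

    assert(size >= 1 && size <= 8);
    if (size <= 4) {
        /* Only r3 is used; r4 is left alone (modeled as 0). */
        memcpy(buf + (4 - size), obj, size);
        rp.r3 = be_word(buf);
    } else {
        memcpy(buf + (8 - size), obj, size);
        rp.r3 = be_word(buf);
        rp.r4 = be_word(buf + 4);
    }
    return rp;
}
```

For a 7-byte struct holding bytes c0..c6, `pack_gcc` gives r3 = 0.c0.c1.c2 and r4 = c3.c4.c5.c6, matching the ret7 comments, while `pack_eabi` gives r3 = c0.c1.c2.c3 and r4 = c4.c5.c6.0.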

Right now, I can compile clang from llvm-project.git master, but I can't run
it, so I don't know how it returns structs.  I am able to run OpenBSD's package
of clang 7.0.1.
Quuxplusone commented 5 years ago

Attached llvm-9-sret-1.diff (10741 bytes, text/plain): draft 1, incomplete, sret-in-reg for callee but not caller

Quuxplusone commented 5 years ago

Attached llvm-9-sret-2.diff (10237 bytes, text/plain): draft 2, incomplete, small sret IR pass

Quuxplusone commented 5 years ago

Attached llvm-9-sret-3.diff (11792 bytes, text/plain): draft 3, small sret in PPCTargetLowering

Quuxplusone commented 5 years ago

Attached llvm-9-sret-4.diff (12467 bytes, text/plain): draft 4, new comment with examples of returned small structs

Quuxplusone commented 4 years ago

I am not particularly familiar with 32-bit ABIs. Perhaps Justin is more familiar with this stuff and can chime in here.

Quuxplusone commented 4 years ago

This seems to bite us on FreeBSD as well, as reported recently at https://lists.freebsd.org/pipermail/freebsd-toolchain/2020-January/005190.html . Being an all-clang or all-gcc world makes it less obvious, but mixing them is problematic.

Quuxplusone commented 4 years ago
(In reply to Justin Hibbits from comment #6)
> This seems to bite us on FreeBSD as well, as reported recently at
> https://lists.freebsd.org/pipermail/freebsd-toolchain/2020-January/005190.html .
> Being an all-clang or all-gcc world makes it less obvious, but
> mixing them is problematic.

There is also:
https://lists.freebsd.org/pipermail/freebsd-toolchain/2020-January/005192.html
for C instead of C++.

The context here is clang 9.0.1 as FreeBSD has it in the system, where I
had the system at head -r356426 .

Both examples were simple cases, such as a pair of ints or a 64-bit value
in the struct (or class). In such simple contexts, I could use
-maix-struct-return with gcc9/g++9 to avoid the ABI incompatibility.
Otherwise the programs crashed: FreeBSD library code treated r3 as holding
an address when it did not hold one, or when it held the wrong address for
the code's purpose.

I have not yet got around to the more complicated cases, such as those
referenced in last year's comments: they were not involved in the
program in which I ran into the issue.

The delay in noticing is tied to FreeBSD just recently having
switched from gcc 4.2.1 as the official system compiler to clang
9.0.1 instead. This update changed the ABI in use without it being
announced.

If the gcc variant of the svr4 ABI is fully put back, this would be a
breaking ABI change for 32-bit FreeBSD compared to the recent switch
to clang 9.0.1, but not compared to its long history. Being tier 2, 32-bit
powerpc FreeBSD can probably do this, but it would be far better if it
happened before FreeBSD 13 moves from head-status to
stable/release-status.

I do not know if there are issues of gcc 4.2.1 vintage svr4 ABI vs. more
modern incremental gcc svr4 ABI updates. So far I've not seen anything
from FreeBSD folks about even an incremental change being intended. But
over the years, there might have been details that needed to be adjusted
to deal with C++ issues or some such. I do not know the details. I do
know that g++9 defines:

#define __GXX_ABI_VERSION 1013

but clang++ (9.0.1) defines:

#define __GXX_ABI_VERSION 1002
Quuxplusone commented 4 years ago
(In reply to Mark Millard from comment #7)

(In reply to George Koehler from comment #0)
> . . .
> Larger structs in ATR-EABI get returned like in ATR-LINUX.  llvm and clang
> follow ATR-LINUX.  gcc in OpenBSD is almost like ATR-EABI, but (in my
> big-endian PowerPC) takes the undefined bytes from before the first member,
> not "beyond the last member of the structure or union".

I did an experiment that involved comparing and
contrasting use of two struct types with
a leading char involved. Only one got the
"undefined bytes from before the first member":
(compiled using: -std=c99 -pedantic -g -O2 -c)

struct char_int {
  char a; int b;
};

extern struct char_int charint(void);
extern void char_int_argp(struct char_int,char,int);

void charint_access(void) {
   struct char_int r= charint();
   char_int_argp(r,r.a,r.b);
}

struct five_char {
  char a[5];
};

extern struct five_char fivechar(void);
extern void five_char_argp(struct five_char,char,char,char,char,char);

void fivechar_access(void) {
   struct five_char r= fivechar();
   five_char_argp(r,r.a[0],r.a[1],r.a[2],r.a[3],r.a[4]);
}

For gcc9 I got:

000000b0 charint_access:
; void charint_access(void) {
      b0: 94 21 ff f0                   stwu 1, -16(1)
      b4: 7c 08 02 a6                   mflr 0
      b8: 90 01 00 14                   stw 0, 20(1)
;    struct char_int r= charint();
      bc: 48 00 00 01                   bl .+0
      c0: 7c 69 1b 78                   mr 9, 3
      c4: 7c 85 23 78                   mr 5, 4
;    char_int_argp(r,r.a,r.b);
      c8: 38 61 00 08                   addi 3, 1, 8
      cc: 55 24 46 3e                   srwi 4, 9, 24
Note: MSByte from r3 (via r9) copied to least significant byte of r4.
Note: Pad is in the least significant bytes of r3, so later in memory.

      d0: 90 a1 00 0c                   stw 5, 12(1)
      d4: 91 21 00 08                   stw 9, 8(1)
      d8: 48 00 00 01                   bl .+0
; }
      dc: 80 01 00 14                   lwz 0, 20(1)
      e0: 38 21 00 10                   addi 1, 1, 16
      e4: 7c 08 03 a6                   mtlr 0
      e8: 4e 80 00 20                   blr

000000ec fivechar_access:
; void fivechar_access(void) {
      ec: 94 21 ff d0                   stwu 1, -48(1)
      f0: 7c 08 02 a6                   mflr 0
      f4: 90 01 00 34                   stw 0, 52(1)
;    struct five_char r= fivechar();
      f8: 48 00 00 01                   bl .+0
      fc: 54 87 46 3e                   srwi 7, 4, 24
     100: 54 88 84 3e                   srwi 8, 4, 16
     104: 54 8a c2 3e                   srwi 10, 4, 8
     108: 98 61 00 08                   stb 3, 8(1)
Note: Least significant byte of r3 holds the first char.
Note: Pad is in the most significant bytes of r3.

     10c: 98 e1 00 09                   stb 7, 9(1)
     110: 7c 89 23 78                   mr 9, 4
     114: 99 01 00 0a                   stb 8, 10(1)
;    five_char_argp(r,r.a[0],r.a[1],r.a[2],r.a[3],r.a[4]);
     118: 54 88 06 3e                   clrlwi  8, 4, 24
;    struct five_char r= fivechar();
     11c: 99 41 00 0b                   stb 10, 11(1)
Note: The new layout starting at 8(r1) has no pad in the 1st 4 bytes.

;    five_char_argp(r,r.a[0],r.a[1],r.a[2],r.a[3],r.a[4]);
     120: 38 61 00 20                   addi 3, 1, 32
     124: 88 c1 00 0a                   lbz 6, 10(1)
     128: 81 41 00 08                   lwz 10, 8(1)
     12c: 88 e1 00 0b                   lbz 7, 11(1)
     130: 88 a1 00 09                   lbz 5, 9(1)
     134: 88 81 00 08                   lbz 4, 8(1)
     138: 99 21 00 24                   stb 9, 36(1)
Note: The pad is after 36(r1) for the copy here.

     13c: 91 41 00 20                   stw 10, 32(1)
     140: 48 00 00 01                   bl .+0
; }
     144: 80 01 00 34                   lwz 0, 52(1)
     148: 38 21 00 30                   addi 1, 1, 48
     14c: 7c 08 03 a6                   mtlr 0
     150: 4e 80 00 20                   blr

For clang 9.0.1 I got:

000000b0 charint_access:
; void charint_access(void) {
      b0: 7c 08 02 a6                   mflr 0
      b4: 90 01 00 04                   stw 0, 4(1)
      b8: 94 21 ff e0                   stwu 1, -32(1)
      bc: 38 61 00 18                   addi 3, 1, 24
;    struct char_int r= charint();
      c0: 48 00 00 01                   bl .+0
;    char_int_argp(r,r.a,r.b);
      c4: 80 61 00 18                   lwz 3, 24(1)
      c8: 80 a1 00 1c                   lwz 5, 28(1)
      cc: 88 81 00 18                   lbz 4, 24(1)
Note: No pad before the char field.
Note: Instead it is in the least significant bytes of r3.

      d0: 90 61 00 08                   stw 3, 8(1)
      d4: 38 61 00 08                   addi 3, 1, 8
      d8: 90 a1 00 0c                   stw 5, 12(1)
Note: Overall, matches gcc9 for layout.

      dc: 48 00 00 01                   bl .+0
; }
      e0: 80 01 00 24                   lwz 0, 36(1)
      e4: 38 21 00 20                   addi 1, 1, 32
      e8: 7c 08 03 a6                   mtlr 0
      ec: 4e 80 00 20                   blr

000000f0 fivechar_access:
; void fivechar_access(void) {
      f0: 7c 08 02 a6                   mflr 0
      f4: 90 01 00 04                   stw 0, 4(1)
      f8: 94 21 ff e0                   stwu 1, -32(1)
      fc: 38 61 00 18                   addi 3, 1, 24
;    struct five_char r= fivechar();
     100: 48 00 00 01                   bl .+0
;    five_char_argp(r,r.a[0],r.a[1],r.a[2],r.a[3],r.a[4]);
     104: 80 61 00 18                   lwz 3, 24(1)
     108: 89 01 00 1c                   lbz 8, 28(1)
     10c: 88 81 00 18                   lbz 4, 24(1)
Note: No pad before the array's [0] element, unlike gcc9.
Note: The pad is after 28(r1).
Note: No reorganization follows, unlike gcc9.

     110: 88 a1 00 19                   lbz 5, 25(1)
     114: 88 c1 00 1a                   lbz 6, 26(1)
     118: 88 e1 00 1b                   lbz 7, 27(1)
     11c: 90 61 00 08                   stw 3, 8(1)
     120: 38 61 00 08                   addi 3, 1, 8
     124: 99 01 00 0c                   stb 8, 12(1)
     128: 48 00 00 01                   bl .+0
; }
     12c: 80 01 00 24                   lwz 0, 36(1)
     130: 38 21 00 20                   addi 1, 1, 32
     134: 7c 08 03 a6                   mtlr 0
     138: 4e 80 00 20                   blr
Quuxplusone commented 4 years ago
The "draft 4" patch that I attached to this bug causes LLVM to crash on
code that uses function pointers.  Don't use this patch.

I'm having trouble with my PowerPC hardware,
so I can't run PowerPC code right now.

(In reply to Justin Hibbits from comment #6)
> This seems to bite us on FreeBSD as well, as reported recently at
> https://lists.freebsd.org/pipermail/freebsd-toolchain/2020-January/005190.html .
> Being an all-clang or all-gcc world makes it less obvious, but
> mixing them is problematic.

Other than clang and gcc, a few other packages need to know whether to
return small structs in memory or in registers r3/r4:

- libffcall and libffi, because they allow foreign languages to call
  C functions at runtime, must know how to get the return value from C.

- boost_context has extern "C" calls between C++ and 32-bit PowerPC assembly,
  but some calls return an 8-byte struct, so the assembly code must know
  which convention to use.

There is no C preprocessor macro to decide whether a system uses memory or
r3/r4 to return small structs.  Boost context pulled my code which uses
`#ifdef __linux__` to decide: I assume that Linux uses memory, and any other
system uses r3/r4.  (I tried Linux, NetBSD, and OpenBSD.)
https://github.com/boostorg/context/pull/123
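
The decision described above amounts to a compile-time switch. The C sketch below only illustrates that assumption: Linux returns small structs in memory, and any other 32-bit PowerPC ELF system returns them in r3/r4. It does not reproduce the Boost.Context change, and `enum struct_ret` and `ppc32_struct_ret` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names, for illustration only. */
enum struct_ret { RET_IN_MEMORY, RET_IN_REGS };

/* Where a 32-bit PowerPC ELF system is assumed to return a struct of
 * `size` bytes.  The __linux__ test mirrors the assumption in the text;
 * it is not a documented ABI-detection macro. */
static enum struct_ret ppc32_struct_ret(size_t size)
{
    assert(size > 0);
#ifdef __linux__
    /* Linux: always through the hidden pointer passed in r3. */
    return RET_IN_MEMORY;
#else
    /* Elsewhere: up to 8 bytes come back in r3/r4. */
    return size <= 8 ? RET_IN_REGS : RET_IN_MEMORY;
#endif
}
```

Structs larger than 8 bytes go through memory on every branch, which matches the ret9 case in the original report.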

I recently found this code in libffi, at
https://github.com/libffi/libffi/blob/b844a9c7f1ca792a1dfb0c09d5dae576178e6729/src/powerpc/ffi_sysv.c#L149

=begin
    case FFI_TYPE_STRUCT:
      /* The final SYSV ABI says that structures smaller or equal 8 bytes
     are returned in r3/r4.  A draft ABI used by linux instead
     returns them in memory.  */
      if ((cif->abi & FFI_SYSV_STRUCT_RET) != 0 && size <= 8)
    ...
=end

The default for FFI_SYSV_STRUCT_RET is at
https://github.com/libffi/libffi/blob/73dd43afc8a447ba98ea02e9aad4c6898dc77fb0/src/powerpc/ffitarget.h#L122

=begin
#  if (defined (__SVR4_STRUCT_RETURN)                   \
       || defined (POWERPC_FREEBSD) && !defined (__AIX_STRUCT_RETURN))
             | FFI_SYSV_STRUCT_RET
#  endif
=end

Nothing seems to define __SVR4_STRUCT_RETURN nor __AIX_STRUCT_RETURN, so
I guess that libffi returns small structs in r3/r4 on FreeBSD, and in
memory on other systems (so libffi is wrong on NetBSD and OpenBSD).
I haven't tested libffi to learn what really happens.
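
Because `&&` binds more tightly than `||`, the preprocessor condition quoted above reads as "`__SVR4_STRUCT_RETURN`, or else FreeBSD without `__AIX_STRUCT_RETURN`". The sketch below restates it as an ordinary C function so that the cases can be checked. The function `default_small_struct_in_regs` and its three flag parameters are stand-ins invented here, not libffi API.

```c
#include <assert.h>

/* Stand-in for the quoted #if.  Each parameter is 1 if the matching
 * macro (__SVR4_STRUCT_RETURN, POWERPC_FREEBSD, __AIX_STRUCT_RETURN)
 * would be defined, else 0.  Not part of libffi. */
static int default_small_struct_in_regs(int svr4_macro, int freebsd_macro,
                                        int aix_macro)
{
    assert(svr4_macro == 0 || svr4_macro == 1);
    assert(freebsd_macro == 0 || freebsd_macro == 1);
    assert(aix_macro == 0 || aix_macro == 1);

    /* Same grouping the preprocessor applies: A || (B && !C). */
    return svr4_macro || (freebsd_macro && !aix_macro);
}
```

With neither struct-return macro defined, the result is 1 only when the FreeBSD flag is set, which is the guess made above.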

I suspect that libffcall always returns small structs in r3/r4,
but I'm not sure.

(In reply to Mark Millard from comment #7)
> I do not know if there are issues of gcc 4.2.1 vintage svr4 ABI vs. more
> modern incremental gcc svr4 ABI updates. So far I've not seen anything
> from FreeBSD folks about even an incremental change being intended. But
> over the years, there might have been details that needed to be adjusted
> to deal with C++ issues or some such....

OpenBSD macppc uses gcc 4.2.1 and gcc 8.3.0; it builds the base system
with 4.x and packages with either 4.x or 8.x.  In my experience, gcc 4.x
and 8.x use the same ABI for C code.  Packages built with 8.x have no
trouble calling C functions in libraries built with 4.x.

C++ is not compatible, because libstdc++ 4.2.1 and libstdc++ 8.3.0 are
different libraries (like how libstdc++ and LLVM libc++ are different).
Packages built with gcc 8.x can't call C++ libraries built with 4.x, so
OpenBSD macppc uses 8.x for almost all C++ code.

(In reply to Mark Millard from comment #8)
> (In reply to George Koehler from comment #0)
> > . . .
> > Larger structs in ATR-EABI get returned like in ATR-LINUX.  llvm and clang
> > follow ATR-LINUX.  gcc in OpenBSD is almost like ATR-EABI, but (in my
> > big-endian PowerPC) takes the undefined bytes from before the first member,
> > not "beyond the last member of the structure or union".
>
> I did an experiment that involved comparing and
> contrasting use of two of struct types with
> a leading char involved. Only one got the
> "undefined bytes from before the first member":
> (compiled using: -std=c99 -pedantic -g -O2 -c)
>
> struct char_int {
>   char a; int b;
> };

This looks like an 8-byte struct: sizeof(struct char_int) should be 8,
though I can't check right now.  The "int b;" has 4-byte alignment, so
there should be a 3-byte alignment gap in the middle of the struct,
between a and b.

> ...
> For gcc9 I got:
> ...
> Note: MSByte from r3 (via r9) copied to least significant byte of r4.
> Note: Pad is in the least significant bytes of r3, so later in memory.

Because we already have 8 bytes, we don't add any bytes to return
our struct char_int in r3/r4.  We preserve the layout of the struct,
with the 3-byte gap between "char a" and "int b".  We are big-endian,
so our struct begins at the most significant byte of r3, and ends at
the least significant byte of r4.
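
The layout argument can be checked on any host where int is 4 bytes with 4-byte alignment (an assumption; it holds for the 32-bit PowerPC targets discussed here). The sketch uses `offsetof` to locate the gap. The helper `byte_home_of` is hypothetical: it says which return register, and which shift, a given byte offset of the struct lands in under the big-endian r3/r4 mapping just described.

```c
#include <assert.h>
#include <stddef.h>

struct char_int {
    char a; int b;
};

/* Hypothetical description of where one byte of the struct lands. */
struct byte_home {
    int reg;    /* 3 for r3, 4 for r4 */
    int shift;  /* right shift that brings the byte down to bits 0..7 */
};

/* Big-endian mapping of an 8-byte struct onto r3/r4: offset 0 is the
 * most significant byte of r3, offset 7 the least significant of r4. */
static struct byte_home byte_home_of(size_t offset)
{
    struct byte_home h;

    assert(offset < 8);
    h.reg = offset < 4 ? 3 : 4;
    h.shift = 24 - 8 * (int)(offset % 4);
    return h;
}

/* Number of alignment-gap bytes between a and b. */
static size_t char_int_gap(void)
{
    return offsetof(struct char_int, b) - sizeof(char);
}
```

`byte_home_of(0)` yields r3 with a shift of 24, which agrees with the `srwi 4, 9, 24` that gcc9 used to extract `a` from its copy of r3.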

(Other ABIs, like AMD64 and PowerPC 64-bit, might not preserve the layout
of the struct, because they might split the struct into its members.)
Quuxplusone commented 4 years ago
(In reply to George Koehler from comment #9)

> C++ is not compatible, because libstdc++ 4.2.1 and libstdc++ 8.3.0 are
> different libraries (like how libstdc++ and LLVM libc++ are different).
> Packages built with gcc 8.x can't call C++ libraries built with 4.x, so
> OpenBSD macppc uses 8.x for almost all C++ code.

Yea. Right now trying to use FreeBSD's headers and its library libc++
code via gcc9 (not using libstdc++) can run into the ABI issue. That
is how I discovered the issue in the first place. Using
std::chrono::steady_clock::now() ends up using an 8-byte struct/class
returned from a function.

> (In reply to Mark Millard from comment #8)
> > (In reply to George Koehler from comment #0)
> > > . . .
> > > Larger structs in ATR-EABI get returned like in ATR-LINUX.  llvm and clang
> > > follow ATR-LINUX.  gcc in OpenBSD is almost like ATR-EABI, but (in my
> > > big-endian PowerPC) takes the undefined bytes from before the first member,
> > > not "beyond the last member of the structure or union".
> >
> > I did an experiment that involved comparing and
> > contrasting use of two of struct types with
> > a leading char involved. Only one got the
> > "undefined bytes from before the first member":
> > (compiled using: -std=c99 -pedantic -g -O2 -c)
> >
> > struct char_int {
> >   char a; int b;
> > };
>
> This looks like an 8-byte struct: sizeof(struct char_int) should be 8,
> though I can't check right now.  The "int b;" has 4-byte alignment, so
> there should be a 3-byte alignment gap in the middle of the struct,
> between a and b.

Yep: 3 pad bytes between a and b. Sounds like I misinterpreted what you
were intending to indicate in your original wording. I had thought you
were indicating that the gcc pad would end up before a, not between a
and b. So, not so useful of a pair of examples, sorry.
Quuxplusone commented 4 years ago
I made a patch for this bug at
https://reviews.llvm.org/D73290

I reported this bug on LLVM's Backend: PowerPC, but the patch that I proposed
just now is in Clang.

My previous attempts to fix this bug failed to optimize away unnecessary
accesses to memory.  I recently noticed that Clang for PPC64 ELFv2 does already
return some structs in registers, so I found the PPC64 code in Clang, and tried
to do something similar for PPC32.

LLVM IR uses a sret pointer to return structs, but the pointer forces the
struct to be in memory.  If I use a machine-dependent IR pass or SelectionDAG
to move the struct into registers, then it is too late to optimize away the
memory accesses.  Clang gets around this, on some platforms, by coercing the
return value to another type (like an integer), so there is no sret pointer in
the LLVM IR.  Then LLVM passes like SROA (scalar replacement of aggregates) can
optimize away the memory accesses.
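
As a rough C analogy of the two shapes (not the LLVM IR that Clang actually emits), the first function below returns through a caller-supplied pointer, like sret, and the second returns the same 8 bytes as a single integer, like a coerced return. `struct pair` and the function names are invented for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct pair { int32_t a, b; };

/* sret-like: the result must live in memory owned by the caller. */
static void make_pair_sret(struct pair *out, int32_t a, int32_t b)
{
    assert(out != NULL);
    out->a = a;
    out->b = b;
}

/* Coerced-like: the same 8 bytes travel as one integer value, so no
 * pointer appears in the signature. */
static uint64_t make_pair_coerced(int32_t a, int32_t b)
{
    struct pair p = { a, b };
    uint64_t bits;

    assert(sizeof p == sizeof bits);
    memcpy(&bits, &p, sizeof bits);
    return bits;
}

/* The caller turns the integer back into the struct. */
static struct pair pair_from_bits(uint64_t bits)
{
    struct pair p;

    memcpy(&p, &bits, sizeof p);
    return p;
}
```

In the second form nothing forces the value into memory, which is the property that lets later passes remove the loads and stores.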

There is a disadvantage: suppose that a compiler for another language emits
LLVM IR and wants to return a struct to C, or get a returned struct from C.  If
such a compiler uses a sret pointer, but Clang doesn't use a sret pointer, then
the other compiler is not compatible with Clang.
Quuxplusone commented 4 years ago
(In reply to Mark Millard from comment #10)
> Yea. Right now trying to use FreeBSD's headers and its library libc++
> code via gcc9 (not using libstdc++) can run into the ABI issue. That
> is how I discovered the issue in the first place. Using
> std::chrono::steady_clock::now() ends up using a 8 byte struct/class
> returned via a function.

I have the opposite situation in OpenBSD macppc, where clang 8.0.1 and gcc
8.3.0 both use the libstdc++ from gcc 8.3.0. This clang has my patch:
https://reviews.llvm.org/D73290

In this libstdc++, std::chrono::steady_clock::now() also returns an 8 byte
struct/class.  The good news is that now() appears to work correctly when I
build the caller with the patched clang.  The bad news is that clang is slower
than gcc (on my iMac G3, 1x 400 MHz PowerPC 750); both compilers seem to take
too long to compile the C++ templates in the headers.

$ cat chr.cpp
#include <chrono>
#include <iostream>
namespace chr = std::chrono;

template <class C, class D>
static chr::milliseconds
ms(chr::time_point<C, D> tp)
{
    auto dur = tp.time_since_epoch();
    return chr::duration_cast<chr::milliseconds>(dur);
}

int
main()
{
    auto c = chr::system_clock::now();
    std::cout << "system clock c = " << ms(c).count() << " ms\n";
    std::cout << "sizeof(c) = " << sizeof(c) << "\n";

    auto m = chr::steady_clock::now();
    std::cout << "steady clock m = " << ms(m).count() << " ms\n";
    std::cout << "sizeof(m) = " << sizeof(m) << "\n";
}
$ ... (trying to warm up the cache)
$ time /usr/local/bin/eg++ -O2 -o chr chr.cpp
    0m07.90s real     0m07.08s user     0m00.69s system
$ time /usr/local/bin/clang++ -O2 -o chr chr.cpp
    0m10.02s real     0m08.71s user     0m01.09s system
$ ./chr && sleep 3 && ./chr
system clock c = 1580956774962 ms
sizeof(c) = 8
steady clock m = 7444322 ms
sizeof(m) = 8
system clock c = 1580956778042 ms
sizeof(c) = 8
steady clock m = 7447402 ms
sizeof(m) = 8
Quuxplusone commented 4 years ago

Is this resolved now that the patch has landed? Can this be closed?

Quuxplusone commented 4 years ago

The diff is in git master. I now set this bug to FIXED.

(I can't build clang from git master right now. I get a linker error: I suspect that recent growth in SemaExpr.cpp triggered a bug in my build tools on OpenBSD/amd64, so I will be looking for that bug. I have been using the diff on an older clang that I can build. clang/test/CodeGen/ppc32-and-aix-struct-return.c would fail if the bug wasn't fixed.)