LDC does not compile with SystemZ host

Geod24 commented 4 years ago

While trying to compile LDC v1.19.0 (+ patches from stable) for s390x-musl, I hit this assert.

The package definition can be found here: https://gitlab.alpinelinux.org/alpine/aports/merge_requests/2873

kinke commented 4 years ago

That's probably just the first of many hurdles (and trivial to fix once we know how to detect such hosts via C++ predefines and which long double format the architecture uses). Does gdc already properly support SystemZ? I'd have guessed that druntime support etc. is still severely lacking.

Geod24 commented 4 years ago

> Does gdc already properly support SystemZ? I'd have guessed that druntime support etc. is still severely lacking.

Surprisingly, it isn't lacking. I had to fiddle with a couple of things, but the druntime support is already there (see e.g. https://github.com/dlang/druntime/pull/2357). Dub compiles and passes dub test, for example: https://gitlab.alpinelinux.org/alpine/aports/merge_requests/2484

Geod24 commented 4 years ago

Actually I'm seeing failures on a bunch of platforms.

ARMv7:

LLVM ERROR: Cannot select: 0x41b3f28: f64 = bitcast 0x41b4120
  0x41b4120: i64 = build_pair 0x41b3af0, 0x41b4168
    0x41b3af0: i32,ch = CopyFromReg 0x4154bcc, Register:i32 %15
      0x41b4708: i32 = Register %15
    0x41b4168: i32,ch = CopyFromReg 0x41b3af0:1, Register:i32 %16
      0x41b4828: i32 = Register %16
In function: _memset80
make[2]: *** [runtime/CMakeFiles/druntime-ldc.dir/build.make:335: runtime/objects/core/atomic.o] Error 1
make[2]: Leaving directory '/builds/Geod24/aports/testing/ldc/src/ldc-1.19.0-src/stage1'
make[1]: *** [CMakeFiles/Makefile2:856: runtime/CMakeFiles/druntime-ldc.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....
LLVM ERROR: Cannot select: 0x3ed0f78: f64 = bitcast 0x3ed1170
  0x3ed1170: i64 = build_pair 0x3ed0b40, 0x3ed11b8
    0x3ed0b40: i32,ch = CopyFromReg 0x3eb132c, Register:i32 %15
      0x3ed1758: i32 = Register %15
    0x3ed11b8: i32,ch = CopyFromReg 0x3ed0b40:1, Register:i32 %16
      0x3ed1878: i32 = Register %16
In function: _memset80
make[2]: *** [runtime/CMakeFiles/druntime-ldc-shared.dir/build.make:335: runtime/objects-shared/core/atomic.o] Error 1

AArch64:

std/math.d(4891): Error: static assert:  `infL <= 1.18973e+4932L` is false
std/math.d(4891): Error: static assert:  `infL <= 1.18973e+4932L` is false

(https://github.com/ldc-developers/phobos/blob/26d14c1a292267a32ce64fa7f219acc3d3cca274/std/math.d#L4891)

Note that I only compiled LDC on those platforms to see if the GDC porting I did was correct. Since DUB compiles and runs, I'm inclined to think so, but I could be wrong.

If you are interested in fixing it, feel free to message me directly on Slack. I don't personally need it, but I'm happy to offer assistance if you want to get it fixed. I'd also like to eventually move the LDC package on Alpine to community (currently it's in testing, which is not enabled by default).

kinke commented 4 years ago

Oh, great. LDC probably lacks the required predefined versions (S390, SystemZ); threads/fibers in druntime probably also need some assembly parts (e.g., https://github.com/dlang/druntime/blob/aff248cb8917767b5fe1c99464b0aa02a174c8d1/src/core/thread/osthread.d#L2550).

Getting dub to compile is simpler than getting LDC to compile, because LDC requires C++ interop (and thus a halfway correct C++ ABI). While ABI details are apparently not much of a problem for gdc, getting LDC to correctly compile itself on SystemZ probably requires more work.

kinke commented 4 years ago

Wrt. ARMv7 and AArch64, these ports aren't complete yet. As druntime compiles for Android/ARMv7 though (covered by LDC CI), I suspect you might only need a -mcpu=cortex-a8 to fix the core.atomic compile error (no 64-bit CAS for default CPU?).

AArch64 compiles fine for our CI too, incl. that CTFE test. As it's CTFE, the D host compiler might play a role. Our CI uses LDC as host compiler; the used OS is Ubuntu 16.04 (=> glibc, not musl; an imprecise C sqrtl could be a potential culprit too).

kinke commented 4 years ago

> If you are interested in fixing it, feel free to message me directly on Slack. I don't personally need it, but I'm happy to offer assistance if you want to get it fixed.

No personal interest either; I consider 32-bit ARM as legacy target, and I don't have any AArch64 device for testing (and qemu is IMO too slow).

> I'd also like to eventually move the LDC package on Alpine to community (currently it's in testing, which is not enabled by default).

Thx for that.

Cogitri commented 4 years ago

> AArch64 compiles fine for our CI too, incl. that CTFE test. As it's CTFE, the D host compiler might play a role. Our CI uses LDC as host compiler; the used OS is Ubuntu 16.04 (=> glibc, not musl; an imprecise C sqrtl could be a potential culprit too).

So I just built LDC on AArch64 Alpine Linux with that static assert disabled, then tried to build LDC itself with the LDC I had just built, and it still hits the static assert, so I doubt that this is down to GDC. Is there something I can do to help debug and fix this?

kinke commented 4 years ago

Sure; I still suspect musl's sqrtl. You could check and see if core.math.sqrt(real.min_normal) already returns the unexpected infinity.

Cogitri commented 4 years ago

This program:

import std.stdio;
import core.math;

void main ()
{
        writeln("ret: ", sqrtl(real.min_normal));
}

returns:

cogitri-edge-aarch64:~$ ./main
ret: 0

Cogitri commented 4 years ago

Ah, seems like that's only the case with ldc, with gdc it returns:

ret: 1.8336e-2466

I don't really get why building ldc with ldmd doesn't work then, though.

kinke commented 4 years ago

Try something like this, printing real.min_normal too, and checking whether the used LLVM intrinsic does not boil down to sqrtl:

import core.stdc.math, core.stdc.stdio;

void main()
{
    printf("sqrt(%Lg) = %Lg\n", real.min_normal, sqrtl(real.min_normal));
    version (LDC)
    {
        import ldc.intrinsics;
        printf("llvm_sqrt = %Lg\n", llvm_sqrt(real.min_normal));
    }
}

The generated asm (without Phobos bloat) is tiny and can be easily compared across LDC and GDC.

Cogitri commented 4 years ago

LDC:

sqrt(3.3621e-4932) = 0
llvm_sqrt = 0

GDC: sqrt(3.3621e-4932) = 1.8336e-2466

kinke commented 4 years ago

%La would produce exact hex output, just to rule out tiny differences in real.min_normal. The GDC you used links against the same musl libc, right? Try compiling the object file with LDC (-betterC, and extern(C) int main() accordingly), but link with GDC, just in case GDC pulls in different or additional C libs.

Cogitri commented 4 years ago

> The used GDC uses the same musl libc, right?

Yes, in both cases the system musl libc (1.1.24) is used

LDC normal:

sqrt(0x1p-16382) = 0x0p+0
llvm_sqrt = 0x0p+0

LDC betterC w/ GDC link:

sqrt(0x1p-16382) = 0x0p+0
llvm_sqrt = 0x0p+0

I compiled that with:

ldc2 -betterC print.d --output-o
gdc -o print print.o

kinke commented 4 years ago

Hmm, then the answer must be hidden in the asm; try comparing the betterC versions of LDC (-output-s) and GDC (don't know if it has such a switch, otherwise use objdump or so for disassembling).

Cogitri commented 4 years ago

GDC asm:

https://gist.github.com/Cogitri/db64c18c231c446e26417408908b961a

LDC asm:

https://gist.github.com/Cogitri/89ace38ec5ff5907f61966a63ddae03c

kinke commented 4 years ago

Okay, so GDC doesn't actually call sqrt[l] anywhere, possibly enforcing constant propagation here...

Cogitri commented 4 years ago

Yup, but as mentioned even when compiling with ldmd the static assert is triggered :/

kinke commented 4 years ago

LDC uses the host D compiler's core.math.sqrt for CTFE. With an LDC host compiler, that function uses the LLVM intrinsic, which boils down to C sqrtl in this case too. So the failing assertion makes perfect sense and does indeed seem to be due to musl's sqrtl returning 0 for real.min_normal.

Edit: If that's the correct file, there's a FIXME in musl src wrt. this, and it's actually just using double-precision: https://git.musl-libc.org/cgit/musl/tree/src/math/sqrtl.c (there's no src/math/aarch64/sqrtl.*, unlike for e.g. s390x and x86_64)

Cogitri commented 4 years ago

Oh yes, seems like that's it! I guess I'll just have to disable that static assert on aarch64 & s390x until the musl folks fix that.

Cogitri commented 4 years ago

So it does compile with that disabled, but it seems like there's (way) more stuff wrong. From the LDC testsuite: 51% tests passed, 837 tests failed out of 1692. Nearly all of these segfault. The phobos2-test-runner-debug has the following stacktrace:

#0  _D2gc9pooltable__T9PoolTableTSQBc4impl12conservativeQBy4PoolZQBr6lengthMxFNaNbNdNiNfZm (this=<error reading variable: Cannot access memory at address 0x2f7fbddb0>) at pooltable.d:56
#1  0x0000aaaaae559eb0 in _D2gc4impl12conservativeQw3Gcx6npoolsMxFNaNbNdZm (this=<optimized out>) at gc.d:1236
#2  0x0000aaaaae55c058 in _D2gc4impl12conservativeQw3Gcx8bigAllocMFNbmKmkxC8TypeInfoZ8tryAllocMFNbZb () at gc.d:1721
#3  0x0000aaaaae55b8d0 in _D2gc4impl12conservativeQw3Gcx8bigAllocMFNbmKmkxC8TypeInfoZPv (this=<error reading variable: Cannot access memory at address 0x2f7fbdd08>, size=2932915680, alloc_size=@0xfffffffff3b0: 0, 
    bits=2935369440, ti=0xaaaaaef6cfc0) at gc.d:1743
#4  0x0000aaaaae556328 in _D2gc4impl12conservativeQw3Gcx5allocMFNbmKmkxC8TypeInfoZPv (this=<error reading variable: Cannot access memory at address 0x2f7fbdd08>, size=2932915680, alloc_size=@0xfffffffff3b0: 0, 
    bits=2935369440, ti=0xaaaaaef6cfc0) at gc.d:1628
#5  0x0000aaaaae556228 in _D2gc4impl12conservativeQw14ConservativeGC12mallocNoSyncMFNbmkKmxC8TypeInfoZPv (this=0xfffffffff4d8, size=2932915680, bits=2935369440, alloc_size=@0xfffffffff3b0: 0, ti=0xaaaaaef6cfc0)
    at gc.d:389
#6  0x0000aaaaae55615c in _D2gc4impl12conservativeQw14ConservativeGC__T9runLockedS_DQCeQCeQCcQCnQBs12mallocNoSyncMFNbmkKmxC8TypeInfoZPvS_DQEgQEgQEeQEp10mallocTimelS_DQFiQFiQFgQFr10numMallocslTmTkTmTxQCzZQFcMFNbKmKkKmKxQDsZQDl (this=0xfffffffff4d8, _param_0=@0xfffffffff3d0: 2932915680, _param_1=@0xfffffffff3cc: 2935369440, _param_2=@0xfffffffff3b0: 0, _param_3=@0xfffffffff3c0: 0xaaaaaef6cfc0) at gc.d:254
#7  0x0000aaaaae5563f4 in _D2gc4impl12conservativeQw14ConservativeGC6qallocMFNbmkxC8TypeInfoZS4core6memory8BlkInfo_ (this=0xfffffffff4d8, size=2932915680, bits=2935369440, ti=0xaaaaaef6cfc0) at gc.d:417
#8  0x0000aaaaae522dc0 in gc_qalloc (sz=10, ba=2932915680, ti=0xaaaaaef62ee0 <classref>) at proxy.d:176
#9  0x0000aaaaae565608 in _D2gc4impl5protoQo7ProtoGC6qallocMFNbmkxC8TypeInfoZS4core6memory8BlkInfo_ (this=0xfffffffff4d8, size=10, bits=2932915680, ti=0xaaaaaef62ee0 <classref>) at gc.d:111
#10 0x0000aaaaae522dc0 in gc_qalloc (sz=8, ba=10, ti=0xaaaaaed0bde0 <initializer for TypeInfo_a>) at proxy.d:176
#11 0x0000aaaaae516ec0 in _D4core6memory2GC6qallocFNaNbmkxC8TypeInfoZSQBqQBo8BlkInfo_ (sz=8, ba=10, ti=0xaaaaaed0bde0 <initializer for TypeInfo_a>) at memory.d:427
#12 0x0000aaaaae5355a0 in _D2rt8lifetime12__arrayAllocFNaNbmxC8TypeInfoxQlZS4core6memory8BlkInfo_ (arrsize=7, ti=0xaaaaaed0b500 <initializer for TypeInfo_Aya>, tinext=0xaaaaaed0bde0 <initializer for TypeInfo_a>)
    at lifetime.d:494
#13 0x0000aaaaae5391dc in _d_arrayappendcTX (ti=0xaaaaaed0b500 <initializer for TypeInfo_Aya>, px=..., n=2) at lifetime.d:2133
#14 0x0000aaaaae538cd0 in _d_arrayappendT (ti=0xaaaaaed0b500 <initializer for TypeInfo_Aya>, x=..., y=...) at lifetime.d:1953
#15 0x0000aaaaac7580b0 in test_runner._sharedStaticCtor_L80_C1() () at test_runner.d:92
#16 0x0000aaaaae53d648 in _D2rt5minfo__T14runModuleFuncsSQBdQBd11ModuleGroup8runCtorsMFZ9__lambda2ZQChMFAxPyS6object10ModuleInfoZv (modules=...) at minfo.d:858
#17 0x0000aaaaae53d520 in rt.minfo.ModuleGroup.runCtors() (this=0xff) at minfo.d:728
#18 0x0000aaaaae53dc14 in _D2rt5minfo13rt_moduleCtorUZ14__foreachbody1MFKSQBu19sections_elf_shared3DSOZi (sg=...) at minfo.d:796
#19 0x0000aaaaae540060 in _D2rt19sections_elf_shared3DSO7opApplyFMDFKSQBqQBqQyZiZi (dg=...) at sections_elf_shared.d:84
#20 0x0000aaaaae53dbdc in rt_moduleCtor () at minfo.d:793
#21 0x0000aaaaae531088 in rt_init () at dmain2.d:212
#22 0x0000aaaaae5319b4 in _D2rt6dmain212_d_run_main2UAAamPUQgZiZ6runAllMFZv () at dmain2.d:572
#23 0x0000aaaaae531928 in _D2rt6dmain212_d_run_main2UAAamPUQgZiZ7tryExecMFMDFZvZv (dg=...) at dmain2.d:548
#24 0x0000aaaaae531798 in _d_run_main2 (args=..., totalArgsLength=86, mainFunc=<optimized out>) at dmain2.d:607
#25 0x0000aaaaae531494 in _d_run_main (argc=1, argv=0xfffffffffbe8, mainFunc=0xaaaaac7580c8 <D main>) at dmain2.d:392
#26 0x0000aaaaac758108 in main (argc=1, argv=0xfffffffffbe8) at /home/rasmus/aports/community/ldc/src/ldc-1.20.1-src/runtime/druntime/src/core/internal/entrypoint.d:35
#27 0x0000fffff7f70778 in libc_start_main_stage2 (main=0xaaaaac7580e0 <main>, argc=1, argv=0xfffffffffbe8) at src/env/__libc_start_main.c:94
#28 0x0000000000000000 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

kinke commented 4 years ago

That's https://github.com/ldc-developers/ldc/issues/3329 and is fixed in the latest beta. With the apparently non-existent quadruple precision math support of musl for AArch64, there'll be more Phobos failures than with glibc (see #2153). This issue is about SystemZ, not even Alpine/musl per se, so this doesn't belong here.

kinke commented 4 years ago

The original issue was fixed by the PR above.

ldc-developers / ldc

LDC does not compile with SystemZ host #3270