llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org

ContinuousSyncMode/runtime-counter-relocation.c etc. FAIL on 32-bit SPARC #101667

Open rorth opened 3 months ago

rorth commented 3 months ago

Two profile tests FAIL on 32-bit Linux/sparc64:

  Profile-sparc :: ContinuousSyncMode/runtime-counter-relocation.c
  Profile-sparc :: ContinuousSyncMode/set-file-object.c

while the same tests work just fine on 64-bit. It turns out the same failures occur on Solaris/sparcv9: the tests don't seem to be actually Linux-specific. In fact, on Solaris/amd64 they PASS for both i386 and x86_64.

Here are the actual failures:

  Profile-sparc :: ContinuousSyncMode/runtime-counter-relocation.c

compiler-rt/test/profile/ContinuousSyncMode/runtime-counter-relocation.c:14:23: error: CHECK-COUNTS-NEXT: expected string not found in input
// CHECK-COUNTS-NEXT: Function count: 1
                      ^
<stdin>:4:13: note: scanning from here
 Counters: 2
            ^
<stdin>:5:2: note: possible intended match here
 Function count: 0
 ^

  output is

Counters:
  main:
    Hash: 0x000000000a498458
    Counters: 2
    Function count: 0
    Block counts: [0]
Instrumentation level: Front-end
Functions shown: 1
Total functions: 1
Maximum function count: 0
Maximum internal block count: 0

and

  Profile-sparc :: ContinuousSyncMode/set-file-object.c

compiler-rt/test/profile/ContinuousSyncMode/set-file-object.c:32:11: error: MERGE: expected string not found in input
// MERGE: Function count: 32
          ^
<stdin>:9:13: note: scanning from here
 Counters: 1
            ^
<stdin>:10:2: note: possible intended match here
 Function count: 0
 ^

  output is

Counters:
  main:
    Hash: 0x275ce4c29c65beba
    Counters: 11
    Function count: 1
    Block counts: [0, 0, 0, 0, 0, 0, 0, 0, 1, 0]
  coverage_test:
    Hash: 0x0000000000000018
    Counters: 1
    Function count: 0
    Block counts: []
Instrumentation level: Front-end
Functions shown: 2
Total functions: 2
Maximum function count: 1
Maximum internal block count: 1

Maybe one of the developers (@ZequanWu, @DavidSpickett, @petrhosek, @vedantk) can shed some light on where to start looking? These are among the last few failing tests on Linux/sparc64.

DavidSpickett commented 3 months ago

I don't know anything about the implementation but I checked and the tests are passing on Linaro's ARM and AArch64 build bots. Both are docker containers on an AArch64 host, so ARM is 32 bit with a 64 bit kernel, in case that's relevant (we don't have true 32 bit build bots anymore).

ZequanWu commented 3 months ago

Looks like those tests fail with -runtime-counter-relocation=true + continuous mode. Do they always fail on 32-bit sparc? If they only started failing recently, maybe you can bisect to find the commit that caused it.

rorth commented 3 months ago

It seems so: as I said, the failure isn't OS-specific (happening on both Linux and Solaris). The oldest logs I still have around are from Linux/sparc64 LLVM 14.0.0 release builds, so it's certainly not a recent change.

I mean to try the test with Solaris/sparcv9 builds (which I do have around back to LLVM 6), but I noticed a warning from the Solaris linker that may provide a clue to what's wrong:

ld: warning: symbol '__llvm_profile_counter_bias' has differing sizes:
    (file /var/tmp/runtime-counter-relocation-17ff25.o value=0x8; file /var/llvm/local-sparcv9-release-stage2-A-flang-gcc14/tools/clang/stage2-bins/lib/clang/20/lib/sunos/libclang_rt.profile-sparc.a(InstrProfilingFile.c.o) value=0x4);
    /var/tmp/runtime-counter-relocation-17ff25.o definition taken

In fact, the definition from compiler-rt/lib/profile/InstrProfilingFile.c

COMPILER_RT_VISIBILITY extern intptr_t INSTR_PROF_PROFILE_COUNTER_BIAS_VAR;

is for a 32-bit variable on 32-bit sparc, while the other (compiler-generated) one seems to be an int64_t/i64 according to the comment in llvm/lib/Transforms/Instrumentation/InstrProfiling.cpp. This mismatch might be the reason for the bug, although I don't yet see how the i64 comes in here. If that's the root of the problem, things might still work by accident on little-endian targets (e.g. x86), but fail on big-endian ones (sparc).
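To illustrate why such a width mismatch could go unnoticed on little-endian targets but break on big-endian ones, here is a minimal sketch in portable C. It is only a model, not the actual compiler-rt or instrumentation code: the helper names (store32, load64, bias_seen_by_instrumented_code) are hypothetical, and it assumes the 8-byte slot starts out zeroed and that the bias is non-negative. The runtime side stores 4 bytes at the start of the slot, and the instrumented code loads all 8 bytes.

```c
#include <stdint.h>

/* Store a 32-bit value into buf using the requested byte order
 * (models the runtime writing an intptr_t on a 32-bit target). */
static void store32(unsigned char *buf, uint32_t v, int big_endian) {
    for (int i = 0; i < 4; i++) {
        int shift = big_endian ? (24 - 8 * i) : (8 * i);
        buf[i] = (unsigned char)(v >> shift);
    }
}

/* Load a 64-bit value from buf using the requested byte order
 * (models the instrumented code reading the bias as an i64). */
static uint64_t load64(const unsigned char *buf, int big_endian) {
    uint64_t v = 0;
    for (int i = 0; i < 8; i++) {
        int shift = big_endian ? (56 - 8 * i) : (8 * i);
        v |= (uint64_t)buf[i] << shift;
    }
    return v;
}

/* Hypothetical model of the mismatch: a 4-byte write into the start of a
 * zeroed 8-byte slot, followed by an 8-byte read of the same slot. */
static uint64_t bias_seen_by_instrumented_code(uint32_t bias, int big_endian) {
    unsigned char slot[8] = {0};   /* assumption: upper bytes are zero */
    store32(slot, bias, big_endian);
    return load64(slot, big_endian);
}
```

In this model, a bias of 0x1000 reads back unchanged with little-endian byte order, because the 4 bytes written are the low-order half of the 64-bit value. With big-endian byte order the same 4 bytes land in the high-order half, so the reader sees the bias shifted left by 32 bits.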

rorth commented 3 months ago

I've now checked older clang versions on Solaris/sparc: the failure occurs all the way back to clang-10, while clang-9 doesn't recognize -runtime-counter-relocation=true. So this bug seems to be present from day one.

rorth commented 3 months ago

A quick check changing compiler-rt/lib/profile/InstrProfilingFile.c (__llvm_profile_counter_bias) from intptr_t to int64_t to match llvm/lib/Transforms/Instrumentation/InstrProfiling.cpp (InstrLowerer::getCounterAddress) does indeed fix the test. I'll now run a full build/test on several targets to verify.
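The idea behind that change can be sketched as follows. The typedef names and the widths_agree helper are hypothetical stand-ins for illustration, not the real compiler-rt declarations; the point is only that a pointer-sized type changes width with the target, while a fixed 64-bit type always matches an i64 emitted by the compiler.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins, not the actual compiler-rt declarations. */
typedef intptr_t bias_before_t; /* pointer-sized: typically 4 bytes on 32-bit targets */
typedef int64_t  bias_after_t;  /* fixed width: 8 bytes on every target */

/* Compile-time check that the runtime-side type has the width of an i64. */
_Static_assert(sizeof(bias_after_t) == 8,
               "bias variable must be as wide as the compiler-emitted i64");

/* Returns nonzero when both sides agree on the size of the shared symbol. */
static int widths_agree(size_t runtime_size, size_t compiler_size) {
    return runtime_size == compiler_size;
}
```

With the fixed-width type, widths_agree(sizeof(bias_after_t), sizeof(int64_t)) holds regardless of pointer size, whereas a 4-byte runtime definition paired with an 8-byte compiler definition does not.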

rorth commented 3 months ago

Btw., __llvm_profile_bitmap_bias has exactly the same issue: clang emits it as int64_t while compiler-rt expects/uses an intptr_t.