Open rorth opened 3 months ago
I don't know anything about the implementation but I checked and the tests are passing on Linaro's ARM and AArch64 build bots. Both are docker containers on an AArch64 host, so ARM is 32 bit with a 64 bit kernel, in case that's relevant (we don't have true 32 bit build bots anymore).
Looks like those tests failed on -runtime-counter-relocation=true + continuous mode. Do they always fail on 32-bit sparc? If they only started failing recently, maybe you can bisect to find the commit that caused it.
It seems so: as I said, the failure isn't OS-specific (it happens on both Linux and Solaris). The oldest logs I still have around are from Linux/sparc64 LLVM 14.0.0 release builds, so it's certainly not a recent change.
I mean to try the test with Solaris/sparcv9 builds (which I do have around back to LLVM 6), but I noticed a warning from the Solaris linker that may provide a clue to what's wrong:
ld: warning: symbol '__llvm_profile_counter_bias' has differing sizes:
(file /var/tmp/runtime-counter-relocation-17ff25.o value=0x8; file /var/llvm/local-sparcv9-release-stage2-A-flang-gcc14/tools/clang/stage2-bins/lib/clang/20/lib/sunos/libclang_rt.profile-sparc.a(InstrProfilingFile.c.o) value=0x4);
/var/tmp/runtime-counter-relocation-17ff25.o definition taken
In fact, the definition from compiler-rt/lib/profile/InstrProfilingFile.c
COMPILER_RT_VISIBILITY extern intptr_t INSTR_PROF_PROFILE_COUNTER_BIAS_VAR;
is for a 32-bit variable on 32-bit sparc, while the other (compiler-generated) one seems to be an int64_t/i64 according to the comment in llvm/lib/Transforms/Instrumentation/InstrProfiling.cpp. This mismatch might be the reason for the bug, although I don't yet see how the i64 comes in here. If that's the root of the problem, things might still work by accident on little-endian targets (e.g. x86), but fail on big-endian ones (sparc).
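To illustrate why such a size mismatch could go unnoticed on little-endian targets, here is a minimal sketch. It assumes (this is not confirmed above) that the runtime stores the bias through its 4-byte intptr_t view while the instrumented code loads all 8 bytes of the slot. The byte order is simulated explicitly, so the result does not depend on the host, and all function names are hypothetical.

```c
#include <stdint.h>

/* Store a 32-bit value into the first 4 bytes of the slot, in the chosen
 * byte order. This models a writer that believes the variable is 4 bytes. */
static void store_u32(unsigned char *slot, uint32_t v, int big_endian) {
    for (int i = 0; i < 4; i++) {
        int shift = big_endian ? 8 * (3 - i) : 8 * i;
        slot[i] = (unsigned char)(v >> shift);
    }
}

/* Load all 8 bytes of the slot as a 64-bit value, in the chosen byte order.
 * This models a reader that believes the variable is 8 bytes. */
static uint64_t load_u64(const unsigned char *slot, int big_endian) {
    uint64_t v = 0;
    for (int i = 0; i < 8; i++) {
        int shift = big_endian ? 8 * (7 - i) : 8 * i;
        v |= (uint64_t)slot[i] << shift;
    }
    return v;
}

/* What the 64-bit reader sees after the 32-bit writer stored `bias`
 * into a zero-initialized 8-byte slot. */
uint64_t bias_read_as_i64(uint32_t bias, int big_endian) {
    unsigned char slot[8] = {0};
    store_u32(slot, bias, big_endian);
    return load_u64(slot, big_endian);
}
```

With little-endian layout the 4 stored bytes are the low-order half of the 8-byte value, so the reader gets the bias back unchanged. With big-endian layout they are the high-order half, so the reader sees the bias shifted left by 32 bits, and its low 32 bits are zero.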
I've now checked older clang versions on Solaris/sparc: the failure occurs all the way back to clang-10, while clang-9 doesn't recognize -runtime-counter-relocation=true. So this bug seems to have been present from day one.
A quick check changing compiler-rt/lib/profile/InstrProfilingFile.c (__llvm_profile_counter_bias) from intptr_t to int64_t to match llvm/lib/Transforms/Instrumentation/InstrProfiling.cpp (InstrLowerer::getCounterAddress) does indeed fix the test. I'll now run a full build/test on several targets to verify.
Btw., __llvm_profile_bitmap_bias has exactly the same issue: clang emits it as int64_t while compiler-rt expects/uses an intptr_t.
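As a rough sketch of the idea behind the quick check (not the actual patch), the runtime-side variables can be given a fixed 64-bit type so that both sides agree on the size regardless of pointer width. The demo_* names below are hypothetical stand-ins for __llvm_profile_counter_bias and __llvm_profile_bitmap_bias, and the setter is only an assumption about how a pointer-width difference would be widened.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the two bias variables, declared with the
 * same 64-bit width the compiler-generated side is said to use. */
int64_t demo_counter_bias = 0;
int64_t demo_bitmap_bias = 0;

/* Fail the build if the width ever drifts back to pointer size (C11). */
_Static_assert(sizeof(demo_counter_bias) == 8, "counter bias must be 64-bit");
_Static_assert(sizeof(demo_bitmap_bias) == 8, "bitmap bias must be 64-bit");

/* A bias computed in pointer width is widened explicitly, with sign
 * extension, before being stored. */
void demo_set_counter_bias(intptr_t delta) {
    demo_counter_bias = (int64_t)delta;
}
```

On a 32-bit target intptr_t is 4 bytes, so the explicit cast is what makes the store cover the whole 8-byte variable.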
Two profile tests FAIL on 32-bit Linux/sparc64:
while the same tests work just fine on 64-bit. It turns out the same failures occur on Solaris/sparcv9: the tests don't actually seem to be Linux-specific. In fact, on Solaris/amd64 they PASS for both i386 and x86_64.
Here are the actual failures:
and
Maybe one of the developers (@ZequanWu, @DavidSpickett, @petrhosek, @vedantk) can shed some light on where to start looking? These are among the very few last failing tests on Linux/sparc64.