Open ms178 opened 1 year ago
Is it failing with BOLT-instrumented Clang?
@aaupov I am not sure what you mean. The error occurs in the stage-3 phase, while instrumenting Clang for later consumption by BOLT. The script uses the LTO+PGO'd stage-2 compiler for that instrumentation step, and that compiler had not been BOLTed yet. However, the error happens within the training phase of that script, when compiling some LLVM projects as training data, so BOLT had already run once.
For more details on my llvm-bolt-scripts, take a look at them in my repo.
To save time, I usually build just the stage-1 compiler and the stage-2 compiler with an older PGO file, as seen in the build_stage2-prof-use-lto.sh file. If I understand the stack trace correctly (I am not a programmer), it points to a problem with LLD or LTO. However, the issue only gets exposed when I use the Polly flags included in the file from the top post and also train on openmp. With a reduced set of Polly flags (without all of the vectorizer parameters) and without openmp, that step went fine (see the attached file here). build_stage3-bolt-without-sampling.sh.FINE.txt
can you try without profile data and see if it still crashes?
@llvm/issue-subscribers-polly
One additional note, normally I don't use Polly in the script and could not reproduce the issue without the Polly flags.
Could Polly be miscompiling the program? cc @jdoerfert @tobiasgrosser @Meinersbur
can you try without profile data and see if it still crashes?
You seem to confuse this issue with the other one discussed today (both might share the same root cause, though). For clarification, please note that I have two completely different LLVM toolchains in my toolchain-experimental repo that are used and optimized for different purposes. The (lib32-)llvm-git package(s) serve as a system compiler replacement, whereas the llvm-bolt-scripts generate a super-optimized Clang that I use for compiling specific packages only and that gets built in a completely different way.
The issue reported here was seen with my llvm-bolt-scripts, while the other issue discussed today was seen with the llvm-git package. In the script discussed here, I don't use any profile data in that particular stage; however, the stage itself is used to generate a profile for later consumption by BOLT.
While running a build that instruments Clang for BOLT, I came across this build failure in openmp on LLVM/Clang-17 (7c2604ca196c3ba0247509c0fde350e23f0cccb0) when using Polly.
This is the script for that specific stage, which reproduces the issue: build_stage3-bolt-without-sampling.bash.txt