llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
26.83k stars 11k forks source link

-opt-bisect-limit output is written out of order when performing ThinLTO #38527

Open gregbedwell opened 5 years ago

gregbedwell commented 5 years ago
Bugzilla Link 39179
Version trunk
OS Windows NT
CC @andykaylor,@dwblaikie,@Theodor,@pogo59,@pcc,@yuanfang-chen

Extended Description

Using r343783. Potentially related to the recent llvm-dev discussions ("OptBisect implementation for new pass manager") which touch on the interaction between OptBisect and parallel compilation.

The symptom I've observed here is that the output text is all mixed up. I'm not sure if that's the extent of the issue or whether the passes associated with each counter value differ too (I've not managed to observe that happening, but the messed up output text makes it hard to figure that out). Obvious workaround is to explicitly specify -Wl,-thinlto-jobs=1 at the same time. If the resolution of this bug is "this is as expected, just specify -Wl,-thinlto-jobs=1" then that's fine, but I wanted to highlight it as it relates to recent discussions.

greg@greg-win10:/mnt/c/tmp/thinlto-bisect$ cat 1.cpp extern int f(int);

int main(int argc, char**) { int result = 0; for (int i = 0; i < argc; ++i) result += f(i); return result; }

greg@greg-win10:/mnt/c/tmp/thinlto-bisect$ cat 2.cpp extern int f(int x) { for (int i = x; x; --i) { if (i & 5) return i; } return x; }

greg@greg-win10:/mnt/c/tmp/thinlto-bisect$ clang++ -c -O2 -flto=thin 1.cpp 2.cpp

greg@greg-win10:/mnt/c/tmp/thinlto-bisect$ clang++ 1.o 2.o -fuse-ld=lld -Wl,-mllvm -Wl,-opt-bisect-limit=1000 BISECT: running pass (1) Dead Global Elimination on module (ld-temp.o) BISECT: running pass (2) Infer set function attributes on module (ld-temp.o) BISECT: running pass (3) Interprocedural Sparse Conditional Constant Propagation on module (ld-temp.o) BISECT: running pass (4) Called Value Propagation on module (ld-temp.o) BISECT: running pass (5) Deduce function attributes on SCC (<>) BISECT: running pass (6) Deduce function attributes in RPO on module (ld-temp.o) BISECT: running pass (7) Global splitter on module (ld-temp.o) BISECT: running pass (8) Whole program devirtualization on module (ld-temp.o) BISECT: running pass (9) Global Variable Optimizer on module (ld-temp.o) BISECT: running pass (10) Merge Duplicate Global Constants on module (ld-temp.o) BISECT: running pass (11) Dead Argument Elimination on module (ld-temp.o) BISECT: running pass (12) Function Integration/Inlining on SCC (<>) BISECT: running pass (13) Remove unused exception handling info on SCC (<>) BISECT: running pass (14) Global Variable Optimizer on module (ld-temp.o) BISECT: running pass (15) Dead Global Elimination on module (ld-temp.o) BISECT: running pass (16) Promote 'by reference' arguments to scalars on SCC (<>) BISECT: running pass (17) Deduce function attributes on SCC (<>) BISECT: running pass (18) Eliminate Available Externally Globals on module (ld-temp.o) BISECT: running pass (19) Dead Global Elimination on module (ld-temp.o) BISECT: running pass (BISECT: 20running pass ) (Whole program devirtualization21 on ) module (2.o)Whole program devirtualization on module (1.o)BISECT: running pass BISECT: (running pass 22() 23Infer set function attributes) on Infer set function attributesmodule (2.o) on module (1.o)BISECT: running pass BISECT: (running pass 24() 25Interprocedural Sparse Conditional Constant Propagation) on Interprocedural Sparse Conditional Constant Propagationmodule (2.o) on module (1.o)BISECT: running pass BISECT: (running pass 26() 27Called Value Propagation) on Called Value Propagationmodule (2.o) on module (1.o) BISECT: BISECT: running pass running pass ((2829) ) Global Variable OptimizerGlobal Variable Optimizer on on module (2.o)module (1.o)

BISECT: BISECT: running pass running pass ((3031) ) Promote Memory to RegisterPromote Memory to Register on on function (_Z1fi)function (main)

BISECT: BISECT: running pass running pass ((3233) ) Dead Argument EliminationPromote Memory to Register on on module (2.o)function (_Z1fi)

BISECT: BISECT: running pass running pass ((3435) ) Combine redundant instructionsDead Argument Elimination on on function (_Z1fi)module (1.o)

BISECT: BISECT: running pass running pass ((3637) ) Simplify the CFGCombine redundant instructions on on function (_Z1fi)function (main)

BISECT: running pass (38) Simplify the CFG on function (main) BISECT: running pass (39) Remove unused exception handling info on SCC (_Z1fi) BISECT: BISECT: running pass running pass ((4041) ) Function Integration/InliningCombine redundant instructions on on SCC (_Z1fi)function (_Z1fi)

BISECT: BISECT: running pass running pass ((4243) ) Deduce function attributesSimplify the CFG on on SCC (_Z1fi)function (_Z1fi)

BISECT: BISECT: running pass running pass ((4445) ) Remove unused exception handling infoSROA on on SCC (_Z1fi)function (_Z1fi)

dwblaikie commented 5 years ago

Prefixing log lines with some identifier of the thread or process might be useful, if that's feasible. Then you can grep/filter the log to get what happened in each backend thread separately.

llvmbot commented 5 years ago

This is going to be true of any option that causes output from the passes after the Thin Link (another example is -pass-remarks), since ThinLTO uses threads to get parallel execution. I'm not really sure what the solution could be, other than having certain options (like maybe -opt-bisect-limit) force the number of ThinLTO backend threads to 1 (which gives you non-interleaved output at the expense of longer LTO link time). Buffering the output has other issues IMO.