-opt-bisect-limit output is written out of order when performing ThinLTO

Quuxplusone commented 6 years ago


Bugzilla Link	PR39179
Status	NEW
Importance	P normal
Reported by	Greg Bedwell (greg.bedwell@sony.com)
Reported on	2018-10-04 10:18:18 -0700
Last modified on	2020-01-27 14:15:35 -0800
Version	trunk
Hardware	PC Windows NT
CC	andrew.kaylor@intel.com, dblaikie@gmail.com, fedor.v.sergeev@gmail.com, llvm-bugs@lists.llvm.org, paul_robinson@playstation.sony.com, peter@pcc.me.uk, tejohnson@google.com, yuanfang.chen@sony.com
Fixed by commit(s)
Attachments
Blocks
Blocked by
See also

Using r343783.  Potentially related to the recent llvm-dev discussions
("OptBisect implementation for new pass manager") which touch on the
interaction between OptBisect and parallel compilation.

The symptom I've observed here is that the output text is all mixed up.  I'm
not sure if that's the extent of the issue or whether the passes associated
with each counter value differ too (I've not managed to observe that happening,
but the messed up output text makes it hard to figure that out).  Obvious
workaround is to explicitly specify -Wl,-thinlto-jobs=1 at the same time.  If
the resolution of this bug is "this is as expected, just specify -Wl,-thinlto-
jobs=1" then that's fine, but I wanted to highlight it as it relates to recent
discussions.

greg@greg-win10:/mnt/c/tmp/thinlto-bisect$ cat 1.cpp
extern int f(int);

int main(int argc, char**) {
  int result = 0;
  for (int i = 0; i < argc; ++i)
    result += f(i);
  return result;
}

greg@greg-win10:/mnt/c/tmp/thinlto-bisect$ cat 2.cpp
extern int f(int x) {
  for (int i = x; x; --i) {
    if (i & 5)
      return i;
  }
  return x;
}

greg@greg-win10:/mnt/c/tmp/thinlto-bisect$ clang++ -c -O2 -flto=thin 1.cpp 2.cpp

greg@greg-win10:/mnt/c/tmp/thinlto-bisect$ clang++ 1.o 2.o -fuse-ld=lld -Wl,-
mllvm -Wl,-opt-bisect-limit=1000
BISECT: running pass (1) Dead Global Elimination on module (ld-temp.o)
BISECT: running pass (2) Infer set function attributes on module (ld-temp.o)
BISECT: running pass (3) Interprocedural Sparse Conditional Constant
Propagation on module (ld-temp.o)
BISECT: running pass (4) Called Value Propagation on module (ld-temp.o)
BISECT: running pass (5) Deduce function attributes on SCC (<<null function>>)
BISECT: running pass (6) Deduce function attributes in RPO on module (ld-temp.o)
BISECT: running pass (7) Global splitter on module (ld-temp.o)
BISECT: running pass (8) Whole program devirtualization on module (ld-temp.o)
BISECT: running pass (9) Global Variable Optimizer on module (ld-temp.o)
BISECT: running pass (10) Merge Duplicate Global Constants on module (ld-temp.o)
BISECT: running pass (11) Dead Argument Elimination on module (ld-temp.o)
BISECT: running pass (12) Function Integration/Inlining on SCC (<<null
function>>)
BISECT: running pass (13) Remove unused exception handling info on SCC (<<null
function>>)
BISECT: running pass (14) Global Variable Optimizer on module (ld-temp.o)
BISECT: running pass (15) Dead Global Elimination on module (ld-temp.o)
BISECT: running pass (16) Promote 'by reference' arguments to scalars on SCC
(<<null function>>)
BISECT: running pass (17) Deduce function attributes on SCC (<<null function>>)
BISECT: running pass (18) Eliminate Available Externally Globals on module (ld-
temp.o)
BISECT: running pass (19) Dead Global Elimination on module (ld-temp.o)
BISECT: running pass (BISECT: 20running pass ) (Whole program
devirtualization21 on ) module (2.o)Whole program devirtualization
 on module (1.o)BISECT:
running pass BISECT: (running pass 22() 23Infer set function attributes)  on
Infer set function attributesmodule (2.o) on
module (1.o)BISECT:
running pass BISECT: (running pass 24() 25Interprocedural Sparse Conditional
Constant Propagation)  on Interprocedural Sparse Conditional Constant
Propagationmodule (2.o) on
module (1.o)BISECT:
running pass BISECT: (running pass 26() 27Called Value Propagation)  on Called
Value Propagationmodule (2.o) on
module (1.o)
BISECT: BISECT: running pass running pass ((2829) ) Global Variable
OptimizerGlobal Variable Optimizer on  on module (2.o)module (1.o)

BISECT: BISECT: running pass running pass ((3031) ) Promote Memory to
RegisterPromote Memory to Register on  on function (_Z1fi)function (main)

BISECT: BISECT: running pass running pass ((3233) ) Dead Argument
EliminationPromote Memory to Register on  on module (2.o)function (_Z1fi)

BISECT: BISECT: running pass running pass ((3435) ) Combine redundant
instructionsDead Argument Elimination on  on function (_Z1fi)module (1.o)

BISECT: BISECT: running pass running pass ((3637) ) Simplify the CFGCombine
redundant instructions on  on function (_Z1fi)function (main)

BISECT: running pass (38) Simplify the CFG on function (main)
BISECT: running pass (39) Remove unused exception handling info on SCC (_Z1fi)
BISECT: BISECT: running pass running pass ((4041) ) Function
Integration/InliningCombine redundant instructions on  on SCC (_Z1fi)function
(_Z1fi)

BISECT: BISECT: running pass running pass ((4243) ) Deduce function
attributesSimplify the CFG on  on SCC (_Z1fi)function (_Z1fi)

BISECT: BISECT: running pass running pass ((4445) ) Remove unused exception
handling infoSROA on  on SCC (_Z1fi)function (_Z1fi)
<snip>

Quuxplusone commented 6 years ago

This is going to be true of any option that causes output from the passes after the Thin Link (another example is -pass-remarks), since ThinLTO uses threads to get parallel execution. I'm not really sure what the solution could be, other than having certain options (like maybe -opt-bisect-limit) force the number of ThinLTO backend threads to 1 (which gives you non-interleaved output at the expense of longer LTO link time). Buffering the output has other issues IMO.

Quuxplusone commented 6 years ago

Prefixing log lines with some identifier of the thread or process might be useful, if that's feasible. Then you can grep/filter the log to get what happened in each backend thread separately.

Quuxplusone / LLVMBugzillaTest

-opt-bisect-limit output is written out of order when performing ThinLTO #38152