Quuxplusone / LLVMBugzillaTest

Slow compilation when compiling a function which contains many always_inline function calls in it #33718

Open Quuxplusone opened 7 years ago

Quuxplusone commented 7 years ago
Bugzilla Link: PR34746
Status: NEW
Importance: P enhancement
Reported by: Teodor Petrov (fuscated@gmail.com)
Reported on: 2017-09-27 08:44:23 -0700
Last modified on: 2017-09-27 10:02:41 -0700
Version: 5.0
Hardware: PC Linux
CC: dblaikie@gmail.com, ditaliano@apple.com, llvm-bugs@lists.llvm.org
Fixed by commit(s):
Attachments: clang_slow_append.cpp (27415 bytes, text/x-c++src)
Blocks:
Blocked by:
See also:

Created attachment 19206
Source file used to reproduce the problem

I've extracted the attached file from our code base. Of the compilers we use
(msvc++, intel and gcc 4.8 on various OSes), Clang is the only one that is slow
on this file.
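
The attachment itself is not reproduced in this report, so the following is only a rough illustration of the shape the title describes: one function that makes many calls to a helper marked always_inline. Every name here (Buffer, append, build_message) is hypothetical, and the loop inside the helper is an assumption rather than something taken from clang_slow_append.cpp.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical fixed-capacity buffer; not taken from the attachment.
struct Buffer {
    char data[256];
    std::size_t size = 0;
};

// Force inlining at every call site (GCC/Clang attribute syntax;
// MSVC would need __forceinline instead).
__attribute__((always_inline)) inline void append(Buffer &buf, const char *text) {
    const std::size_t len = std::strlen(text);
    assert(buf.size + len <= sizeof(buf.data));
    // Because the helper is always inlined, this loop is copied into
    // the caller once per call.
    for (std::size_t i = 0; i < len; ++i) {
        buf.data[buf.size + i] = text[i];
    }
    buf.size += len;
}

// One function containing several forced-inline calls. The report's
// title speaks of "many" such calls; four are shown to keep this short.
inline std::size_t build_message(Buffer &buf) {
    append(buf, "alpha,");
    append(buf, "beta,");
    append(buf, "gamma,");
    append(buf, "delta");
    return buf.size;
}
```

After inlining, build_message contains one copy of the helper body per call, so the optimizer sees a single large function. That is the kind of input the title points at, although whether this small sketch is slow to compile has not been checked.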

To reproduce the bug, run:
clang++ clang_slow_append.cpp -std=c++11 -O3 -ftime-report

The result from the time report is:
===-------------------------------------------------------------------------===
                         Miscellaneous Ungrouped Timers
===-------------------------------------------------------------------------===

   ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
  12.6961 (100.0%)   0.0650 (100.0%)  12.7611 (100.0%)  12.7708 (100.0%)  Code Generation Time
   0.0020 (  0.0%)   0.0000 (  0.0%)   0.0020 (  0.0%)   0.0025 (  0.0%)  LLVM IR Generation Time
  12.6981 (100.0%)   0.0650 (100.0%)  12.7631 (100.0%)  12.7732 (100.0%)  Total

===-------------------------------------------------------------------------===
                              Register Allocation
===-------------------------------------------------------------------------===
  Total Execution Time: 0.0070 seconds (0.0043 wall clock)

   ---User Time---   --User+System--   ---Wall Time---  --- Name ---
   0.0030 ( 42.9%)   0.0030 ( 42.9%)   0.0025 ( 57.7%)  Seed Live Regs
   0.0040 ( 57.1%)   0.0040 ( 57.1%)   0.0018 ( 42.3%)  Evict
   0.0070 (100.0%)   0.0070 (100.0%)   0.0043 (100.0%)  Total

===-------------------------------------------------------------------------===
                      Instruction Selection and Scheduling
===-------------------------------------------------------------------------===
  Total Execution Time: 0.8019 seconds (0.7632 wall clock)

   ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
   0.1580 ( 20.1%)   0.0030 ( 18.8%)   0.1610 ( 20.1%)   0.1549 ( 20.3%)  Instruction Selection
   0.1290 ( 16.4%)   0.0030 ( 18.7%)   0.1320 ( 16.5%)   0.1416 ( 18.5%)  Instruction Scheduling
   0.1170 ( 14.9%)   0.0030 ( 18.7%)   0.1200 ( 15.0%)   0.1224 ( 16.0%)  DAG Combining 1
   0.0850 ( 10.8%)   0.0010 (  6.3%)   0.0860 ( 10.7%)   0.0720 (  9.4%)  DAG Combining 2
   0.0730 (  9.3%)   0.0030 ( 18.7%)   0.0760 (  9.5%)   0.0663 (  8.7%)  Instruction Creation
   0.0770 (  9.8%)   0.0010 (  6.3%)   0.0780 (  9.7%)   0.0662 (  8.7%)  DAG Combining after legalize types
   0.0630 (  8.0%)   0.0010 (  6.3%)   0.0640 (  8.0%)   0.0633 (  8.3%)  DAG Legalization
   0.0570 (  7.3%)   0.0010 (  6.2%)   0.0580 (  7.2%)   0.0494 (  6.5%)  Type Legalization
   0.0160 (  2.0%)   0.0000 (  0.0%)   0.0160 (  2.0%)   0.0159 (  2.1%)  Vector Legalization
   0.0110 (  1.4%)   0.0000 (  0.0%)   0.0110 (  1.4%)   0.0112 (  1.5%)  Instruction Scheduling Cleanup
   0.7859 (100.0%)   0.0160 (100.0%)   0.8019 (100.0%)   0.7632 (100.0%)  Total

===-------------------------------------------------------------------------===
                                 DWARF Emission
===-------------------------------------------------------------------------===
  Total Execution Time: 0.0000 seconds (0.0001 wall clock)

   ---Wall Time---  --- Name ---
   0.0000 ( 73.3%)  Debug Info Emission
   0.0000 ( 23.0%)  DWARF Exception Writer
   0.0000 (  3.6%)  DWARF Debug Writer
   0.0001 (100.0%)  Total

===-------------------------------------------------------------------------===
                      ... Pass execution timing report ...
===-------------------------------------------------------------------------===
  Total Execution Time: 12.7041 seconds (12.7088 wall clock)

   ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
   5.1662 ( 40.9%)   0.0090 ( 14.8%)   5.1752 ( 40.7%)   5.1773 ( 40.7%)  Unroll loops
   1.7977 ( 14.2%)   0.0090 ( 14.8%)   1.8067 ( 14.2%)   1.8151 ( 14.3%)  Loop Strength Reduction
   1.1068 (  8.8%)   0.0250 ( 41.0%)   1.1318 (  8.9%)   1.1315 (  8.9%)  X86 DAG->DAG Instruction Selection
   1.1258 (  8.9%)   0.0030 (  4.9%)   1.1288 (  8.9%)   1.1221 (  8.8%)  Induction Variable Users
   0.5319 (  4.2%)   0.0000 (  0.0%)   0.5319 (  4.2%)   0.5324 (  4.2%)  CodeGen Prepare
   0.3409 (  2.7%)   0.0000 (  0.0%)   0.3409 (  2.7%)   0.3415 (  2.7%)  Combine redundant instructions
   0.2950 (  2.3%)   0.0000 (  0.0%)   0.2950 (  2.3%)   0.2953 (  2.3%)  Eliminate PHI nodes for register allocation
   0.2570 (  2.0%)   0.0000 (  0.0%)   0.2570 (  2.0%)   0.2565 (  2.0%)  Branch Probability Basic Block Placement
   0.2230 (  1.8%)   0.0010 (  1.6%)   0.2240 (  1.8%)   0.2243 (  1.8%)  Global Value Numbering
   0.1340 (  1.1%)   0.0030 (  4.9%)   0.1370 (  1.1%)   0.1371 (  1.1%)  Loop Load Elimination
   0.1190 (  0.9%)   0.0000 (  0.0%)   0.1190 (  0.9%)   0.1207 (  0.9%)  Loop Invariant Code Motion
   0.1110 (  0.9%)   0.0010 (  1.6%)   0.1120 (  0.9%)   0.1121 (  0.9%)  Loop Vectorization
   0.0950 (  0.8%)   0.0000 (  0.0%)   0.0950 (  0.7%)   0.0952 (  0.7%)  Simple Register Coalescing
   0.0900 (  0.7%)   0.0010 (  1.6%)   0.0910 (  0.7%)   0.0921 (  0.7%)  Induction Variable Simplification
   0.0850 (  0.7%)   0.0000 (  0.0%)   0.0850 (  0.7%)   0.0852 (  0.7%)  Machine Instruction Scheduler
   0.0770 (  0.6%)   0.0000 (  0.0%)   0.0770 (  0.6%)   0.0766 (  0.6%)  Combine redundant instructions
   0.0500 (  0.4%)   0.0000 (  0.0%)   0.0500 (  0.4%)   0.0491 (  0.4%)  Combine redundant instructions
   0.0450 (  0.4%)   0.0000 (  0.0%)   0.0450 (  0.4%)   0.0446 (  0.4%)  Greedy Register Allocator
   0.0410 (  0.3%)   0.0000 (  0.0%)   0.0410 (  0.3%)   0.0413 (  0.3%)  Combine redundant instructions
   0.0410 (  0.3%)   0.0000 (  0.0%)   0.0410 (  0.3%)   0.0398 (  0.3%)  Combine redundant instructions
   0.0400 (  0.3%)   0.0000 (  0.0%)   0.0400 (  0.3%)   0.0394 (  0.3%)  Live Interval Analysis
   0.0310 (  0.2%)   0.0020 (  3.3%)   0.0330 (  0.3%)   0.0313 (  0.2%)  Function Integration/Inlining
   0.0300 (  0.2%)   0.0000 (  0.0%)   0.0300 (  0.2%)   0.0307 (  0.2%)  Remove redundant instructions
   0.0290 (  0.2%)   0.0000 (  0.0%)   0.0290 (  0.2%)   0.0304 (  0.2%)  Live Variable Analysis
   0.0270 (  0.2%)   0.0000 (  0.0%)   0.0270 (  0.2%)   0.0263 (  0.2%)  Machine Common Subexpression Elimination
   0.0250 (  0.2%)   0.0000 (  0.0%)   0.0250 (  0.2%)   0.0251 (  0.2%)  Combine redundant instructions
   0.0240 (  0.2%)   0.0000 (  0.0%)   0.0240 (  0.2%)   0.0231 (  0.2%)  Machine code sinking
   0.0210 (  0.2%)   0.0000 (  0.0%)   0.0210 (  0.2%)   0.0216 (  0.2%)  Scalar Evolution Analysis
   0.0220 (  0.2%)   0.0000 (  0.0%)   0.0220 (  0.2%)   0.0215 (  0.2%)  Combine redundant instructions
   0.0190 (  0.2%)   0.0020 (  3.3%)   0.0210 (  0.2%)   0.0209 (  0.2%)  Value Propagation
   0.0200 (  0.2%)   0.0000 (  0.0%)   0.0200 (  0.2%)   0.0205 (  0.2%)  SLP Vectorizer
   0.0190 (  0.2%)   0.0000 (  0.0%)   0.0190 (  0.1%)   0.0199 (  0.2%)  Value Propagation
   0.0200 (  0.2%)   0.0000 (  0.0%)   0.0200 (  0.2%)   0.0198 (  0.2%)  X86 Assembly / Object Emitter
   0.0160 (  0.1%)   0.0000 (  0.0%)   0.0160 (  0.1%)   0.0159 (  0.1%)  Virtual Register Rewriter
   0.0160 (  0.1%)   0.0000 (  0.0%)   0.0160 (  0.1%)   0.0157 (  0.1%)  Jump Threading
   0.0140 (  0.1%)   0.0000 (  0.0%)   0.0140 (  0.1%)   0.0147 (  0.1%)  Machine Loop Invariant Code Motion
   0.0150 (  0.1%)   0.0000 (  0.0%)   0.0150 (  0.1%)   0.0143 (  0.1%)  X86 Byte/Word Instruction Fixup
   0.0130 (  0.1%)   0.0000 (  0.0%)   0.0130 (  0.1%)   0.0131 (  0.1%)  Two-Address instruction pass
   0.0130 (  0.1%)   0.0000 (  0.0%)   0.0130 (  0.1%)   0.0127 (  0.1%)  Control Flow Optimizer
   0.0120 (  0.1%)   0.0000 (  0.0%)   0.0120 (  0.1%)   0.0124 (  0.1%)  Scalar Evolution Analysis
   0.0110 (  0.1%)   0.0000 (  0.0%)   0.0110 (  0.1%)   0.0110 (  0.1%)  Simplify the CFG
   0.0110 (  0.1%)   0.0000 (  0.0%)   0.0110 (  0.1%)   0.0107 (  0.1%)  Slot index numbering
   0.0110 (  0.1%)   0.0000 (  0.0%)   0.0110 (  0.1%)   0.0100 (  0.1%)  Early CSE
   0.0090 (  0.1%)   0.0000 (  0.0%)   0.0090 (  0.1%)   0.0090 (  0.1%)  Dominator Tree Construction
   0.0080 (  0.1%)   0.0000 (  0.0%)   0.0080 (  0.1%)   0.0089 (  0.1%)  Dominator Tree Construction
   0.0090 (  0.1%)   0.0000 (  0.0%)   0.0090 (  0.1%)   0.0088 (  0.1%)  Execution dependency fix
   0.0080 (  0.1%)   0.0000 (  0.0%)   0.0080 (  0.1%)   0.0088 (  0.1%)  Branch Probability Analysis
   0.0090 (  0.1%)   0.0000 (  0.0%)   0.0090 (  0.1%)   0.0087 (  0.1%)  Canonicalize natural loops
   0.0080 (  0.1%)   0.0000 (  0.0%)   0.0080 (  0.1%)   0.0085 (  0.1%)  MachineDominator Tree Construction
   0.0080 (  0.1%)   0.0000 (  0.0%)   0.0080 (  0.1%)   0.0081 (  0.1%)  Dominator Tree Construction
   0.0080 (  0.1%)   0.0000 (  0.0%)   0.0080 (  0.1%)   0.0081 (  0.1%)  MachinePostDominator Tree Construction
   0.0080 (  0.1%)   0.0000 (  0.0%)   0.0080 (  0.1%)   0.0081 (  0.1%)  Sparse Conditional Constant Propagation
   0.0080 (  0.1%)   0.0000 (  0.0%)   0.0080 (  0.1%)   0.0078 (  0.1%)  Simplify the CFG
   0.0110 (  0.1%)   0.0000 (  0.0%)   0.0110 (  0.1%)   0.0077 (  0.1%)  Recognize loop idioms
   0.0070 (  0.1%)   0.0000 (  0.0%)   0.0070 (  0.1%)   0.0076 (  0.1%)  Jump Threading
   0.0080 (  0.1%)   0.0000 (  0.0%)   0.0080 (  0.1%)   0.0075 (  0.1%)  Live DEBUG_VALUE analysis
   0.0080 (  0.1%)   0.0000 (  0.0%)   0.0080 (  0.1%)   0.0075 (  0.1%)  MachineDominator Tree Construction
   0.0060 (  0.0%)   0.0010 (  1.6%)   0.0070 (  0.1%)   0.0074 (  0.1%)  Loop Invariant Code Motion
   0.0070 (  0.1%)   0.0000 (  0.0%)   0.0070 (  0.1%)   0.0073 (  0.1%)  MachineDominator Tree Construction
   0.0070 (  0.1%)   0.0000 (  0.0%)   0.0070 (  0.1%)   0.0073 (  0.1%)  Slot index numbering
   0.0070 (  0.1%)   0.0000 (  0.0%)   0.0070 (  0.1%)   0.0072 (  0.1%)  Peephole Optimizations
   0.0070 (  0.1%)   0.0000 (  0.0%)   0.0070 (  0.1%)   0.0072 (  0.1%)  Simplify the CFG
   0.0070 (  0.1%)   0.0000 (  0.0%)   0.0070 (  0.1%)   0.0072 (  0.1%)  MachinePostDominator Tree Construction
   0.0070 (  0.1%)   0.0000 (  0.0%)   0.0070 (  0.1%)   0.0071 (  0.1%)  Dominator Tree Construction
   0.0070 (  0.1%)   0.0000 (  0.0%)   0.0070 (  0.1%)   0.0071 (  0.1%)  Constant Hoisting
   0.0070 (  0.1%)   0.0000 (  0.0%)   0.0070 (  0.1%)   0.0070 (  0.1%)  MachineDominator Tree Construction
   0.0070 (  0.1%)   0.0000 (  0.0%)   0.0070 (  0.1%)   0.0069 (  0.1%)  MachineDominator Tree Construction
   0.0060 (  0.0%)   0.0000 (  0.0%)   0.0060 (  0.0%)   0.0069 (  0.1%)  Dominator Tree Construction
   0.0070 (  0.1%)   0.0000 (  0.0%)   0.0070 (  0.1%)   0.0068 (  0.1%)  Dominator Tree Construction
   0.0070 (  0.1%)   0.0000 (  0.0%)   0.0070 (  0.1%)   0.0066 (  0.1%)  Natural Loop Information
   0.0030 (  0.0%)   0.0000 (  0.0%)   0.0030 (  0.0%)   0.0066 (  0.1%)  Unroll loops
   0.0060 (  0.0%)   0.0000 (  0.0%)   0.0060 (  0.0%)   0.0065 (  0.1%)  Loop Invariant Code Motion
   0.0050 (  0.0%)   0.0000 (  0.0%)   0.0050 (  0.0%)   0.0065 (  0.1%)  Dominator Tree Construction
   0.0060 (  0.0%)   0.0000 (  0.0%)   0.0060 (  0.0%)   0.0060 (  0.0%)  Prologue/Epilogue Insertion & Frame Finalization
   0.0070 (  0.1%)   0.0000 (  0.0%)   0.0070 (  0.1%)   0.0060 (  0.0%)  Machine Copy Propagation Pass
   0.0060 (  0.0%)   0.0000 (  0.0%)   0.0060 (  0.0%)   0.0058 (  0.0%)  Remove dead machine instructions
   0.0060 (  0.0%)   0.0000 (  0.0%)   0.0060 (  0.0%)   0.0054 (  0.0%)  Insert stack protectors
   0.0050 (  0.0%)   0.0000 (  0.0%)   0.0050 (  0.0%)   0.0054 (  0.0%)  Machine Block Frequency Analysis
   0.0060 (  0.0%)   0.0000 (  0.0%)   0.0060 (  0.0%)   0.0054 (  0.0%)  Remove unreachable blocks from the CFG
   0.0050 (  0.0%)   0.0000 (  0.0%)   0.0050 (  0.0%)   0.0052 (  0.0%)  Natural Loop Information
   0.0050 (  0.0%)   0.0000 (  0.0%)   0.0050 (  0.0%)   0.0044 (  0.0%)  Dominator Tree Construction
   0.0040 (  0.0%)   0.0000 (  0.0%)   0.0040 (  0.0%)   0.0044 (  0.0%)  Remove dead machine instructions
   0.0040 (  0.0%)   0.0000 (  0.0%)   0.0040 (  0.0%)   0.0043 (  0.0%)  Bit-Tracking Dead Code Elimination
   0.0040 (  0.0%)   0.0000 (  0.0%)   0.0040 (  0.0%)   0.0043 (  0.0%)  Machine Block Frequency Analysis
   0.0040 (  0.0%)   0.0000 (  0.0%)   0.0040 (  0.0%)   0.0043 (  0.0%)  Machine Function Analysis
   0.0050 (  0.0%)   0.0000 (  0.0%)   0.0050 (  0.0%)   0.0040 (  0.0%)  Natural Loop Information
   0.0040 (  0.0%)   0.0000 (  0.0%)   0.0040 (  0.0%)   0.0039 (  0.0%)  Machine Block Frequency Analysis
   0.0040 (  0.0%)   0.0000 (  0.0%)   0.0040 (  0.0%)   0.0039 (  0.0%)  Reassociate expressions
   0.0040 (  0.0%)   0.0000 (  0.0%)   0.0040 (  0.0%)   0.0039 (  0.0%)  Loop-Closed SSA Form Pass
   0.0030 (  0.0%)   0.0000 (  0.0%)   0.0030 (  0.0%)   0.0039 (  0.0%)  Canonicalize natural loops
   0.0040 (  0.0%)   0.0000 (  0.0%)   0.0040 (  0.0%)   0.0039 (  0.0%)  Natural Loop Information
   0.0050 (  0.0%)   0.0010 (  1.6%)   0.0060 (  0.0%)   0.0037 (  0.0%)  Natural Loop Information
   0.0030 (  0.0%)   0.0000 (  0.0%)   0.0030 (  0.0%)   0.0037 (  0.0%)  Machine Block Frequency Analysis
   0.0050 (  0.0%)   0.0000 (  0.0%)   0.0050 (  0.0%)   0.0036 (  0.0%)  Natural Loop Information
   0.0030 (  0.0%)   0.0000 (  0.0%)   0.0030 (  0.0%)   0.0035 (  0.0%)  Canonicalize natural loops
   0.0030 (  0.0%)   0.0000 (  0.0%)   0.0030 (  0.0%)   0.0032 (  0.0%)  Aggressive Dead Code Elimination
   0.0020 (  0.0%)   0.0000 (  0.0%)   0.0020 (  0.0%)   0.0032 (  0.0%)  Machine Natural Loop Construction
   0.0030 (  0.0%)   0.0000 (  0.0%)   0.0030 (  0.0%)   0.0032 (  0.0%)  Partially inline calls to library functions
   0.0030 (  0.0%)   0.0000 (  0.0%)   0.0030 (  0.0%)   0.0031 (  0.0%)  Simplify the CFG
   0.0030 (  0.0%)   0.0000 (  0.0%)   0.0030 (  0.0%)   0.0031 (  0.0%)  Natural Loop Information
   0.0030 (  0.0%)   0.0000 (  0.0%)   0.0030 (  0.0%)   0.0031 (  0.0%)  Simplify the CFG
   0.0030 (  0.0%)   0.0000 (  0.0%)   0.0030 (  0.0%)   0.0030 (  0.0%)  Machine Loop Invariant Code Motion
   0.0030 (  0.0%)   0.0000 (  0.0%)   0.0030 (  0.0%)   0.0029 (  0.0%)  Dominator Tree Construction
   0.0030 (  0.0%)   0.0000 (  0.0%)   0.0030 (  0.0%)   0.0029 (  0.0%)  X86 LEA Optimize
   0.0030 (  0.0%)   0.0000 (  0.0%)   0.0030 (  0.0%)   0.0028 (  0.0%)  Dominator Tree Construction
   0.0030 (  0.0%)   0.0000 (  0.0%)   0.0030 (  0.0%)   0.0028 (  0.0%)  Dead Store Elimination
   0.0030 (  0.0%)   0.0000 (  0.0%)   0.0030 (  0.0%)   0.0028 (  0.0%)  Dominator Tree Construction
   0.0030 (  0.0%)   0.0000 (  0.0%)   0.0030 (  0.0%)   0.0028 (  0.0%)  Machine Natural Loop Construction
   0.0030 (  0.0%)   0.0000 (  0.0%)   0.0030 (  0.0%)   0.0027 (  0.0%)  Dead Global Elimination
   0.0020 (  0.0%)   0.0000 (  0.0%)   0.0020 (  0.0%)   0.0027 (  0.0%)  Machine Natural Loop Construction
   0.0020 (  0.0%)   0.0000 (  0.0%)   0.0020 (  0.0%)   0.0027 (  0.0%)  Dominator Tree Construction
   0.0020 (  0.0%)   0.0000 (  0.0%)   0.0020 (  0.0%)   0.0027 (  0.0%)  Dominator Tree Construction
   0.0030 (  0.0%)   0.0000 (  0.0%)   0.0030 (  0.0%)   0.0027 (  0.0%)  Expand Atomic instructions
   0.0020 (  0.0%)   0.0000 (  0.0%)   0.0020 (  0.0%)   0.0026 (  0.0%)  Dominator Tree Construction
   0.0020 (  0.0%)   0.0000 (  0.0%)   0.0020 (  0.0%)   0.0026 (  0.0%)  Dominator Tree Construction
   0.0020 (  0.0%)   0.0000 (  0.0%)   0.0020 (  0.0%)   0.0025 (  0.0%)  Expand ISel Pseudo-instructions
   0.0030 (  0.0%)   0.0000 (  0.0%)   0.0030 (  0.0%)   0.0025 (  0.0%)  Machine Natural Loop Construction
   0.0030 (  0.0%)   0.0000 (  0.0%)   0.0030 (  0.0%)   0.0025 (  0.0%)  Tail Duplication
   0.0040 (  0.0%)   0.0000 (  0.0%)   0.0040 (  0.0%)   0.0024 (  0.0%)  Unswitch loops
   0.0020 (  0.0%)   0.0000 (  0.0%)   0.0020 (  0.0%)   0.0024 (  0.0%)  Dominator Tree Construction
   0.0030 (  0.0%)   0.0000 (  0.0%)   0.0030 (  0.0%)   0.0023 (  0.0%)  Machine InstCombiner
   0.0020 (  0.0%)   0.0000 (  0.0%)   0.0020 (  0.0%)   0.0023 (  0.0%)  Remove unreachable machine basic blocks
   0.0020 (  0.0%)   0.0000 (  0.0%)   0.0020 (  0.0%)   0.0022 (  0.0%)  Canonicalize natural loops
   0.0020 (  0.0%)   0.0000 (  0.0%)   0.0020 (  0.0%)   0.0021 (  0.0%)  Block Frequency Analysis
   0.0020 (  0.0%)   0.0000 (  0.0%)   0.0020 (  0.0%)   0.0020 (  0.0%)  Post-RA pseudo instruction expansion pass
   0.0010 (  0.0%)   0.0000 (  0.0%)   0.0010 (  0.0%)   0.0019 (  0.0%)  Natural Loop Information
   0.0010 (  0.0%)   0.0000 (  0.0%)   0.0010 (  0.0%)   0.0019 (  0.0%)  Canonicalize natural loops
   0.0020 (  0.0%)   0.0000 (  0.0%)   0.0020 (  0.0%)   0.0019 (  0.0%)  Canonicalize natural loops
   0.0020 (  0.0%)   0.0000 (  0.0%)   0.0020 (  0.0%)   0.0018 (  0.0%)  Branch Probability Analysis
   0.0020 (  0.0%)   0.0000 (  0.0%)   0.0020 (  0.0%)   0.0018 (  0.0%)  Canonicalize natural loops
   0.0020 (  0.0%)   0.0000 (  0.0%)   0.0020 (  0.0%)   0.0018 (  0.0%)  X86 Optimize Call Frame
   0.0030 (  0.0%)   0.0000 (  0.0%)   0.0030 (  0.0%)   0.0018 (  0.0%)  Natural Loop Information
   0.0040 (  0.0%)   0.0000 (  0.0%)   0.0040 (  0.0%)   0.0018 (  0.0%)  Rotate Loops
   0.0020 (  0.0%)   0.0000 (  0.0%)   0.0020 (  0.0%)   0.0017 (  0.0%)  Branch Probability Analysis
   0.0020 (  0.0%)   0.0000 (  0.0%)   0.0020 (  0.0%)   0.0017 (  0.0%)  Canonicalize natural loops
   0.0010 (  0.0%)   0.0010 (  1.6%)   0.0020 (  0.0%)   0.0016 (  0.0%)  Delete dead loops
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0015 (  0.0%)  Rotate Loops
   0.0010 (  0.0%)   0.0010 (  1.6%)   0.0020 (  0.0%)   0.0015 (  0.0%)  Debug Variable Analysis
   0.0010 (  0.0%)   0.0000 (  0.0%)   0.0010 (  0.0%)   0.0015 (  0.0%)  X86 LEA Fixup
   0.0010 (  0.0%)   0.0000 (  0.0%)   0.0010 (  0.0%)   0.0015 (  0.0%)  Loop Access Analysis
   0.0010 (  0.0%)   0.0000 (  0.0%)   0.0010 (  0.0%)   0.0013 (  0.0%)  Tail Call Elimination
   0.0020 (  0.0%)   0.0000 (  0.0%)   0.0020 (  0.0%)   0.0013 (  0.0%)  Scalar Evolution Analysis
   0.0010 (  0.0%)   0.0000 (  0.0%)   0.0010 (  0.0%)   0.0013 (  0.0%)  Optimize machine instruction PHIs
   0.0020 (  0.0%)   0.0000 (  0.0%)   0.0020 (  0.0%)   0.0013 (  0.0%)  MemCpy Optimization
   0.0010 (  0.0%)   0.0000 (  0.0%)   0.0010 (  0.0%)   0.0011 (  0.0%)  Combine redundant instructions
   0.0010 (  0.0%)   0.0000 (  0.0%)   0.0010 (  0.0%)   0.0010 (  0.0%)  X86 Fixup SetCC
   0.0010 (  0.0%)   0.0000 (  0.0%)   0.0010 (  0.0%)   0.0010 (  0.0%)  Scalar Evolution Analysis
   0.0010 (  0.0%)   0.0000 (  0.0%)   0.0010 (  0.0%)   0.0010 (  0.0%)  Bundle Machine CFG Edges
   0.0010 (  0.0%)   0.0000 (  0.0%)   0.0010 (  0.0%)   0.0009 (  0.0%)  X86 pseudo instruction expansion pass
   0.0010 (  0.0%)   0.0000 (  0.0%)   0.0010 (  0.0%)   0.0009 (  0.0%)  Loop Access Analysis
   0.0010 (  0.0%)   0.0000 (  0.0%)   0.0010 (  0.0%)   0.0009 (  0.0%)  Loop-Closed SSA Form Pass
   0.0000 (  0.0%)   0.0010 (  1.6%)   0.0010 (  0.0%)   0.0008 (  0.0%)  Lazy Value Information Analysis
   0.0010 (  0.0%)   0.0000 (  0.0%)   0.0010 (  0.0%)   0.0007 (  0.0%)  Scalar Evolution Analysis
   0.0010 (  0.0%)   0.0000 (  0.0%)   0.0010 (  0.0%)   0.0007 (  0.0%)  Process Implicit Definitions
   0.0010 (  0.0%)   0.0000 (  0.0%)   0.0010 (  0.0%)   0.0007 (  0.0%)  Bundle Machine CFG Edges
   0.0010 (  0.0%)   0.0000 (  0.0%)   0.0010 (  0.0%)   0.0007 (  0.0%)  Loop-Closed SSA Form Pass
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0006 (  0.0%)  Loop-Closed SSA Form Pass
   0.0010 (  0.0%)   0.0000 (  0.0%)   0.0010 (  0.0%)   0.0006 (  0.0%)  Tail Duplication
   0.0010 (  0.0%)   0.0000 (  0.0%)   0.0010 (  0.0%)   0.0006 (  0.0%)  Lazy Value Information Analysis
   0.0010 (  0.0%)   0.0000 (  0.0%)   0.0010 (  0.0%)   0.0006 (  0.0%)  Loop-Closed SSA Form Pass
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0005 (  0.0%)  Loop-Closed SSA Form Pass
   0.0010 (  0.0%)   0.0000 (  0.0%)   0.0010 (  0.0%)   0.0005 (  0.0%)  Scalar Evolution Analysis
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0005 (  0.0%)  SROA
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0004 (  0.0%)  Spill Code Placement Analysis
   0.0010 (  0.0%)   0.0000 (  0.0%)   0.0010 (  0.0%)   0.0004 (  0.0%)  Exception handling preparation
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0004 (  0.0%)  Live Register Matrix
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0004 (  0.0%)  MergedLoadStoreMotion
   0.0010 (  0.0%)   0.0000 (  0.0%)   0.0010 (  0.0%)   0.0004 (  0.0%)  Simplify the CFG
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0003 (  0.0%)  CallGraph Construction
   0.0010 (  0.0%)   0.0000 (  0.0%)   0.0010 (  0.0%)   0.0003 (  0.0%)  Canonicalize natural loops
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0003 (  0.0%)  Early CSE
   0.0010 (  0.0%)   0.0000 (  0.0%)   0.0010 (  0.0%)   0.0002 (  0.0%)  Deduce function attributes
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0002 (  0.0%)  Loop Distribition
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0002 (  0.0%)  Loop-Closed SSA Form Pass
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0002 (  0.0%)  Memory Dependence Analysis
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0002 (  0.0%)  Float to int
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0001 (  0.0%)  Interprocedural Sparse Conditional Constant Propagation
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0001 (  0.0%)  Basic Alias Analysis (stateless AA impl)
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0001 (  0.0%)  CallGraph Construction
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0001 (  0.0%)  Basic Alias Analysis (stateless AA impl)
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0001 (  0.0%)  Virtual Register Map
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0001 (  0.0%)  Simplify the CFG
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0001 (  0.0%)  Basic Alias Analysis (stateless AA impl)
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0001 (  0.0%)  Demanded bits analysis
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0001 (  0.0%)  Machine Trace Metrics
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0001 (  0.0%)  Remove unused exception handling info
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Globals Alias Analysis
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Global Variable Optimizer
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Dominator Tree Construction
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Basic Alias Analysis (stateless AA impl)
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  PGOIndirectCallPromotion
   0.0010 (  0.0%)   0.0000 (  0.0%)   0.0010 (  0.0%)   0.0000 (  0.0%)  Deduce function attributes in RPO
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Globals Alias Analysis
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Scalar Evolution Analysis
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Merge disjoint stack slots
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Scalar Evolution Analysis
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Scalar Evolution Analysis
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Dominator Tree Construction
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Basic Alias Analysis (stateless AA impl)
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Shrink Wrapping analysis
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Dominator Tree Construction
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Promote Memory to Register
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Lower 'expect' Intrinsics
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Scalar Evolution Analysis
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Promote 'by reference' arguments to scalars
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Scalar Evolution Analysis
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Scalar Evolution Analysis
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Post RA top-down list latency scheduler
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Dead Argument Elimination
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Function Alias Analysis Results
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Function Alias Analysis Results
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Alignment from assumptions
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Infer set function attributes
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  SROA
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Function Alias Analysis Results
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  X86 PIC Global Base Reg Initialization
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Function Alias Analysis Results
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Function Alias Analysis Results
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Function Alias Analysis Results
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Function Alias Analysis Results
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Contiguously Lay Out Funclets
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Memory Dependence Analysis
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Safe Stack instrumentation pass
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Basic Alias Analysis (stateless AA impl)
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Function Alias Analysis Results
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Basic Alias Analysis (stateless AA impl)
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Function Alias Analysis Results
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Speculatively execute instructions if target has divergent branches
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Basic Alias Analysis (stateless AA impl)
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Basic Alias Analysis (stateless AA impl)
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Function Alias Analysis Results
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Add DWARF path discriminators
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Basic Alias Analysis (stateless AA impl)
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Basic Alias Analysis (stateless AA impl)
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Memory Dependence Analysis
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Function Alias Analysis Results
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Function Alias Analysis Results
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Function Alias Analysis Results
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Rename Disconnected Subregister Components
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Function Alias Analysis Results
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  X86 WinAlloca Expander
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Basic Alias Analysis (stateless AA impl)
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Stack Slot Coloring
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Local Dynamic TLS Access Clean-up
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Function Alias Analysis Results
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Function Alias Analysis Results
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Local Stack Slot Allocation
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Early If-Conversion
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Basic Alias Analysis (stateless AA impl)
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Function Alias Analysis Results
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Loop Access Analysis
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Live Stack Slot Analysis
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Insert XRay ops
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Demanded bits analysis
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  StackMap Liveness Analysis
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Basic Alias Analysis (stateless AA impl)
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Implement the 'patchable-function' attribute
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Optimization Remark Emitter
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Demanded bits analysis
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  X86 FP Stackifier
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Lazy Block Frequency Analysis
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Analyze Machine Code For Garbage Collection
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  ObjC ARC contraction
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  X86 Atom pad short functions
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Shadow Stack GC Lowering
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Lower Garbage Collection Instructions
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Eliminate Available Externally Globals
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Assumption Cache Tracker
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  X86 vzeroupper inserter
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Detect Dead Lanes
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Basic Alias Analysis (stateless AA impl)
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Target Transform Information
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Merge Duplicate Global Constants
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Force set function attributes
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Assumption Cache Tracker
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Type-Based Alias Analysis
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Strip Unused Function Prototypes
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Scoped NoAlias Alias Analysis
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Pre-ISel Intrinsic Lowering
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Type-Based Alias Analysis
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Target Transform Information
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Target Pass Configuration
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Target Library Information
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Target Library Information
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Scoped NoAlias Alias Analysis
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Rewrite Symbols
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Profile summary info
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Machine Module Information
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Machine Branch Probability Analysis
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Create Garbage Collector Module Metadata
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  A No-Op Barrier Pass
  12.6431 (100.0%)   0.0610 (100.0%)  12.7041 (100.0%)  12.7088 (100.0%)  Total

===-------------------------------------------------------------------------===
                          Clang front-end time report
===-------------------------------------------------------------------------===
  Total Execution Time: 12.7741 seconds (12.7838 wall clock)

   ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
  12.7051 (100.0%)   0.0690 (100.0%)  12.7741 (100.0%)  12.7838 (100.0%)  Clang front-end timer
  12.7051 (100.0%)   0.0690 (100.0%)  12.7741 (100.0%)  12.7838 (100.0%)  Total
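
Since "Unroll loops" is the largest entry in the pass timing report, one experiment a reader could try is asking Clang not to unroll the loops inside the inlined helpers and then re-running with -ftime-report. The sketch below uses Clang's loop pragma on a hypothetical helper (sum_bytes is not from the attachment). Whether this changes the compile time of clang_slow_append.cpp is untested and is not claimed here.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical always_inline helper; not taken from the attachment.
__attribute__((always_inline)) inline long sum_bytes(const unsigned char *p, std::size_t n) {
    assert(p != nullptr || n == 0);
    long total = 0;
    // Clang-specific hint asking the optimizer not to unroll this loop.
    // Guarded so that other compilers do not see an unknown pragma.
#if defined(__clang__)
#pragma clang loop unroll(disable)
#endif
    for (std::size_t i = 0; i < n; ++i) {
        total += p[i];
    }
    return total;
}
```

The pragma only affects the loop that immediately follows it, so the experiment can be limited to the helpers that are inlined most often.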

To make the compilation really slow (as slow as in our production build), pass -DFULL on the command line:

$ time clang++-3.9 clang_slow_append.cpp -std=c++11 -O3 -DFULL

real    0m54.826s
user    0m54.651s
sys     0m0.127s

For comparison, here is the timing with GCC 4.8.2:
$ time g++-4.8 clang_slow_append.cpp -std=c++11 -O3 -DFULL

real    0m3.408s
user    0m3.341s
sys     0m0.062s
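
To put the two runs side by side, the wall-clock ("real") times can be turned into a ratio. The snippet below is just that arithmetic. The constant and function names are made up for illustration, and the numbers are copied from the two `time` outputs above.

```cpp
#include <cassert>

// Wall-clock ("real") seconds copied from the two `time` runs in the report.
const double kClangRealSeconds = 54.826;  // clang++-3.9, -O3 -DFULL
const double kGccRealSeconds = 3.408;     // g++-4.8, -O3 -DFULL

// How many times longer the slow build takes than the fast one.
inline double slowdown(double slow_seconds, double fast_seconds) {
    assert(fast_seconds > 0.0);
    return slow_seconds / fast_seconds;
}
```

Calling slowdown(kClangRealSeconds, kGccRealSeconds) gives a value a little over 16, meaning the clang++-3.9 build took about sixteen times as long as the g++-4.8 build on this file.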

I also tried the smaller example on godbolt.org, but the compilation timed out
there, which is why I'm setting the version to 5.x.x.

The machine is an i7 5820 running CentOS 6 Linux. I compiled clang using GCC 4.8.

Quuxplusone commented 7 years ago

Attached clang_slow_append.cpp (27415 bytes, text/x-c++src): Source file used to reproduce the problem