dsputil.c from ffmpeg takes > 13s to compile with -O (1.3s without) #11774

Open Quuxplusone opened 12 years ago

Quuxplusone commented 12 years ago
Bugzilla Link PR12521
Status NEW
Importance P enhancement
Reported by Nico Weber (nicolasweber@gmx.de)
Reported on 2012-04-10 12:20:28 -0700
Last modified on 2012-04-10 23:03:03 -0700
Version unspecified
Hardware PC All
CC geek4civic@gmail.com, llvm-bugs@lists.llvm.org
Fixed by commit(s)
Attachments dsputil.i (758637 bytes, application/octet-stream)
Blocks
Blocked by
See also
Created attachment 8370
repro

Attached is a preprocessed version of ffmpeg's dsputil.c. The file is
comparatively small (700kB, 11k lines), but takes a long time to codegen with -O:

hummer:src thakis$ time third_party/llvm-build/Release+Asserts/bin/clang -c ~/dsputil.i

real    0m1.985s
user    0m1.619s
sys 0m0.082s

hummer:src thakis$ time third_party/llvm-build/Release+Asserts/bin/clang -c -O ~/dsputil.i

real    0m13.894s
user    0m11.112s
sys 0m0.141s

hummer:src thakis$ time third_party/llvm-build/Release+Asserts/bin/clang -c -O ~/dsputil.i -fsyntax-only

real    0m0.512s
user    0m0.458s
sys 0m0.025s

This is with clang r153589.
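
To put the three measurements side by side, here is a small sketch of the arithmetic in Python. It assumes the -fsyntax-only run is a fair stand-in for the front-end cost of the -O build, which is only an approximation; the helper names are made up for illustration.

```python
# Wall-clock ("real") seconds copied from the three runs above.
real_seconds = {
    "-c": 1.985,
    "-c -O": 13.894,
    "-c -O -fsyntax-only": 0.512,
}


def slowdown(optimized, baseline):
    """How many times longer the optimized build takes than the baseline."""
    return optimized / baseline


def share_after_frontend(total, frontend_only):
    """Fraction of the total spent after parsing and semantic analysis.

    Assumption: the -fsyntax-only time approximates the front-end cost.
    """
    return (total - frontend_only) / total


o_vs_default = slowdown(real_seconds["-c -O"], real_seconds["-c"])
backend_share = share_after_frontend(
    real_seconds["-c -O"], real_seconds["-c -O -fsyntax-only"]
)

if __name__ == "__main__":
    print(f"-O takes {o_vs_default:.1f}x as long as the build without -O")
    print(f"{backend_share:.0%} of the -O wall time is spent after the front end")
```

On these numbers the -O build is roughly seven times slower, and nearly all of that time falls in optimization and code generation rather than parsing.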
Quuxplusone commented 12 years ago
hummer:src thakis$ time third_party/llvm-build/Release+Asserts/bin/clang -c -O ~/dsputil.i -ftime-report
===-------------------------------------------------------------------------===
                              Register Allocation
===-------------------------------------------------------------------------===
  Total Execution Time: 0.2342 seconds (0.2347 wall clock)

   ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
   0.0450 ( 21.4%)   0.0139 ( 59.4%)   0.0590 ( 25.2%)   0.0596 ( 25.4%)  Evict
   0.0501 ( 23.8%)   0.0015 (  6.4%)   0.0516 ( 22.0%)   0.0518 ( 22.1%)  Global Splitting
   0.0418 ( 19.8%)   0.0021 (  8.8%)   0.0439 ( 18.7%)   0.0435 ( 18.5%)  Spiller
   0.0246 ( 11.7%)   0.0022 (  9.2%)   0.0267 ( 11.4%)   0.0268 ( 11.4%)  Local Splitting
   0.0191 (  9.0%)   0.0009 (  3.8%)   0.0200 (  8.5%)   0.0199 (  8.5%)  Rewriter
   0.0181 (  8.6%)   0.0009 (  3.8%)   0.0189 (  8.1%)   0.0190 (  8.1%)  Seed Live Regs
   0.0090 (  4.3%)   0.0008 (  3.2%)   0.0098 (  4.2%)   0.0098 (  4.2%)  MBB Live Ins
   0.0023 (  1.1%)   0.0007 (  3.0%)   0.0030 (  1.3%)   0.0029 (  1.3%)  Initialize
   0.0008 (  0.4%)   0.0006 (  2.4%)   0.0014 (  0.6%)   0.0014 (  0.6%)  Emit Debug Info
   0.2108 (100.0%)   0.0234 (100.0%)   0.2342 (100.0%)   0.2347 (100.0%)  Total

===-------------------------------------------------------------------------===
                      Instruction Selection and Scheduling
===-------------------------------------------------------------------------===
  Total Execution Time: 1.7783 seconds (1.7787 wall clock)

   ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
   0.3621 ( 20.8%)   0.0082 ( 20.6%)   0.3703 ( 20.8%)   0.3720 ( 20.9%)  DAG Combining 1
   0.3280 ( 18.9%)   0.0056 ( 14.2%)   0.3336 ( 18.8%)   0.3339 ( 18.8%)  Instruction Scheduling
   0.2998 ( 17.2%)   0.0039 (  9.9%)   0.3037 ( 17.1%)   0.3032 ( 17.0%)  Instruction Selection
   0.2094 ( 12.0%)   0.0034 (  8.5%)   0.2128 ( 12.0%)   0.2135 ( 12.0%)  DAG Combining 2
   0.1569 (  9.0%)   0.0016 (  4.0%)   0.1584 (  8.9%)   0.1582 (  8.9%)  DAG Combining after legalize types
   0.1364 (  7.8%)   0.0042 ( 10.4%)   0.1406 (  7.9%)   0.1401 (  7.9%)  Instruction Creation
   0.0932 (  5.4%)   0.0032 (  7.9%)   0.0964 (  5.4%)   0.0960 (  5.4%)  DAG Legalization
   0.0697 (  4.0%)   0.0033 (  8.3%)   0.0730 (  4.1%)   0.0726 (  4.1%)  Type Legalization
   0.0654 (  3.8%)   0.0033 (  8.3%)   0.0687 (  3.9%)   0.0685 (  3.8%)  Vector Legalization
   0.0175 (  1.0%)   0.0032 (  8.0%)   0.0206 (  1.2%)   0.0209 (  1.2%)  Instruction Scheduling Cleanup
   1.7384 (100.0%)   0.0399 (100.0%)   1.7783 (100.0%)   1.7787 (100.0%)  Total

===-------------------------------------------------------------------------===
                                 DWARF Emission
===-------------------------------------------------------------------------===
  Total Execution Time: 0.0211 seconds (0.0214 wall clock)

   ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
   0.0099 ( 57.2%)   0.0021 ( 55.5%)   0.0120 ( 56.9%)   0.0123 ( 57.5%)  DWARF Debug Writer
   0.0074 ( 42.8%)   0.0017 ( 44.5%)   0.0091 ( 43.1%)   0.0091 ( 42.5%)  DWARF Exception Writer
   0.0173 (100.0%)   0.0037 (100.0%)   0.0211 (100.0%)   0.0214 (100.0%)  Total

===-------------------------------------------------------------------------===
                      ... Pass execution timing report ...
===-------------------------------------------------------------------------===
  Total Execution Time: 10.1358 seconds (10.1495 wall clock)

   ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
   2.2430 ( 23.3%)   0.0886 ( 17.4%)   2.3316 ( 23.0%)   2.3368 ( 23.0%)  X86 DAG->DAG Instruction Selection
   1.1068 ( 11.5%)   0.0054 (  1.1%)   1.1121 ( 11.0%)   1.1142 ( 11.0%)  Loop Strength Reduction
   0.4917 (  5.1%)   0.0024 (  0.5%)   0.4941 (  4.9%)   0.4954 (  4.9%)  Combine redundant instructions
   0.3760 (  3.9%)   0.0475 (  9.3%)   0.4235 (  4.2%)   0.4247 (  4.2%)  Greedy Register Allocator
   0.3712 (  3.9%)   0.0022 (  0.4%)   0.3733 (  3.7%)   0.3738 (  3.7%)  Combine redundant instructions
   0.3661 (  3.8%)   0.0027 (  0.5%)   0.3688 (  3.6%)   0.3701 (  3.6%)  Induction Variable Users
   0.3536 (  3.7%)   0.0023 (  0.4%)   0.3559 (  3.5%)   0.3564 (  3.5%)  Combine redundant instructions
   0.3361 (  3.5%)   0.0029 (  0.6%)   0.3391 (  3.3%)   0.3393 (  3.3%)  Induction Variable Simplification
   0.3245 (  3.4%)   0.0033 (  0.7%)   0.3279 (  3.2%)   0.3278 (  3.2%)  Global Value Numbering
   0.2540 (  2.6%)   0.0047 (  0.9%)   0.2587 (  2.6%)   0.2588 (  2.6%)  Function Integration/Inlining
   0.2571 (  2.7%)   0.0017 (  0.3%)   0.2588 (  2.6%)   0.2588 (  2.6%)  Combine redundant instructions
   0.2339 (  2.4%)   0.0018 (  0.4%)   0.2357 (  2.3%)   0.2359 (  2.3%)  Combine redundant instructions
   0.2008 (  2.1%)   0.0017 (  0.3%)   0.2025 (  2.0%)   0.2032 (  2.0%)  Dead Store Elimination
   0.1403 (  1.5%)   0.0038 (  0.8%)   0.1441 (  1.4%)   0.1440 (  1.4%)  Live Variable Analysis
   0.1279 (  1.3%)   0.0016 (  0.3%)   0.1295 (  1.3%)   0.1294 (  1.3%)  Simple Register Coalescing
   0.1151 (  1.2%)   0.0078 (  1.5%)   0.1229 (  1.2%)   0.1230 (  1.2%)  X86 AT&T-Style Assembly Printer
   0.0984 (  1.0%)   0.0014 (  0.3%)   0.0997 (  1.0%)   0.1003 (  1.0%)  Machine Common Subexpression Elimination
   0.0874 (  0.9%)   0.0057 (  1.1%)   0.0931 (  0.9%)   0.0930 (  0.9%)  Live Interval Analysis
   0.0913 (  0.9%)   0.0014 (  0.3%)   0.0927 (  0.9%)   0.0926 (  0.9%)  Optimize for code generation
   0.0661 (  0.7%)   0.0256 (  5.0%)   0.0917 (  0.9%)   0.0918 (  0.9%)  Machine Function Analysis
   0.0881 (  0.9%)   0.0014 (  0.3%)   0.0895 (  0.9%)   0.0893 (  0.9%)  Two-Address instruction pass
   0.0809 (  0.8%)   0.0015 (  0.3%)   0.0824 (  0.8%)   0.0824 (  0.8%)  MemCpy Optimization
   0.0563 (  0.6%)   0.0209 (  4.1%)   0.0772 (  0.8%)   0.0771 (  0.8%)  Natural Loop Information
   0.0693 (  0.7%)   0.0018 (  0.4%)   0.0711 (  0.7%)   0.0709 (  0.7%)  Loop Invariant Code Motion
   0.0650 (  0.7%)   0.0016 (  0.3%)   0.0666 (  0.7%)   0.0666 (  0.7%)  Early CSE
   0.0620 (  0.6%)   0.0015 (  0.3%)   0.0635 (  0.6%)   0.0635 (  0.6%)  Machine Loop Invariant Code Motion
   0.0612 (  0.6%)   0.0020 (  0.4%)   0.0632 (  0.6%)   0.0631 (  0.6%)  Recognize loop idioms
   0.0553 (  0.6%)   0.0025 (  0.5%)   0.0578 (  0.6%)   0.0584 (  0.6%)  Unroll loops
   0.0559 (  0.6%)   0.0016 (  0.3%)   0.0575 (  0.6%)   0.0575 (  0.6%)  Value Propagation
   0.0560 (  0.6%)   0.0014 (  0.3%)   0.0574 (  0.6%)   0.0572 (  0.6%)  Reassociate expressions
   0.0530 (  0.6%)   0.0030 (  0.6%)   0.0560 (  0.6%)   0.0560 (  0.6%)  Value Propagation
   0.0544 (  0.6%)   0.0012 (  0.2%)   0.0556 (  0.5%)   0.0555 (  0.5%)  Calculate spill weights
   0.0478 (  0.5%)   0.0012 (  0.2%)   0.0491 (  0.5%)   0.0491 (  0.5%)  Module Verifier
   0.0462 (  0.5%)   0.0025 (  0.5%)   0.0487 (  0.5%)   0.0487 (  0.5%)  Scalar Evolution Analysis
   0.0204 (  0.2%)   0.0276 (  5.4%)   0.0480 (  0.5%)   0.0484 (  0.5%)  Basic Alias Analysis (stateless AA impl)
   0.0461 (  0.5%)   0.0014 (  0.3%)   0.0475 (  0.5%)   0.0474 (  0.5%)  Sparse Conditional Constant Propagation
   0.0429 (  0.4%)   0.0013 (  0.3%)   0.0442 (  0.4%)   0.0446 (  0.4%)  Prologue/Epilogue Insertion & Frame Finalization
   0.0406 (  0.4%)   0.0015 (  0.3%)   0.0421 (  0.4%)   0.0419 (  0.4%)  Scalar Replacement of Aggregates (DT)
   0.0391 (  0.4%)   0.0014 (  0.3%)   0.0406 (  0.4%)   0.0404 (  0.4%)  Early CSE
   0.0370 (  0.4%)   0.0013 (  0.2%)   0.0383 (  0.4%)   0.0383 (  0.4%)  Module Verifier
   0.0233 (  0.2%)   0.0139 (  2.7%)   0.0372 (  0.4%)   0.0373 (  0.4%)  MachineDominator Tree Construction
   0.0332 (  0.3%)   0.0014 (  0.3%)   0.0346 (  0.3%)   0.0345 (  0.3%)  Module Verifier
   0.0327 (  0.3%)   0.0006 (  0.1%)   0.0333 (  0.3%)   0.0333 (  0.3%)  Interprocedural Sparse Conditional Constant Propagation
   0.0207 (  0.2%)   0.0126 (  2.5%)   0.0333 (  0.3%)   0.0332 (  0.3%)  Dominator Tree Construction
   0.0309 (  0.3%)   0.0012 (  0.2%)   0.0321 (  0.3%)   0.0320 (  0.3%)  Machine Copy Propagation Pass
   0.0289 (  0.3%)   0.0019 (  0.4%)   0.0308 (  0.3%)   0.0307 (  0.3%)  Rotate Loops
   0.0226 (  0.2%)   0.0078 (  1.5%)   0.0305 (  0.3%)   0.0304 (  0.3%)  Canonicalize natural loops
   0.0166 (  0.2%)   0.0133 (  2.6%)   0.0299 (  0.3%)   0.0299 (  0.3%)  Machine Natural Loop Construction
   0.0227 (  0.2%)   0.0057 (  1.1%)   0.0284 (  0.3%)   0.0284 (  0.3%)  Natural Loop Information
   0.0265 (  0.3%)   0.0013 (  0.3%)   0.0279 (  0.3%)   0.0279 (  0.3%)  Eliminate PHI nodes for register allocation
   0.0195 (  0.2%)   0.0058 (  1.1%)   0.0254 (  0.3%)   0.0254 (  0.3%)  Scalar Evolution Analysis
   0.0194 (  0.2%)   0.0049 (  1.0%)   0.0242 (  0.2%)   0.0248 (  0.2%)  Loop-Closed SSA Form Pass
   0.0227 (  0.2%)   0.0013 (  0.3%)   0.0240 (  0.2%)   0.0239 (  0.2%)  Aggressive Dead Code Elimination
   0.0215 (  0.2%)   0.0013 (  0.2%)   0.0227 (  0.2%)   0.0226 (  0.2%)  Control Flow Optimizer
   0.0155 (  0.2%)   0.0071 (  1.4%)   0.0225 (  0.2%)   0.0225 (  0.2%)  Canonicalize natural loops
   0.0175 (  0.2%)   0.0041 (  0.8%)   0.0216 (  0.2%)   0.0217 (  0.2%)  Slot index numbering
   0.0087 (  0.1%)   0.0119 (  2.3%)   0.0207 (  0.2%)   0.0209 (  0.2%)  No Alias Analysis (always returns 'may' alias)
   0.0191 (  0.2%)   0.0012 (  0.2%)   0.0203 (  0.2%)   0.0203 (  0.2%)  Machine code sinking
   0.0185 (  0.2%)   0.0014 (  0.3%)   0.0200 (  0.2%)   0.0198 (  0.2%)  Deduce function attributes
   0.0158 (  0.2%)   0.0031 (  0.6%)   0.0189 (  0.2%)   0.0189 (  0.2%)  Dominator Tree Construction
   0.0174 (  0.2%)   0.0012 (  0.2%)   0.0187 (  0.2%)   0.0186 (  0.2%)  Remove dead machine instructions
   0.0159 (  0.2%)   0.0014 (  0.3%)   0.0173 (  0.2%)   0.0173 (  0.2%)  Jump Threading
   0.0066 (  0.1%)   0.0094 (  1.8%)   0.0159 (  0.2%)   0.0162 (  0.2%)  No Alias Analysis (always returns 'may' alias)
   0.0148 (  0.2%)   0.0013 (  0.2%)   0.0161 (  0.2%)   0.0160 (  0.2%)  Process Implicit Definitions
   0.0137 (  0.1%)   0.0021 (  0.4%)   0.0158 (  0.2%)   0.0158 (  0.2%)  Lazy Value Information Analysis
   0.0141 (  0.1%)   0.0017 (  0.3%)   0.0159 (  0.2%)   0.0158 (  0.2%)  Dominator Tree Construction
   0.0132 (  0.1%)   0.0023 (  0.4%)   0.0155 (  0.2%)   0.0156 (  0.2%)  MachineDominator Tree Construction
   0.0127 (  0.1%)   0.0026 (  0.5%)   0.0153 (  0.2%)   0.0153 (  0.2%)  Dominator Tree Construction
   0.0130 (  0.1%)   0.0020 (  0.4%)   0.0150 (  0.1%)   0.0150 (  0.1%)  Dominator Tree Construction
   0.0124 (  0.1%)   0.0025 (  0.5%)   0.0149 (  0.1%)   0.0149 (  0.1%)  Loop-Closed SSA Form Pass
   0.0127 (  0.1%)   0.0020 (  0.4%)   0.0147 (  0.1%)   0.0146 (  0.1%)  Lazy Value Information Analysis
   0.0129 (  0.1%)   0.0012 (  0.2%)   0.0141 (  0.1%)   0.0140 (  0.1%)  Insert stack protectors
   0.0123 (  0.1%)   0.0013 (  0.2%)   0.0136 (  0.1%)   0.0136 (  0.1%)  Peephole Optimizations
   0.0120 (  0.1%)   0.0012 (  0.2%)   0.0132 (  0.1%)   0.0132 (  0.1%)  Execution dependency fix
   0.0113 (  0.1%)   0.0018 (  0.4%)   0.0131 (  0.1%)   0.0131 (  0.1%)  Unswitch loops
   0.0114 (  0.1%)   0.0014 (  0.3%)   0.0128 (  0.1%)   0.0129 (  0.1%)  Canonicalize natural loops
   0.0099 (  0.1%)   0.0024 (  0.5%)   0.0123 (  0.1%)   0.0122 (  0.1%)  Loop-Closed SSA Form Pass
   0.0106 (  0.1%)   0.0013 (  0.3%)   0.0119 (  0.1%)   0.0119 (  0.1%)  Simplify the CFG
   0.0103 (  0.1%)   0.0013 (  0.3%)   0.0116 (  0.1%)   0.0116 (  0.1%)  Simplify the CFG
   0.0099 (  0.1%)   0.0012 (  0.2%)   0.0112 (  0.1%)   0.0111 (  0.1%)  Machine Loop Invariant Code Motion
   0.0097 (  0.1%)   0.0014 (  0.3%)   0.0111 (  0.1%)   0.0110 (  0.1%)  Jump Threading
   0.0097 (  0.1%)   0.0013 (  0.3%)   0.0110 (  0.1%)   0.0110 (  0.1%)  Simplify the CFG
   0.0096 (  0.1%)   0.0012 (  0.2%)   0.0108 (  0.1%)   0.0108 (  0.1%)  Post-RA pseudo instruction expansion pass
   0.0088 (  0.1%)   0.0014 (  0.3%)   0.0101 (  0.1%)   0.0101 (  0.1%)  Simplify the CFG
   0.0042 (  0.0%)   0.0056 (  1.1%)   0.0098 (  0.1%)   0.0099 (  0.1%)  Basic Alias Analysis (stateless AA impl)
   0.0083 (  0.1%)   0.0017 (  0.3%)   0.0100 (  0.1%)   0.0099 (  0.1%)  Natural Loop Information
   0.0082 (  0.1%)   0.0016 (  0.3%)   0.0098 (  0.1%)   0.0098 (  0.1%)  Remove unused exception handling info
   0.0080 (  0.1%)   0.0014 (  0.3%)   0.0094 (  0.1%)   0.0093 (  0.1%)  Branch Probability Analysis
   0.0069 (  0.1%)   0.0025 (  0.5%)   0.0094 (  0.1%)   0.0093 (  0.1%)  Dominator Tree Construction
   0.0071 (  0.1%)   0.0022 (  0.4%)   0.0093 (  0.1%)   0.0093 (  0.1%)  Machine Natural Loop Construction
   0.0066 (  0.1%)   0.0023 (  0.5%)   0.0089 (  0.1%)   0.0089 (  0.1%)  Debug Variable Analysis
   0.0076 (  0.1%)   0.0012 (  0.2%)   0.0088 (  0.1%)   0.0088 (  0.1%)  Stack Slot Coloring
   0.0072 (  0.1%)   0.0013 (  0.3%)   0.0085 (  0.1%)   0.0084 (  0.1%)  Dominator Tree Construction
   0.0059 (  0.1%)   0.0020 (  0.4%)   0.0079 (  0.1%)   0.0080 (  0.1%)  Canonicalize natural loops
   0.0065 (  0.1%)   0.0013 (  0.2%)   0.0078 (  0.1%)   0.0077 (  0.1%)  Simplify the CFG
   0.0074 (  0.1%)   0.0001 (  0.0%)   0.0076 (  0.1%)   0.0076 (  0.1%)  Basic CallGraph Construction
   0.0061 (  0.1%)   0.0013 (  0.2%)   0.0073 (  0.1%)   0.0074 (  0.1%)  X86 FP Stackifier
   0.0057 (  0.1%)   0.0014 (  0.3%)   0.0071 (  0.1%)   0.0070 (  0.1%)  Scalar Replacement of Aggregates (SSAUp)
   0.0056 (  0.1%)   0.0012 (  0.2%)   0.0067 (  0.1%)   0.0067 (  0.1%)  Remove unreachable machine basic blocks
   0.0049 (  0.1%)   0.0013 (  0.2%)   0.0062 (  0.1%)   0.0062 (  0.1%)  Memory Dependence Analysis
   0.0049 (  0.1%)   0.0012 (  0.2%)   0.0061 (  0.1%)   0.0061 (  0.1%)  Remove unreachable blocks from the CFG
   0.0042 (  0.0%)   0.0018 (  0.4%)   0.0060 (  0.1%)   0.0060 (  0.1%)  Delete dead loops
   0.0038 (  0.0%)   0.0019 (  0.4%)   0.0057 (  0.1%)   0.0058 (  0.1%)  Memory Dependence Analysis
   0.0042 (  0.0%)   0.0013 (  0.3%)   0.0055 (  0.1%)   0.0055 (  0.1%)  Tail Call Elimination
   0.0035 (  0.0%)   0.0018 (  0.3%)   0.0052 (  0.1%)   0.0052 (  0.1%)  Spill Code Placement Analysis
   0.0037 (  0.0%)   0.0014 (  0.3%)   0.0050 (  0.0%)   0.0050 (  0.0%)  Simplify well-known library calls
   0.0037 (  0.0%)   0.0012 (  0.2%)   0.0049 (  0.0%)   0.0049 (  0.0%)  Code Placement Optimizer
   0.0031 (  0.0%)   0.0017 (  0.3%)   0.0049 (  0.0%)   0.0049 (  0.0%)  Bundle Machine CFG Edges
   0.0029 (  0.0%)   0.0019 (  0.4%)   0.0048 (  0.0%)   0.0048 (  0.0%)  Memory Dependence Analysis
   0.0035 (  0.0%)   0.0012 (  0.2%)   0.0047 (  0.0%)   0.0048 (  0.0%)  Tail Duplication
   0.0035 (  0.0%)   0.0012 (  0.2%)   0.0047 (  0.0%)   0.0047 (  0.0%)  Tail Duplication
   0.0030 (  0.0%)   0.0017 (  0.3%)   0.0047 (  0.0%)   0.0047 (  0.0%)  Live Stack Slot Analysis
   0.0032 (  0.0%)   0.0012 (  0.2%)   0.0045 (  0.0%)   0.0044 (  0.0%)  Post RA top-down list latency scheduler
   0.0026 (  0.0%)   0.0018 (  0.3%)   0.0044 (  0.0%)   0.0044 (  0.0%)  Virtual Register Map
   0.0044 (  0.0%)   0.0000 (  0.0%)   0.0044 (  0.0%)   0.0044 (  0.0%)  Dead Argument Elimination
   0.0031 (  0.0%)   0.0012 (  0.2%)   0.0042 (  0.0%)   0.0043 (  0.0%)  Optimize machine instruction PHIs
   0.0026 (  0.0%)   0.0017 (  0.3%)   0.0043 (  0.0%)   0.0043 (  0.0%)  Bundle Machine CFG Edges
   0.0026 (  0.0%)   0.0013 (  0.2%)   0.0039 (  0.0%)   0.0039 (  0.0%)  Expand ISel Pseudo-instructions
   0.0018 (  0.0%)   0.0012 (  0.2%)   0.0030 (  0.0%)   0.0031 (  0.0%)  Lower 'expect' Intrinsics
   0.0017 (  0.0%)   0.0012 (  0.2%)   0.0028 (  0.0%)   0.0029 (  0.0%)  Preliminary module verification
   0.0015 (  0.0%)   0.0012 (  0.2%)   0.0027 (  0.0%)   0.0028 (  0.0%)  Analyze Machine Code For Garbage Collection
   0.0014 (  0.0%)   0.0013 (  0.3%)   0.0027 (  0.0%)   0.0027 (  0.0%)  Preliminary module verification
   0.0014 (  0.0%)   0.0012 (  0.2%)   0.0026 (  0.0%)   0.0027 (  0.0%)  X86 Maximal Stack Alignment Check
   0.0014 (  0.0%)   0.0012 (  0.2%)   0.0026 (  0.0%)   0.0026 (  0.0%)  Local Stack Slot Allocation
   0.0013 (  0.0%)   0.0012 (  0.2%)   0.0026 (  0.0%)   0.0026 (  0.0%)  Lower Garbage Collection Instructions
   0.0013 (  0.0%)   0.0011 (  0.2%)   0.0024 (  0.0%)   0.0025 (  0.0%)  Exception handling preparation
   0.0011 (  0.0%)   0.0012 (  0.2%)   0.0023 (  0.0%)   0.0023 (  0.0%)  Preliminary module verification
   0.0010 (  0.0%)   0.0012 (  0.2%)   0.0022 (  0.0%)   0.0022 (  0.0%)  Delete Garbage Collector Information
   0.0009 (  0.0%)   0.0012 (  0.2%)   0.0021 (  0.0%)   0.0021 (  0.0%)  No Alias Analysis (always returns 'may' alias)
   0.0005 (  0.0%)   0.0006 (  0.1%)   0.0012 (  0.0%)   0.0011 (  0.0%)  Target Library Information
   0.0005 (  0.0%)   0.0006 (  0.1%)   0.0010 (  0.0%)   0.0010 (  0.0%)  Create Garbage Collector Module Metadata
   0.0007 (  0.0%)   0.0000 (  0.0%)   0.0008 (  0.0%)   0.0008 (  0.0%)  Global Variable Optimizer
   0.0001 (  0.0%)   0.0000 (  0.0%)   0.0001 (  0.0%)   0.0001 (  0.0%)  Merge Duplicate Global Constants
   0.0001 (  0.0%)   0.0000 (  0.0%)   0.0001 (  0.0%)   0.0001 (  0.0%)  Strip Unused Function Prototypes
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Machine Module Information
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Target Pass Configuration
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Target Library Information
   9.6260 (100.0%)   0.5098 (100.0%)  10.1358 (100.0%)  10.1495 (100.0%)  Total

===-------------------------------------------------------------------------===
                         Miscellaneous Ungrouped Timers
===-------------------------------------------------------------------------===

   ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
  11.2798 ( 51.0%)   0.9474 ( 50.5%)  12.2272 ( 51.0%)  12.2615 ( 51.0%)  Clang front-end timer
  10.5605 ( 47.8%)   0.8937 ( 47.6%)  11.4542 ( 47.8%)  11.4861 ( 47.8%)  Code Generation Time
   0.2677 (  1.2%)   0.0366 (  2.0%)   0.3043 (  1.3%)   0.3061 (  1.3%)  LLVM IR Generation Time
  22.1080 (100.0%)   1.8777 (100.0%)  23.9857 (100.0%)  24.0537 (100.0%)  Total

real    0m12.308s
user    0m11.288s
sys 0m0.976s
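
Several passes appear more than once in the pass execution timing report (for example "Combine redundant instructions"), so the per-row numbers understate their total cost. The following is a rough Python sketch for summing wall time per pass name. It assumes the row layout shown in the report above (four "seconds (percent)" columns followed by the name); the function names and the `SAMPLE` excerpt are only illustrative, and the script should be fed a single table at a time, since timer names from different report sections can collide.

```python
import re
from collections import defaultdict

# One row of a -ftime-report table, as printed above:
#   user (pct)  system (pct)  user+system (pct)  wall (pct)  name
ROW = re.compile(
    r"^\s*(\d+\.\d+)\s*\(\s*[\d.]+%\)"   # user time
    r"\s+(\d+\.\d+)\s*\(\s*[\d.]+%\)"    # system time
    r"\s+(\d+\.\d+)\s*\(\s*[\d.]+%\)"    # user+system time
    r"\s+(\d+\.\d+)\s*\(\s*[\d.]+%\)"    # wall time
    r"\s+(.+?)\s*$"                      # pass or timer name
)


def aggregate_wall_time(report_text):
    """Sum wall-clock seconds per name, skipping the 'Total' rows."""
    totals = defaultdict(float)
    for line in report_text.splitlines():
        match = ROW.match(line)
        if match is None or match.group(5) == "Total":
            continue
        totals[match.group(5)] += float(match.group(4))
    return dict(totals)


def top_passes(report_text, n=5):
    """Return the n most expensive names as (name, seconds) pairs."""
    totals = aggregate_wall_time(report_text)
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:n]


# A few rows copied from the pass execution timing report above.
SAMPLE = """\
   2.2430 ( 23.3%)   0.0886 ( 17.4%)   2.3316 ( 23.0%)   2.3368 ( 23.0%)  X86 DAG->DAG Instruction Selection
   1.1068 ( 11.5%)   0.0054 (  1.1%)   1.1121 ( 11.0%)   1.1142 ( 11.0%)  Loop Strength Reduction
   0.4917 (  5.1%)   0.0024 (  0.5%)   0.4941 (  4.9%)   0.4954 (  4.9%)  Combine redundant instructions
   0.3712 (  3.9%)   0.0022 (  0.4%)   0.3733 (  3.7%)   0.3738 (  3.7%)  Combine redundant instructions
   9.6260 (100.0%)   0.5098 (100.0%)  10.1358 (100.0%)  10.1495 (100.0%)  Total
"""

if __name__ == "__main__":
    for name, seconds in top_passes(SAMPLE):
        print(f"{seconds:8.4f}  {name}")
```

To use it on a real run, capture the report to a file (clang prints it on stderr, so something like `2> report.txt` should work) and pass the contents of the relevant table to `top_passes`.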