Quuxplusone / LLVMBugzillaTest

Scheduling takes a very long time #8658

Closed Quuxplusone closed 14 years ago

Quuxplusone commented 14 years ago
Bugzilla Link PR8287
Status RESOLVED FIXED
Importance P normal
Reported by KS Sreeram (sreeram@tachyontech.net)
Reported on 2010-10-03 11:56:36 -0700
Last modified on 2010-11-12 11:58:12 -0800
Version trunk
Hardware Macintosh All
CC anton@korobeynikov.info, atrick@apple.com, evan.cheng@apple.com, llvm-bugs@lists.llvm.org
Fixed by commit(s)
Attachments hello.ll (88063 bytes, application/octet-stream)
Blocks
Blocked by
See also
Created attachment 5557
hello world program.

Running llc on the attached file takes a very long time: about 15 seconds on a
Core i7 MacBook Pro.
This is on x86-64, but I've observed similar results on x86-32 as well.

Here's the output from "time llc -time-passes hello.ll":

$ time llc -time-passes hello.ll
===-------------------------------------------------------------------------===
                      Instruction Selection and Scheduling
===-------------------------------------------------------------------------===
  Total Execution Time: 13.3722 seconds (13.3728 wall clock)

   ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
  10.8450 ( 86.4%)   0.7995 ( 98.0%)  11.6444 ( 87.1%)  11.6449 ( 87.1%)  Instruction Scheduling
   0.7267 (  5.8%)   0.0011 (  0.1%)   0.7278 (  5.4%)   0.7279 (  5.4%)  DAG Combining 1
   0.4943 (  3.9%)   0.0051 (  0.6%)   0.4995 (  3.7%)   0.4995 (  3.7%)  Instruction Selection
   0.3998 (  3.2%)   0.0010 (  0.1%)   0.4008 (  3.0%)   0.4009 (  3.0%)  DAG Legalization
   0.0267 (  0.2%)   0.0022 (  0.3%)   0.0290 (  0.2%)   0.0290 (  0.2%)  Vector Legalization
   0.0218 (  0.2%)   0.0044 (  0.5%)   0.0262 (  0.2%)   0.0262 (  0.2%)  Instruction Creation
   0.0230 (  0.2%)   0.0002 (  0.0%)   0.0232 (  0.2%)   0.0232 (  0.2%)  DAG Combining 2
   0.0180 (  0.1%)   0.0002 (  0.0%)   0.0182 (  0.1%)   0.0182 (  0.1%)  Type Legalization
   0.0011 (  0.0%)   0.0018 (  0.2%)   0.0030 (  0.0%)   0.0030 (  0.0%)  Instruction Scheduling Cleanup
   0.0002 (  0.0%)   0.0000 (  0.0%)   0.0002 (  0.0%)   0.0002 (  0.0%)  DAG Combining after legalize types
  12.5565 (100.0%)   0.8157 (100.0%)  13.3722 (100.0%)  13.3728 (100.0%)  Total

===-------------------------------------------------------------------------===
                                 DWARF Emission
===-------------------------------------------------------------------------===
  Total Execution Time: 0.0009 seconds (0.0009 wall clock)

   ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
   0.0006 ( 89.9%)   0.0003 ( 83.0%)   0.0008 ( 87.6%)   0.0008 ( 88.2%)  DWARF Exception Writer
   0.0001 ( 10.1%)   0.0001 ( 17.0%)   0.0001 ( 12.4%)   0.0001 ( 11.8%)  DWARF Debug Writer
   0.0006 (100.0%)   0.0003 (100.0%)   0.0009 (100.0%)   0.0009 (100.0%)  Total

===-------------------------------------------------------------------------===
                      ... Pass execution timing report ...
===-------------------------------------------------------------------------===
  Total Execution Time: 15.6307 seconds (15.6314 wall clock)

   ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
  12.7550 ( 87.3%)   0.9641 ( 94.4%)  13.7191 ( 87.8%)  13.7198 ( 87.8%)  X86 DAG->DAG Instruction Selection
   0.9878 (  6.8%)   0.0018 (  0.2%)   0.9896 (  6.3%)   0.9896 (  6.3%)  Stack Slot Coloring
   0.7672 (  5.3%)   0.0136 (  1.3%)   0.7808 (  5.0%)   0.7808 (  5.0%)  Linear Scan Register Allocator
   0.0097 (  0.1%)   0.0299 (  2.9%)   0.0396 (  0.3%)   0.0396 (  0.3%)  Machine Function Analysis
   0.0168 (  0.1%)   0.0031 (  0.3%)   0.0199 (  0.1%)   0.0199 (  0.1%)  X86 AT&T-Style Assembly Printer
   0.0163 (  0.1%)   0.0014 (  0.1%)   0.0176 (  0.1%)   0.0176 (  0.1%)  Live Variable Analysis
   0.0128 (  0.1%)   0.0010 (  0.1%)   0.0138 (  0.1%)   0.0138 (  0.1%)  Live Interval Analysis
   0.0092 (  0.1%)   0.0001 (  0.0%)   0.0093 (  0.1%)   0.0093 (  0.1%)  Calculate spill weights
   0.0038 (  0.0%)   0.0016 (  0.2%)   0.0054 (  0.0%)   0.0054 (  0.0%)  Live Stack Slot Analysis
   0.0038 (  0.0%)   0.0001 (  0.0%)   0.0038 (  0.0%)   0.0038 (  0.0%)  Simple Register Coalescing
   0.0034 (  0.0%)   0.0004 (  0.0%)   0.0037 (  0.0%)   0.0037 (  0.0%)  Two-Address instruction pass
   0.0033 (  0.0%)   0.0001 (  0.0%)   0.0034 (  0.0%)   0.0034 (  0.0%)  Prolog/Epilog Insertion & Frame Finalization
   0.0025 (  0.0%)   0.0005 (  0.0%)   0.0029 (  0.0%)   0.0029 (  0.0%)  Slot index numbering
   0.0016 (  0.0%)   0.0004 (  0.0%)   0.0020 (  0.0%)   0.0020 (  0.0%)  Virtual Register Map
   0.0014 (  0.0%)   0.0002 (  0.0%)   0.0016 (  0.0%)   0.0017 (  0.0%)  Peephole Optimizations
   0.0016 (  0.0%)   0.0001 (  0.0%)   0.0016 (  0.0%)   0.0016 (  0.0%)  Remove dead machine instructions
   0.0013 (  0.0%)   0.0000 (  0.0%)   0.0013 (  0.0%)   0.0013 (  0.0%)  X86 FP Stackifier
   0.0005 (  0.0%)   0.0007 (  0.1%)   0.0012 (  0.0%)   0.0012 (  0.0%)  Basic Alias Analysis (default AA impl)
   0.0012 (  0.0%)   0.0001 (  0.0%)   0.0012 (  0.0%)   0.0012 (  0.0%)  Control Flow Optimizer
   0.0011 (  0.0%)   0.0000 (  0.0%)   0.0012 (  0.0%)   0.0012 (  0.0%)  Process Implicit Definitions.
   0.0007 (  0.0%)   0.0004 (  0.0%)   0.0011 (  0.0%)   0.0011 (  0.0%)  MachineDominator Tree Construction
   0.0008 (  0.0%)   0.0000 (  0.0%)   0.0008 (  0.0%)   0.0008 (  0.0%)  Machine Common Subexpression Elimination
   0.0004 (  0.0%)   0.0004 (  0.0%)   0.0008 (  0.0%)   0.0008 (  0.0%)  Machine Natural Loop Construction
   0.0007 (  0.0%)   0.0000 (  0.0%)   0.0007 (  0.0%)   0.0007 (  0.0%)  Dominator Tree Construction
   0.0007 (  0.0%)   0.0000 (  0.0%)   0.0007 (  0.0%)   0.0007 (  0.0%)  Subregister lowering instruction pass
   0.0005 (  0.0%)   0.0001 (  0.0%)   0.0007 (  0.0%)   0.0007 (  0.0%)  MachineDominator Tree Construction
   0.0006 (  0.0%)   0.0001 (  0.0%)   0.0006 (  0.0%)   0.0006 (  0.0%)  Optimize for code generation
   0.0006 (  0.0%)   0.0001 (  0.0%)   0.0006 (  0.0%)   0.0006 (  0.0%)  Module Verifier
   0.0005 (  0.0%)   0.0000 (  0.0%)   0.0005 (  0.0%)   0.0005 (  0.0%)  SSE execution domain fixup
   0.0005 (  0.0%)   0.0000 (  0.0%)   0.0005 (  0.0%)   0.0005 (  0.0%)  Dominator Tree Construction
   0.0004 (  0.0%)   0.0000 (  0.0%)   0.0004 (  0.0%)   0.0004 (  0.0%)  Dominator Tree Construction
   0.0003 (  0.0%)   0.0000 (  0.0%)   0.0004 (  0.0%)   0.0004 (  0.0%)  Module Verifier
   0.0002 (  0.0%)   0.0001 (  0.0%)   0.0003 (  0.0%)   0.0003 (  0.0%)  Machine Natural Loop Construction
   0.0002 (  0.0%)   0.0000 (  0.0%)   0.0003 (  0.0%)   0.0003 (  0.0%)  Machine Instruction LICM
   0.0002 (  0.0%)   0.0000 (  0.0%)   0.0003 (  0.0%)   0.0003 (  0.0%)  Natural Loop Information
   0.0002 (  0.0%)   0.0000 (  0.0%)   0.0002 (  0.0%)   0.0002 (  0.0%)  MachineDominator Tree Construction
   0.0002 (  0.0%)   0.0000 (  0.0%)   0.0002 (  0.0%)   0.0002 (  0.0%)  Remove unreachable blocks from the CFG
   0.0002 (  0.0%)   0.0000 (  0.0%)   0.0002 (  0.0%)   0.0002 (  0.0%)  Machine Instruction LICM
   0.0001 (  0.0%)   0.0001 (  0.0%)   0.0002 (  0.0%)   0.0002 (  0.0%)  Eliminate PHI nodes for register allocation
   0.0001 (  0.0%)   0.0000 (  0.0%)   0.0002 (  0.0%)   0.0002 (  0.0%)  Remove unreachable machine basic blocks
   0.0001 (  0.0%)   0.0000 (  0.0%)   0.0002 (  0.0%)   0.0002 (  0.0%)  Machine code sinking
   0.0001 (  0.0%)   0.0000 (  0.0%)   0.0001 (  0.0%)   0.0001 (  0.0%)  Exception handling preparation
   0.0001 (  0.0%)   0.0000 (  0.0%)   0.0001 (  0.0%)   0.0001 (  0.0%)  Tail Duplication
   0.0001 (  0.0%)   0.0000 (  0.0%)   0.0001 (  0.0%)   0.0001 (  0.0%)  Preliminary module verification
   0.0001 (  0.0%)   0.0000 (  0.0%)   0.0001 (  0.0%)   0.0001 (  0.0%)  Machine Natural Loop Construction
   0.0001 (  0.0%)   0.0000 (  0.0%)   0.0001 (  0.0%)   0.0001 (  0.0%)  Scalar Evolution Analysis
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0001 (  0.0%)   0.0001 (  0.0%)  X86 Maximal Stack Alignment Check
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0001 (  0.0%)   0.0001 (  0.0%)  Tail Duplication
   0.0001 (  0.0%)   0.0000 (  0.0%)   0.0001 (  0.0%)   0.0001 (  0.0%)  Machine Natural Loop Construction
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0001 (  0.0%)   0.0001 (  0.0%)  Post RA top-down list latency scheduler
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0001 (  0.0%)   0.0001 (  0.0%)  Code Placement Optimizater
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0001 (  0.0%)   0.0001 (  0.0%)  Optimize machine instruction PHIs
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0001 (  0.0%)   0.0001 (  0.0%)  Local Stack Slot Allocation
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0001 (  0.0%)   0.0001 (  0.0%)  Insert stack protectors
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0001 (  0.0%)   0.0001 (  0.0%)  Lower Garbage Collection Instructions
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0001 (  0.0%)   0.0001 (  0.0%)  Analyze Machine Code For Garbage Collection
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0001 (  0.0%)   0.0001 (  0.0%)  Preliminary module verification
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0001 (  0.0%)   0.0001 (  0.0%)  Delete Garbage Collector Information
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Canonicalize natural loops
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Create Garbage Collector Module Metadata
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Canonicalize natural loops
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Loop Strength Reduction
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Induction Variable Users
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Induction Variable Users
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Machine Module Information
  14.6093 (100.0%)   1.0214 (100.0%)  15.6307 (100.0%)  15.6314 (100.0%)  Total

real    0m15.691s
user    0m14.638s
sys     0m1.039s
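
For anyone triaging a similar report, the tables above can be ranked mechanically instead of by eye. The following is only a rough sketch: it assumes the row layout shown in this report (four "seconds (percent%)" columns followed by the pass name), and the names `ROW`, `parse_rows`, `slowest` and `SAMPLE` are made up for illustration and are not part of LLVM. `SAMPLE` is a few rows copied from the first table.

```python
import re

# Assumed row layout, taken from the output pasted in this report:
# user, system, user+system and wall columns, then the pass name.
ROW = re.compile(
    r"^\s*(\d+\.\d+) \(\s*\d+\.\d+%\)"   # user time
    r"\s+(\d+\.\d+) \(\s*\d+\.\d+%\)"    # system time
    r"\s+(\d+\.\d+) \(\s*\d+\.\d+%\)"    # user+system
    r"\s+(\d+\.\d+) \(\s*\d+\.\d+%\)"    # wall time
    r"\s+(.+?)\s*$"                      # pass name
)


def parse_rows(report):
    """Return (pass name, wall seconds) pairs, skipping the Total rows."""
    rows = []
    for line in report.splitlines():
        m = ROW.match(line)
        if m and m.group(5) != "Total":
            rows.append((m.group(5), float(m.group(4))))
    return rows


def slowest(report, n=3):
    """Return the n rows with the largest wall time, slowest first."""
    return sorted(parse_rows(report), key=lambda r: r[1], reverse=True)[:n]


# A few rows copied from the first table in this report.
SAMPLE = """\
  10.8450 ( 86.4%)   0.7995 ( 98.0%)  11.6444 ( 87.1%)  11.6449 ( 87.1%)  Instruction Scheduling
   0.7267 (  5.8%)   0.0011 (  0.1%)   0.7278 (  5.4%)   0.7279 (  5.4%)  DAG Combining 1
   0.4943 (  3.9%)   0.0051 (  0.6%)   0.4995 (  3.7%)   0.4995 (  3.7%)  Instruction Selection
  12.5565 (100.0%)   0.8157 (100.0%)  13.3722 (100.0%)  13.3728 (100.0%)  Total
"""

if __name__ == "__main__":
    for name, wall in slowest(SAMPLE):
        print(f"{wall:8.4f}s  {name}")
```

Fed the full report, this would mix rows from all three tables, so the nested "Instruction Scheduling" row would be listed next to its parent pass "X86 DAG->DAG Instruction Selection"; splitting on the `===` banner lines first would keep the tables apart.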
Quuxplusone commented 14 years ago

There is a much larger issue here in the frontend. If you can reproduce anything close to this with clang or llvm-gcc, please file a new bug with the C source (the IR test case of this bug is the output produced by a frontend bug).

Meanwhile, I committed a fix in r118904 and a test case in r118906 for the compile-time issue.