Quuxplusone / LLVMBugzillaTest

Bitcode file with a large flat function takes a long time to compile with llc -O3 #32293

Open Quuxplusone opened 7 years ago

Quuxplusone commented 7 years ago

| | |
| --- | --- |
| Bugzilla Link | PR33321 |
| Status | NEW |
| Importance | P normal |
| Reported by | Chris Schafmeister (chris.schaf@verizon.net) |
| Reported on | 2017-06-05 21:52:36 -0700 |
| Last modified on | 2019-03-22 03:46:51 -0700 |
| Version | 4.0 |
| Hardware | Macintosh MacOS X |
| CC | ditaliano@apple.com, florian_hahn@apple.com, llvm-bugs@lists.llvm.org |
| Fixed by commit(s) | |
| Attachments | biglists20000.bc.zip (995375 bytes, application/zip) |
| Blocks | |
| Blocked by | |
| See also | |

My Common Lisp compiler (github.com/drmeister/clasp) generates a RUN-ALL
function that creates Common Lisp objects at startup.  The time it takes
clang to compile this function scales roughly as N^3, where N is the number of
instructions.

To reproduce, unzip the attachment and run: llc -O3 biglists20000.bc

This builds a 20,000-entry list.
It takes more than 2 minutes on my machine.
With 7,000 entries it takes 7 seconds.
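
As a rough check of the N^3 estimate, the two timings above can be turned into an implied scaling exponent. This is only a sketch: it assumes the instruction count grows in proportion to the number of list entries, and it treats "more than 2 minutes" as exactly 120 seconds, so the result is a lower bound rather than a measurement.

```python
import math

# Timings quoted in the report: (list entries, seconds).
# Assumption: 120 s stands in for "more than 2 minutes", so it is a lower bound.
small_n, small_t = 7_000, 7.0
large_n, large_t = 20_000, 120.0

# If t ~ c * N^k, then k = log(t2 / t1) / log(n2 / n1).
exponent = math.log(large_t / small_t) / math.log(large_n / small_n)

# Time a pure N^3 law would predict for the large input,
# extrapolated from the 7,000-entry run.
cubic_prediction = small_t * (large_n / small_n) ** 3

print(f"implied exponent >= {exponent:.2f}")
print(f"N^3 prediction for {large_n} entries: {cubic_prediction:.0f} s")
```

With these numbers the implied exponent comes out a little under 3, and a pure cubic law predicts somewhat more than 2 minutes for 20,000 entries, which is consistent with the timings reported.
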
Quuxplusone commented 7 years ago

Attached biglists20000.bc.zip (995375 bytes, application/zip): Bitcode file

Quuxplusone commented 7 years ago

_Bug 33320 has been marked as a duplicate of this bug._

Quuxplusone commented 7 years ago

Are you able to profile and see where the time is spent? I recommend passing -time-passes to llc and seeing what it tells you.

Quuxplusone commented 5 years ago

The majority of the time is spent in Local Splitting:

===-------------------------------------------------------------------------===
                      ... Pass execution timing report ...
===-------------------------------------------------------------------------===
  Total Execution Time: 2085.7452 seconds (2189.4576 wall clock)

   ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
  1972.5722 ( 95.3%)  12.1684 ( 80.6%)  1984.7406 ( 95.2%)  2087.3912 ( 95.3%)  Greedy Register Allocator
  82.7289 (  4.0%)   0.8171 (  5.4%)  83.5460 (  4.0%)  84.4669 (  3.9%)  Machine Instruction Scheduler
   5.4599 (  0.3%)   1.8749 ( 12.4%)   7.3348 (  0.4%)   7.4621 (  0.3%)  X86 Assembly Printer
   5.2984 (  0.3%)   0.0151 (  0.1%)   5.3134 (  0.3%)   5.3259 (  0.2%)  Stack Slot Coloring
   2.0682 (  0.1%)   0.1150 (  0.8%)   2.1833 (  0.1%)   2.1835 (  0.1%)  X86 DAG->DAG Instruction Selection
   0.4669 (  0.0%)   0.0103 (  0.1%)   0.4772 (  0.0%)   0.4773 (  0.0%)  Live Variable Analysis
   0.1772 (  0.0%)   0.0024 (  0.0%)   0.1796 (  0.0%)   0.1796 (  0.0%)  Simple Register Coalescing

===-------------------------------------------------------------------------===
                              Register Allocation
===-------------------------------------------------------------------------===
  Total Execution Time: 1873.7156 seconds (1970.9389 wall clock)

   ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
  1861.3112 ( 99.9%)  11.1987 ( 98.1%)  1872.5099 ( 99.9%)  1969.5353 ( 99.9%)  Local Splitting
   0.6046 (  0.0%)   0.1823 (  1.6%)   0.7869 (  0.0%)   0.9582 (  0.0%)  Evict
   0.3730 (  0.0%)   0.0369 (  0.3%)   0.4099 (  0.0%)   0.4366 (  0.0%)  Spiller
   0.0087 (  0.0%)   0.0001 (  0.0%)   0.0088 (  0.0%)   0.0088 (  0.0%)  Seed Live Regs
  1862.2976 (100.0%)  11.4180 (100.0%)  1873.7156 (100.0%)  1970.9389 (100.0%)  Total
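
For longer reports it can help to pick out the dominant timer programmatically. The sketch below parses the four non-total rows of the "Register Allocation" group shown above and reports the slowest one by wall time. The helper name `parse_rows` and the regular expression are illustrative, not part of LLVM, and they assume the four-column layout seen in this report; other LLVM builds may print additional columns.

```python
import re

# Rows copied from the "Register Allocation" timer group in the report above.
REPORT = """\
  1861.3112 ( 99.9%)  11.1987 ( 98.1%)  1872.5099 ( 99.9%)  1969.5353 ( 99.9%)  Local Splitting
   0.6046 (  0.0%)   0.1823 (  1.6%)   0.7869 (  0.0%)   0.9582 (  0.0%)  Evict
   0.3730 (  0.0%)   0.0369 (  0.3%)   0.4099 (  0.0%)   0.4366 (  0.0%)  Spiller
   0.0087 (  0.0%)   0.0001 (  0.0%)   0.0088 (  0.0%)   0.0088 (  0.0%)  Seed Live Regs
"""

# One "seconds ( percent%)" column; only the seconds value is captured.
COLUMN = r"\s*([\d.]+)\s*\(\s*[\d.]+%\)"
# Four columns (user, system, user+system, wall) followed by the timer name.
ROW = re.compile(rf"^{COLUMN}{COLUMN}{COLUMN}{COLUMN}\s+(.+?)\s*$")


def parse_rows(text):
    """Return {timer name: wall-clock seconds} for every row that matches."""
    rows = {}
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            rows[match.group(5)] = float(match.group(4))
    return rows


wall = parse_rows(REPORT)
slowest = max(wall, key=wall.get)
share = wall[slowest] / sum(wall.values())
print(f"{slowest}: {wall[slowest]:.1f} s wall, {share:.1%} of the group")
```

The wall times of the four parsed rows add up to the 1970.9389 seconds shown on the Total line, so the share is computed against the same total the report uses.
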