Quuxplusone / LLVMBugzillaTest

Compilation of insn-attrtab.c Very Slow #1666

Closed Quuxplusone closed 16 years ago

Quuxplusone commented 16 years ago
Bugzilla Link PR1621
Status RESOLVED INVALID
Importance P normal
Reported by Bill Wendling (isanbard@gmail.com)
Reported on 2007-08-23 17:48:30 -0700
Last modified on 2007-08-23 18:31:51 -0700
Version trunk
Hardware All All
CC anton@korobeynikov.info, llvm-bugs@lists.llvm.org, resistor@mac.com
Attachments testcase.i.gz (160365 bytes, text/plain)

The attached file takes more than 11 minutes to compile during a (debug) bootstrap of LLVM-GCC 4.2. It's the preprocessed output of the generated insn-attrtab.c file. GCC takes only a few minutes to compile it.

Quuxplusone commented 16 years ago

Attached testcase.i.gz (160365 bytes, text/plain): Preprocessed output of insn-attrtab.c

Quuxplusone commented 16 years ago

Compile with this:

$ llvm-gcc -g -O2 -mdynamic-no-pic -fno-common testcase.i

Quuxplusone commented 16 years ago

Yes, it seems that a checked build is much slower; a release build is much faster. I really don't think this is a bug.

Quuxplusone commented 16 years ago

Alright, I'll go ahead and mark it as "invalid". It just seemed extra long to me. :-)

Quuxplusone commented 16 years ago

What does "opt -time-passes" or llvm-gcc -ftime-report say for the file? Any outliers?

Quuxplusone commented 16 years ago

If I remember correctly, one pass dominates, and I believe it was GCSE. I'll recheck, though.

Quuxplusone commented 16 years ago

If so, Owen should take a look. Hopefully GCSE will die next week.

Quuxplusone commented 16 years ago

Here's what -ftime-report gives:

===-------------------------------------------------------------------------===
                      ... Pass execution timing report ...
===-------------------------------------------------------------------------===
  Total Execution Time: 713.2354 seconds (822.7144 wall clock)

   ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
  235.7063 ( 33.7%)   1.1100 (  7.2%)  236.8163 ( 33.2%)  274.5477 ( 33.3%)  Linear Scan Register Allocator
  181.3386 ( 25.9%)   0.8839 (  5.7%)  182.2226 ( 25.5%)  207.6889 ( 25.2%)  Global Common Subexpression Elimination
  61.9282 (  8.8%)  10.1908 ( 66.3%)  72.1191 ( 10.1%)  85.9039 ( 10.4%)  Simple Register Coalescing
  30.2452 (  4.3%)   0.9935 (  6.4%)  31.2387 (  4.3%)  36.1858 (  4.3%)  Break critical edges in CFG
  29.6865 (  4.2%)   0.1715 (  1.1%)  29.8581 (  4.1%)  35.2829 (  4.2%)  Simplify the CFG
  29.2335 (  4.1%)   0.9163 (  5.9%)  30.1499 (  4.2%)  33.7817 (  4.1%)  Break critical edges in CFG
  27.2445 (  3.9%)   0.1403 (  0.9%)  27.3848 (  3.8%)  30.4702 (  3.7%)  Simplify the CFG
  18.8636 (  2.7%)   0.1431 (  0.9%)  19.0068 (  2.6%)  22.7183 (  2.7%)  X86 DAG->DAG Instruction Selection
  19.2656 (  2.7%)   0.1013 (  0.6%)  19.3670 (  2.7%)  21.4663 (  2.6%)  Simplify the CFG
  12.7784 (  1.8%)   0.0827 (  0.5%)  12.8611 (  1.8%)  15.0353 (  1.8%)  Post-Dominator Tree Construction
  12.9369 (  1.8%)   0.0639 (  0.4%)  13.0008 (  1.8%)  14.5189 (  1.7%)  Control Flow Optimizer
   6.0786 (  0.8%)   0.1361 (  0.8%)   6.2147 (  0.8%)   7.5407 (  0.9%)  Live Interval Analysis
   5.6771 (  0.8%)   0.0494 (  0.3%)   5.7266 (  0.8%)   6.1483 (  0.7%)  Combine redundant instructions
   4.9721 (  0.7%)   0.0367 (  0.2%)   5.0088 (  0.7%)   5.7271 (  0.6%)  Optimize for code generation
   3.0945 (  0.4%)   0.0722 (  0.4%)   3.1667 (  0.4%)   3.8543 (  0.4%)  Live Variable Analysis
   2.0836 (  0.2%)   0.0097 (  0.0%)   2.0933 (  0.2%)   2.5026 (  0.3%)  Live Variable Analysis
   1.5289 (  0.2%)   0.0109 (  0.0%)   1.5399 (  0.2%)   1.7336 (  0.2%)  Aggressive Dead Code Elimination
   0.8338 (  0.1%)   0.0089 (  0.0%)   0.8428 (  0.1%)   1.0520 (  0.1%)  Combine redundant instructions
   0.7325 (  0.1%)   0.0085 (  0.0%)   0.7410 (  0.1%)   0.9187 (  0.1%)  Sparse Conditional Constant Propagation
   0.7795 (  0.1%)   0.0114 (  0.0%)   0.7910 (  0.1%)   0.8876 (  0.1%)  Combine redundant instructions
   0.7590 (  0.1%)   0.0028 (  0.0%)   0.7619 (  0.1%)   0.7772 (  0.0%)  Simplify the CFG
   0.6513 (  0.0%)   0.0063 (  0.0%)   0.6577 (  0.0%)   0.7586 (  0.0%)  Simplify the CFG
   0.6442 (  0.0%)   0.0035 (  0.0%)   0.6478 (  0.0%)   0.7002 (  0.0%)  Simplify the CFG
   0.5547 (  0.0%)   0.0088 (  0.0%)   0.5635 (  0.0%)   0.6857 (  0.0%)  Combine redundant instructions
   0.4752 (  0.0%)   0.0466 (  0.3%)   0.5219 (  0.0%)   0.6720 (  0.0%)  X86 AT&T-Style Assembly Printer
   0.5489 (  0.0%)   0.0063 (  0.0%)   0.5552 (  0.0%)   0.6372 (  0.0%)  Combine redundant instructions
   0.5493 (  0.0%)   0.0065 (  0.0%)   0.5558 (  0.0%)   0.6166 (  0.0%)  Combine redundant instructions
   0.4535 (  0.0%)   0.0023 (  0.0%)   0.4559 (  0.0%)   0.6112 (  0.0%)  Post-Dominance Frontier Construction
   0.5203 (  0.0%)   0.0068 (  0.0%)   0.5272 (  0.0%)   0.5720 (  0.0%)  Combine redundant instructions
   0.4846 (  0.0%)   0.0063 (  0.0%)   0.4909 (  0.0%)   0.5261 (  0.0%)  Scalar Replacement of Aggregates
   0.3469 (  0.0%)   0.0085 (  0.0%)   0.3554 (  0.0%)   0.5154 (  0.0%)  Dominator Tree Construction
   0.4110 (  0.0%)   0.0086 (  0.0%)   0.4196 (  0.0%)   0.4379 (  0.0%)  Dominator Tree Construction
   0.3732 (  0.0%)   0.0086 (  0.0%)   0.3818 (  0.0%)   0.4299 (  0.0%)  Dominator Tree Construction
   0.4118 (  0.0%)   0.0100 (  0.0%)   0.4218 (  0.0%)   0.4258 (  0.0%)  Dominator Tree Construction
   0.3731 (  0.0%)   0.0099 (  0.0%)   0.3830 (  0.0%)   0.4139 (  0.0%)  Dominator Tree Construction
   0.3509 (  0.0%)   0.0078 (  0.0%)   0.3587 (  0.0%)   0.4051 (  0.0%)  Dominator Tree Construction
   0.3435 (  0.0%)   0.0094 (  0.0%)   0.3530 (  0.0%)   0.4044 (  0.0%)  Dominator Tree Construction
   0.3675 (  0.0%)   0.0083 (  0.0%)   0.3759 (  0.0%)   0.3996 (  0.0%)  Dominator Tree Construction
   0.3639 (  0.0%)   0.0075 (  0.0%)   0.3715 (  0.0%)   0.3786 (  0.0%)  Dominator Tree Construction
   0.3555 (  0.0%)   0.0008 (  0.0%)   0.3563 (  0.0%)   0.3631 (  0.0%)  Module Verifier
   0.3024 (  0.0%)   0.0018 (  0.0%)   0.3043 (  0.0%)   0.3316 (  0.0%)  Reassociate expressions
   0.2706 (  0.0%)   0.0017 (  0.0%)   0.2723 (  0.0%)   0.3049 (  0.0%)  Prolog/Epilog Insertion & Frame Finalization
   0.2509 (  0.0%)   0.0013 (  0.0%)   0.2523 (  0.0%)   0.2760 (  0.0%)  Dead Store Elimination
   0.2394 (  0.0%)   0.0053 (  0.0%)   0.2447 (  0.0%)   0.2584 (  0.0%)  Dominance Frontier Construction
   0.2105 (  0.0%)   0.0027 (  0.0%)   0.2132 (  0.0%)   0.2581 (  0.0%)  Dominance Frontier Construction
   0.2082 (  0.0%)   0.0018 (  0.0%)   0.2100 (  0.0%)   0.2309 (  0.0%)  Dominance Frontier Construction
   0.2018 (  0.0%)   0.0014 (  0.0%)   0.2033 (  0.0%)   0.2065 (  0.0%)  Dominance Frontier Construction
   0.1813 (  0.0%)   0.0011 (  0.0%)   0.1825 (  0.0%)   0.2063 (  0.0%)  Machine Code Deleter
   0.1272 (  0.0%)   0.0012 (  0.0%)   0.1285 (  0.0%)   0.1596 (  0.0%)  Natural Loop Construction
   0.1361 (  0.0%)   0.0015 (  0.0%)   0.1377 (  0.0%)   0.1545 (  0.0%)  Natural Loop Construction
   0.1271 (  0.0%)   0.0008 (  0.0%)   0.1280 (  0.0%)   0.1533 (  0.0%)  Conditional Propagation
   0.1285 (  0.0%)   0.0012 (  0.0%)   0.1298 (  0.0%)   0.1488 (  0.0%)  Natural Loop Construction
   0.1154 (  0.0%)   0.0006 (  0.0%)   0.1161 (  0.0%)   0.1460 (  0.0%)  Tail Call Elimination
   0.1284 (  0.0%)   0.0007 (  0.0%)   0.1291 (  0.0%)   0.1457 (  0.0%)  Conditional Propagation
   0.1259 (  0.0%)   0.0005 (  0.0%)   0.1265 (  0.0%)   0.1429 (  0.0%)  Remove unreachable blocks from the CFG
   0.1259 (  0.0%)   0.0012 (  0.0%)   0.1271 (  0.0%)   0.1387 (  0.0%)  Natural Loop Construction
   0.1144 (  0.0%)   0.0002 (  0.0%)   0.1147 (  0.0%)   0.1357 (  0.0%)  Dead Global Elimination
   0.1237 (  0.0%)   0.0004 (  0.0%)   0.1241 (  0.0%)   0.1311 (  0.0%)  X86 FP Stackifier
   0.0658 (  0.0%)   0.0003 (  0.0%)   0.0661 (  0.0%)   0.0815 (  0.0%)  Two-Address instruction pass
   0.0730 (  0.0%)   0.0004 (  0.0%)   0.0735 (  0.0%)   0.0797 (  0.0%)  Unify function exit nodes
   0.0690 (  0.0%)   0.0004 (  0.0%)   0.0695 (  0.0%)   0.0768 (  0.0%)  Remove unused exception handling info
   0.0357 (  0.0%)   0.0005 (  0.0%)   0.0363 (  0.0%)   0.0386 (  0.0%)  Basic CallGraph Construction
   0.0319 (  0.0%)   0.0002 (  0.0%)   0.0321 (  0.0%)   0.0383 (  0.0%)  Subregister lowering instruction pass
   0.0173 (  0.0%)   0.0002 (  0.0%)   0.0175 (  0.0%)   0.0307 (  0.0%)  Tail Duplication
   0.0220 (  0.0%)   0.0001 (  0.0%)   0.0222 (  0.0%)   0.0291 (  0.0%)  Label Folder
   0.0263 (  0.0%)   0.0001 (  0.0%)   0.0265 (  0.0%)   0.0271 (  0.0%)  Canonicalize natural loops
   0.0244 (  0.0%)   0.0001 (  0.0%)   0.0245 (  0.0%)   0.0249 (  0.0%)  Canonicalize natural loops
   0.0222 (  0.0%)   0.0001 (  0.0%)   0.0224 (  0.0%)   0.0232 (  0.0%)  Eliminate PHI nodes for register allocation
   0.0189 (  0.0%)   0.0001 (  0.0%)   0.0190 (  0.0%)   0.0223 (  0.0%)  Lower invoke and unwind, for unwindless code generators
   0.0033 (  0.0%)   0.0000 (  0.0%)   0.0033 (  0.0%)   0.0038 (  0.0%)  Global Variable Optimizer
   0.0030 (  0.0%)   0.0000 (  0.0%)   0.0030 (  0.0%)   0.0031 (  0.0%)  Merge Duplicate Global Constants
   0.0025 (  0.0%)   0.0000 (  0.0%)   0.0025 (  0.0%)   0.0025 (  0.0%)  Interprocedural constant propagation
   0.0005 (  0.0%)   0.0000 (  0.0%)   0.0005 (  0.0%)   0.0008 (  0.0%)  Simplify well-known library calls
   0.0005 (  0.0%)   0.0001 (  0.0%)   0.0006 (  0.0%)   0.0006 (  0.0%)  Scalar Evolution Analysis
   0.0004 (  0.0%)   0.0000 (  0.0%)   0.0005 (  0.0%)   0.0005 (  0.0%)  Scalar Replacement of Aggregates
   0.0004 (  0.0%)   0.0000 (  0.0%)   0.0005 (  0.0%)   0.0005 (  0.0%)  Dead Argument Elimination
   0.0003 (  0.0%)   0.0000 (  0.0%)   0.0004 (  0.0%)   0.0004 (  0.0%)  Scalar Evolution Analysis
   0.0002 (  0.0%)   0.0000 (  0.0%)   0.0003 (  0.0%)   0.0003 (  0.0%)  Promote Memory to Register
   0.0002 (  0.0%)   0.0000 (  0.0%)   0.0003 (  0.0%)   0.0003 (  0.0%)  Scalar Evolution Analysis
   0.0002 (  0.0%)   0.0000 (  0.0%)   0.0002 (  0.0%)   0.0002 (  0.0%)  Memory Dependence Analysis
   0.0002 (  0.0%)   0.0000 (  0.0%)   0.0002 (  0.0%)   0.0002 (  0.0%)  Scalar Evolution Analysis
   0.0001 (  0.0%)   0.0000 (  0.0%)   0.0002 (  0.0%)   0.0002 (  0.0%)  Load Value Numbering
   0.0001 (  0.0%)   0.0000 (  0.0%)   0.0001 (  0.0%)   0.0001 (  0.0%)  Post RA top-down list latency scheduler (STUB)
   0.0001 (  0.0%)   0.0000 (  0.0%)   0.0001 (  0.0%)   0.0001 (  0.0%)  Lower GC intrinsics, for GCless code generators
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Target Data Layout
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Strip Unused Function Prototypes
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Raise allocations from calls to instructions
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Target Data Layout
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Basic Alias Analysis (default AA impl)
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Basic Value Numbering (default GVN impl)
  697.8872 ( 99.9%)  15.3482 (100.0%)  713.2354 (100.0%)  822.7144 (100.0%)  TOTAL

Execution times (seconds)
 garbage collection    :   0.21 ( 0%) usr   0.00 ( 0%) sys   0.22 ( 0%) wall       0 kB ( 0%) ggc
 callgraph construction:   0.41 ( 0%) usr   0.03 ( 0%) sys   0.45 ( 0%) wall    7466 kB (12%) ggc
 callgraph optimization:   0.25 ( 0%) usr   0.00 ( 0%) sys   0.25 ( 0%) wall     703 kB ( 1%) ggc
 ipa reference         :   0.09 ( 0%) usr   0.02 ( 0%) sys   0.13 ( 0%) wall       0 kB ( 0%) ggc
 ipa type escape       :   0.11 ( 0%) usr   0.02 ( 0%) sys   0.17 ( 0%) wall       0 kB ( 0%) ggc
 CFG verifier          :   0.06 ( 0%) usr   0.00 ( 0%) sys   0.06 ( 0%) wall       0 kB ( 0%) ggc
 preprocessing         :   0.89 ( 0%) usr   0.63 ( 3%) sys   1.71 ( 0%) wall     553 kB ( 1%) ggc
 lexical analysis      :   1.28 ( 0%) usr   1.25 ( 7%) sys   2.87 ( 0%) wall       0 kB ( 0%) ggc
 parser                :   1.24 ( 0%) usr   0.64 ( 4%) sys   2.20 ( 0%) wall   15013 kB (25%) ggc
 integration           :   0.12 ( 0%) usr   0.02 ( 0%) sys   0.17 ( 0%) wall     108 kB ( 0%) ggc
 tree gimplify         :   0.46 ( 0%) usr   0.03 ( 0%) sys   0.53 ( 0%) wall   19340 kB (32%) ggc
 tree eh               :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall       0 kB ( 0%) ggc
 tree CFG construction :   0.11 ( 0%) usr   0.02 ( 0%) sys   0.13 ( 0%) wall   17124 kB (28%) ggc
 tree CFG cleanup      :   0.04 ( 0%) usr   0.00 ( 0%) sys   0.05 ( 0%) wall       0 kB ( 0%) ggc
 tree STMT verifier    :   0.20 ( 0%) usr   0.01 ( 0%) sys   0.21 ( 0%) wall       0 kB ( 0%) ggc
 callgraph verifier    :   0.08 ( 0%) usr   0.00 ( 0%) sys   0.09 ( 0%) wall     291 kB ( 0%) ggc
 varconst              :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 llvm backend init     :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 llvm backend functions:  10.15 ( 1%) usr   0.13 ( 1%) sys  10.90 ( 1%) wall       0 kB ( 0%) ggc
 llvm backend globals  :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall       1 kB ( 0%) ggc
 llvm backend per file : 689.64 (98%) usr  15.27 (84%) sys 814.14 (97%) wall       0 kB ( 0%) ggc
 TOTAL                 : 705.43            18.11           835.17   61276 kB
Extra diagnostic checks enabled; compiler may run slowly.
Configure with --disable-checking to disable checks.
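
Several passes (Simplify the CFG, Break critical edges in CFG, Dominator Tree Construction) appear in the report more than once, so the per-row percentages understate their combined cost. A minimal sketch for summing wall time per pass name is below. It assumes the row layout shown above (four "seconds (pct%)" columns followed by the pass name); the names `ROW`, `SAMPLE`, and `wall_time_by_pass` are made up for illustration and are not part of any LLVM tool.

```python
import re
from collections import defaultdict

# One row of the pass timing report: four "seconds ( pct%)" columns
# (user, system, user+system, wall), then the pass name.
# Assumption: the layout matches the report pasted in this thread.
ROW = re.compile(
    r"^\s*(\d+\.\d+)\s*\(\s*[\d.]+%\)"   # user time
    r"\s+(\d+\.\d+)\s*\(\s*[\d.]+%\)"    # system time
    r"\s+(\d+\.\d+)\s*\(\s*[\d.]+%\)"    # user+system time
    r"\s+(\d+\.\d+)\s*\(\s*[\d.]+%\)"    # wall time
    r"\s+(.+?)\s*$"                      # pass name
)

# A few rows copied from the report above, used only as sample input.
SAMPLE = """\
  235.7063 ( 33.7%)   1.1100 (  7.2%)  236.8163 ( 33.2%)  274.5477 ( 33.3%)  Linear Scan Register Allocator
  181.3386 ( 25.9%)   0.8839 (  5.7%)  182.2226 ( 25.5%)  207.6889 ( 25.2%)  Global Common Subexpression Elimination
  29.6865 (  4.2%)   0.1715 (  1.1%)  29.8581 (  4.1%)  35.2829 (  4.2%)  Simplify the CFG
  27.2445 (  3.9%)   0.1403 (  0.9%)  27.3848 (  3.8%)  30.4702 (  3.7%)  Simplify the CFG
  19.2656 (  2.7%)   0.1013 (  0.6%)  19.3670 (  2.7%)  21.4663 (  2.6%)  Simplify the CFG
  697.8872 ( 99.9%)  15.3482 (100.0%)  713.2354 (100.0%)  822.7144 (100.0%)  TOTAL
"""


def wall_time_by_pass(report):
    """Return (pass name, summed wall seconds) pairs, slowest first."""
    totals = defaultdict(float)
    for line in report.splitlines():
        match = ROW.match(line)
        if match is None:
            continue  # headers, separators, and the GCC timevar section
        name = match.group(5)
        if name == "TOTAL":
            continue  # skip the summary row so it is not counted as a pass
        totals[name] += float(match.group(4))
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for name, seconds in wall_time_by_pass(SAMPLE):
        print(f"{seconds:10.4f}  {name}")
```

Feeding it the full report text instead of `SAMPLE` gives the combined ranking; rows that do not match the pattern are simply ignored.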