Closed Quuxplusone closed 16 years ago
Attached testcase.i.gz
(160365 bytes, text/plain): Preprocessed output of insn-attrtab.c
Compile with this:
$ llvm-gcc -g -O2 -mdynamic-no-pic -fno-common testcase.i
Yes, it seems that a checked build is much slower and a release build is much faster. I really don't think this is a bug.
Alright, I'll go ahead and mark it as "invalid". It just seemed extra long to me. :-)
What does "opt -time-passes" or llvm-gcc -ftime-report say for the file? Any outliers?
If I remember correctly, one pass dominates, and I believe it was GCSE. I'll recheck, actually.
If so, Owen should take a look. Hopefully GCSE will die next week.
Here's what -ftime-report gives:
===-------------------------------------------------------------------------===
... Pass execution timing report ...
===-------------------------------------------------------------------------===
Total Execution Time: 713.2354 seconds (822.7144 wall clock)
---User Time--- --System Time-- --User+System-- ---Wall Time--- --- Name ---
235.7063 ( 33.7%) 1.1100 ( 7.2%) 236.8163 ( 33.2%) 274.5477 ( 33.3%) Linear Scan Register Allocator
181.3386 ( 25.9%) 0.8839 ( 5.7%) 182.2226 ( 25.5%) 207.6889 ( 25.2%) Global Common Subexpression Elimination
61.9282 ( 8.8%) 10.1908 ( 66.3%) 72.1191 ( 10.1%) 85.9039 ( 10.4%) Simple Register Coalescing
30.2452 ( 4.3%) 0.9935 ( 6.4%) 31.2387 ( 4.3%) 36.1858 ( 4.3%) Break critical edges in CFG
29.6865 ( 4.2%) 0.1715 ( 1.1%) 29.8581 ( 4.1%) 35.2829 ( 4.2%) Simplify the CFG
29.2335 ( 4.1%) 0.9163 ( 5.9%) 30.1499 ( 4.2%) 33.7817 ( 4.1%) Break critical edges in CFG
27.2445 ( 3.9%) 0.1403 ( 0.9%) 27.3848 ( 3.8%) 30.4702 ( 3.7%) Simplify the CFG
18.8636 ( 2.7%) 0.1431 ( 0.9%) 19.0068 ( 2.6%) 22.7183 ( 2.7%) X86 DAG->DAG Instruction Selection
19.2656 ( 2.7%) 0.1013 ( 0.6%) 19.3670 ( 2.7%) 21.4663 ( 2.6%) Simplify the CFG
12.7784 ( 1.8%) 0.0827 ( 0.5%) 12.8611 ( 1.8%) 15.0353 ( 1.8%) Post-Dominator Tree Construction
12.9369 ( 1.8%) 0.0639 ( 0.4%) 13.0008 ( 1.8%) 14.5189 ( 1.7%) Control Flow Optimizer
6.0786 ( 0.8%) 0.1361 ( 0.8%) 6.2147 ( 0.8%) 7.5407 ( 0.9%) Live Interval Analysis
5.6771 ( 0.8%) 0.0494 ( 0.3%) 5.7266 ( 0.8%) 6.1483 ( 0.7%) Combine redundant instructions
4.9721 ( 0.7%) 0.0367 ( 0.2%) 5.0088 ( 0.7%) 5.7271 ( 0.6%) Optimize for code generation
3.0945 ( 0.4%) 0.0722 ( 0.4%) 3.1667 ( 0.4%) 3.8543 ( 0.4%) Live Variable Analysis
2.0836 ( 0.2%) 0.0097 ( 0.0%) 2.0933 ( 0.2%) 2.5026 ( 0.3%) Live Variable Analysis
1.5289 ( 0.2%) 0.0109 ( 0.0%) 1.5399 ( 0.2%) 1.7336 ( 0.2%) Aggressive Dead Code Elimination
0.8338 ( 0.1%) 0.0089 ( 0.0%) 0.8428 ( 0.1%) 1.0520 ( 0.1%) Combine redundant instructions
0.7325 ( 0.1%) 0.0085 ( 0.0%) 0.7410 ( 0.1%) 0.9187 ( 0.1%) Sparse Conditional Constant Propagation
0.7795 ( 0.1%) 0.0114 ( 0.0%) 0.7910 ( 0.1%) 0.8876 ( 0.1%) Combine redundant instructions
0.7590 ( 0.1%) 0.0028 ( 0.0%) 0.7619 ( 0.1%) 0.7772 ( 0.0%) Simplify the CFG
0.6513 ( 0.0%) 0.0063 ( 0.0%) 0.6577 ( 0.0%) 0.7586 ( 0.0%) Simplify the CFG
0.6442 ( 0.0%) 0.0035 ( 0.0%) 0.6478 ( 0.0%) 0.7002 ( 0.0%) Simplify the CFG
0.5547 ( 0.0%) 0.0088 ( 0.0%) 0.5635 ( 0.0%) 0.6857 ( 0.0%) Combine redundant instructions
0.4752 ( 0.0%) 0.0466 ( 0.3%) 0.5219 ( 0.0%) 0.6720 ( 0.0%) X86 AT&T-Style Assembly Printer
0.5489 ( 0.0%) 0.0063 ( 0.0%) 0.5552 ( 0.0%) 0.6372 ( 0.0%) Combine redundant instructions
0.5493 ( 0.0%) 0.0065 ( 0.0%) 0.5558 ( 0.0%) 0.6166 ( 0.0%) Combine redundant instructions
0.4535 ( 0.0%) 0.0023 ( 0.0%) 0.4559 ( 0.0%) 0.6112 ( 0.0%) Post-Dominance Frontier Construction
0.5203 ( 0.0%) 0.0068 ( 0.0%) 0.5272 ( 0.0%) 0.5720 ( 0.0%) Combine redundant instructions
0.4846 ( 0.0%) 0.0063 ( 0.0%) 0.4909 ( 0.0%) 0.5261 ( 0.0%) Scalar Replacement of Aggregates
0.3469 ( 0.0%) 0.0085 ( 0.0%) 0.3554 ( 0.0%) 0.5154 ( 0.0%) Dominator Tree Construction
0.4110 ( 0.0%) 0.0086 ( 0.0%) 0.4196 ( 0.0%) 0.4379 ( 0.0%) Dominator Tree Construction
0.3732 ( 0.0%) 0.0086 ( 0.0%) 0.3818 ( 0.0%) 0.4299 ( 0.0%) Dominator Tree Construction
0.4118 ( 0.0%) 0.0100 ( 0.0%) 0.4218 ( 0.0%) 0.4258 ( 0.0%) Dominator Tree Construction
0.3731 ( 0.0%) 0.0099 ( 0.0%) 0.3830 ( 0.0%) 0.4139 ( 0.0%) Dominator Tree Construction
0.3509 ( 0.0%) 0.0078 ( 0.0%) 0.3587 ( 0.0%) 0.4051 ( 0.0%) Dominator Tree Construction
0.3435 ( 0.0%) 0.0094 ( 0.0%) 0.3530 ( 0.0%) 0.4044 ( 0.0%) Dominator Tree Construction
0.3675 ( 0.0%) 0.0083 ( 0.0%) 0.3759 ( 0.0%) 0.3996 ( 0.0%) Dominator Tree Construction
0.3639 ( 0.0%) 0.0075 ( 0.0%) 0.3715 ( 0.0%) 0.3786 ( 0.0%) Dominator Tree Construction
0.3555 ( 0.0%) 0.0008 ( 0.0%) 0.3563 ( 0.0%) 0.3631 ( 0.0%) Module Verifier
0.3024 ( 0.0%) 0.0018 ( 0.0%) 0.3043 ( 0.0%) 0.3316 ( 0.0%) Reassociate expressions
0.2706 ( 0.0%) 0.0017 ( 0.0%) 0.2723 ( 0.0%) 0.3049 ( 0.0%) Prolog/Epilog Insertion & Frame Finalization
0.2509 ( 0.0%) 0.0013 ( 0.0%) 0.2523 ( 0.0%) 0.2760 ( 0.0%) Dead Store Elimination
0.2394 ( 0.0%) 0.0053 ( 0.0%) 0.2447 ( 0.0%) 0.2584 ( 0.0%) Dominance Frontier Construction
0.2105 ( 0.0%) 0.0027 ( 0.0%) 0.2132 ( 0.0%) 0.2581 ( 0.0%) Dominance Frontier Construction
0.2082 ( 0.0%) 0.0018 ( 0.0%) 0.2100 ( 0.0%) 0.2309 ( 0.0%) Dominance Frontier Construction
0.2018 ( 0.0%) 0.0014 ( 0.0%) 0.2033 ( 0.0%) 0.2065 ( 0.0%) Dominance Frontier Construction
0.1813 ( 0.0%) 0.0011 ( 0.0%) 0.1825 ( 0.0%) 0.2063 ( 0.0%) Machine Code Deleter
0.1272 ( 0.0%) 0.0012 ( 0.0%) 0.1285 ( 0.0%) 0.1596 ( 0.0%) Natural Loop Construction
0.1361 ( 0.0%) 0.0015 ( 0.0%) 0.1377 ( 0.0%) 0.1545 ( 0.0%) Natural Loop Construction
0.1271 ( 0.0%) 0.0008 ( 0.0%) 0.1280 ( 0.0%) 0.1533 ( 0.0%) Conditional Propagation
0.1285 ( 0.0%) 0.0012 ( 0.0%) 0.1298 ( 0.0%) 0.1488 ( 0.0%) Natural Loop Construction
0.1154 ( 0.0%) 0.0006 ( 0.0%) 0.1161 ( 0.0%) 0.1460 ( 0.0%) Tail Call Elimination
0.1284 ( 0.0%) 0.0007 ( 0.0%) 0.1291 ( 0.0%) 0.1457 ( 0.0%) Conditional Propagation
0.1259 ( 0.0%) 0.0005 ( 0.0%) 0.1265 ( 0.0%) 0.1429 ( 0.0%) Remove unreachable blocks from the CFG
0.1259 ( 0.0%) 0.0012 ( 0.0%) 0.1271 ( 0.0%) 0.1387 ( 0.0%) Natural Loop Construction
0.1144 ( 0.0%) 0.0002 ( 0.0%) 0.1147 ( 0.0%) 0.1357 ( 0.0%) Dead Global Elimination
0.1237 ( 0.0%) 0.0004 ( 0.0%) 0.1241 ( 0.0%) 0.1311 ( 0.0%) X86 FP Stackifier
0.0658 ( 0.0%) 0.0003 ( 0.0%) 0.0661 ( 0.0%) 0.0815 ( 0.0%) Two-Address instruction pass
0.0730 ( 0.0%) 0.0004 ( 0.0%) 0.0735 ( 0.0%) 0.0797 ( 0.0%) Unify function exit nodes
0.0690 ( 0.0%) 0.0004 ( 0.0%) 0.0695 ( 0.0%) 0.0768 ( 0.0%) Remove unused exception handling info
0.0357 ( 0.0%) 0.0005 ( 0.0%) 0.0363 ( 0.0%) 0.0386 ( 0.0%) Basic CallGraph Construction
0.0319 ( 0.0%) 0.0002 ( 0.0%) 0.0321 ( 0.0%) 0.0383 ( 0.0%) Subregister lowering instruction pass
0.0173 ( 0.0%) 0.0002 ( 0.0%) 0.0175 ( 0.0%) 0.0307 ( 0.0%) Tail Duplication
0.0220 ( 0.0%) 0.0001 ( 0.0%) 0.0222 ( 0.0%) 0.0291 ( 0.0%) Label Folder
0.0263 ( 0.0%) 0.0001 ( 0.0%) 0.0265 ( 0.0%) 0.0271 ( 0.0%) Canonicalize natural loops
0.0244 ( 0.0%) 0.0001 ( 0.0%) 0.0245 ( 0.0%) 0.0249 ( 0.0%) Canonicalize natural loops
0.0222 ( 0.0%) 0.0001 ( 0.0%) 0.0224 ( 0.0%) 0.0232 ( 0.0%) Eliminate PHI nodes for register allocation
0.0189 ( 0.0%) 0.0001 ( 0.0%) 0.0190 ( 0.0%) 0.0223 ( 0.0%) Lower invoke and unwind, for unwindless code generators
0.0033 ( 0.0%) 0.0000 ( 0.0%) 0.0033 ( 0.0%) 0.0038 ( 0.0%) Global Variable Optimizer
0.0030 ( 0.0%) 0.0000 ( 0.0%) 0.0030 ( 0.0%) 0.0031 ( 0.0%) Merge Duplicate Global Constants
0.0025 ( 0.0%) 0.0000 ( 0.0%) 0.0025 ( 0.0%) 0.0025 ( 0.0%) Interprocedural constant propagation
0.0005 ( 0.0%) 0.0000 ( 0.0%) 0.0005 ( 0.0%) 0.0008 ( 0.0%) Simplify well-known library calls
0.0005 ( 0.0%) 0.0001 ( 0.0%) 0.0006 ( 0.0%) 0.0006 ( 0.0%) Scalar Evolution Analysis
0.0004 ( 0.0%) 0.0000 ( 0.0%) 0.0005 ( 0.0%) 0.0005 ( 0.0%) Scalar Replacement of Aggregates
0.0004 ( 0.0%) 0.0000 ( 0.0%) 0.0005 ( 0.0%) 0.0005 ( 0.0%) Dead Argument Elimination
0.0003 ( 0.0%) 0.0000 ( 0.0%) 0.0004 ( 0.0%) 0.0004 ( 0.0%) Scalar Evolution Analysis
0.0002 ( 0.0%) 0.0000 ( 0.0%) 0.0003 ( 0.0%) 0.0003 ( 0.0%) Promote Memory to Register
0.0002 ( 0.0%) 0.0000 ( 0.0%) 0.0003 ( 0.0%) 0.0003 ( 0.0%) Scalar Evolution Analysis
0.0002 ( 0.0%) 0.0000 ( 0.0%) 0.0002 ( 0.0%) 0.0002 ( 0.0%) Memory Dependence Analysis
0.0002 ( 0.0%) 0.0000 ( 0.0%) 0.0002 ( 0.0%) 0.0002 ( 0.0%) Scalar Evolution Analysis
0.0001 ( 0.0%) 0.0000 ( 0.0%) 0.0002 ( 0.0%) 0.0002 ( 0.0%) Load Value Numbering
0.0001 ( 0.0%) 0.0000 ( 0.0%) 0.0001 ( 0.0%) 0.0001 ( 0.0%) Post RA top-down list latency scheduler (STUB)
0.0001 ( 0.0%) 0.0000 ( 0.0%) 0.0001 ( 0.0%) 0.0001 ( 0.0%) Lower GC intrinsics, for GCless code generators
0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) Target Data Layout
0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) Strip Unused Function Prototypes
0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) Raise allocations from calls to instructions
0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) Target Data Layout
0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) Basic Alias Analysis (default AA impl)
0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) Basic Value Numbering (default GVN impl)
697.8872 ( 99.9%) 15.3482 (100.0%) 713.2354 (100.0%) 822.7144 (100.0%) TOTAL
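Because the same pass can be scheduled several times in the pipeline (for example "Simplify the CFG" and "Break critical edges in CFG" above), the per-row view can understate how much a pass costs overall. The following is a minimal sketch, not part of LLVM, that sums the wall-clock column per pass name. The names `ROW`, `SAMPLE`, `aggregate_wall_time`, and `top_passes` are made up for illustration, and the regular expression assumes the row layout shown in the report above (four "seconds ( pct%)" columns followed by the pass name).

```python
import re
from collections import defaultdict

# One report row: four "seconds ( pct%)" columns followed by the pass name.
# This layout is an assumption based on the rows pasted in this report.
ROW = re.compile(r"^\s*" + r"([\d.]+)\s*\(\s*[\d.]+%\)\s+" * 4 + r"(.+?)\s*$")

# A few rows copied from the report, used only to exercise the functions.
SAMPLE = """\
235.7063 ( 33.7%) 1.1100 ( 7.2%) 236.8163 ( 33.2%) 274.5477 ( 33.3%) Linear Scan Register Allocator
181.3386 ( 25.9%) 0.8839 ( 5.7%) 182.2226 ( 25.5%) 207.6889 ( 25.2%) Global Common Subexpression Elimination
30.2452 ( 4.3%) 0.9935 ( 6.4%) 31.2387 ( 4.3%) 36.1858 ( 4.3%) Break critical edges in CFG
29.2335 ( 4.1%) 0.9163 ( 5.9%) 30.1499 ( 4.2%) 33.7817 ( 4.1%) Break critical edges in CFG
697.8872 ( 99.9%) 15.3482 (100.0%) 713.2354 (100.0%) 822.7144 (100.0%) TOTAL
"""


def aggregate_wall_time(report_text):
    """Sum the wall-clock column per pass name, skipping the TOTAL row."""
    totals = defaultdict(float)
    for line in report_text.splitlines():
        match = ROW.match(line)
        if match is None:
            continue  # header, separator, or unrelated line
        name = match.group(5)
        if name == "TOTAL":
            continue
        totals[name] += float(match.group(4))  # 4th column is wall time
    return dict(totals)


def top_passes(report_text, n=3):
    """Return the n pass names with the largest summed wall time."""
    totals = aggregate_wall_time(report_text)
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:n]


if __name__ == "__main__":
    for name, seconds in top_passes(SAMPLE):
        print(f"{seconds:10.4f}  {name}")
```

Run against the full report, this kind of aggregation would make it visible that the two "Break critical edges in CFG" rows together account for roughly 70 seconds of wall time, in addition to the two obvious outliers at the top of the table.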
Execution times (seconds)
garbage collection : 0.21 ( 0%) usr 0.00 ( 0%) sys 0.22 ( 0%) wall 0 kB ( 0%) ggc
callgraph construction: 0.41 ( 0%) usr 0.03 ( 0%) sys 0.45 ( 0%) wall 7466 kB (12%) ggc
callgraph optimization: 0.25 ( 0%) usr 0.00 ( 0%) sys 0.25 ( 0%) wall 703 kB ( 1%) ggc
ipa reference : 0.09 ( 0%) usr 0.02 ( 0%) sys 0.13 ( 0%) wall 0 kB ( 0%) ggc
ipa type escape : 0.11 ( 0%) usr 0.02 ( 0%) sys 0.17 ( 0%) wall 0 kB ( 0%) ggc
CFG verifier : 0.06 ( 0%) usr 0.00 ( 0%) sys 0.06 ( 0%) wall 0 kB ( 0%) ggc
preprocessing : 0.89 ( 0%) usr 0.63 ( 3%) sys 1.71 ( 0%) wall 553 kB ( 1%) ggc
lexical analysis : 1.28 ( 0%) usr 1.25 ( 7%) sys 2.87 ( 0%) wall 0 kB ( 0%) ggc
parser : 1.24 ( 0%) usr 0.64 ( 4%) sys 2.20 ( 0%) wall 15013 kB (25%) ggc
integration : 0.12 ( 0%) usr 0.02 ( 0%) sys 0.17 ( 0%) wall 108 kB ( 0%) ggc
tree gimplify : 0.46 ( 0%) usr 0.03 ( 0%) sys 0.53 ( 0%) wall 19340 kB (32%) ggc
tree eh : 0.03 ( 0%) usr 0.00 ( 0%) sys 0.03 ( 0%) wall 0 kB ( 0%) ggc
tree CFG construction : 0.11 ( 0%) usr 0.02 ( 0%) sys 0.13 ( 0%) wall 17124 kB (28%) ggc
tree CFG cleanup : 0.04 ( 0%) usr 0.00 ( 0%) sys 0.05 ( 0%) wall 0 kB ( 0%) ggc
tree STMT verifier : 0.20 ( 0%) usr 0.01 ( 0%) sys 0.21 ( 0%) wall 0 kB ( 0%) ggc
callgraph verifier : 0.08 ( 0%) usr 0.00 ( 0%) sys 0.09 ( 0%) wall 291 kB ( 0%) ggc
varconst : 0.00 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 0 kB ( 0%) ggc
llvm backend init : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 0 kB ( 0%) ggc
llvm backend functions: 10.15 ( 1%) usr 0.13 ( 1%) sys 10.90 ( 1%) wall 0 kB ( 0%) ggc
llvm backend globals : 0.02 ( 0%) usr 0.00 ( 0%) sys 0.02 ( 0%) wall 1 kB ( 0%) ggc
llvm backend per file : 689.64 (98%) usr 15.27 (84%) sys 814.14 (97%) wall 0 kB ( 0%) ggc
TOTAL : 705.43 18.11 835.17 61276 kB
Extra diagnostic checks enabled; compiler may run slowly.
Configure with --disable-checking to disable checks.
The attached file takes >11 minutes to compile during a (debug) bootstrap of LLVM-GCC 4.2. It's the preprocessed output of the generated insn-attrtab.c file. GCC takes only a few minutes to compile it.