Open Quuxplusone opened 5 years ago
Attached pan.tgz
(350255 bytes, application/gzip): from clangBug-14651
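The compile-time problem lies in the "Simple Register Coalescing" pass, which takes about 28 seconds (roughly 77% of total pass execution time) at -O1: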
clang -ftime-report -DHC4 -DSAFETY -DNOREDUCE -DNFAIR=3 -O1 -o files pan.c
===-------------------------------------------------------------------------===
Register Allocation
===-------------------------------------------------------------------------===
Total Execution Time: 0.0372 seconds (0.0373 wall clock)
......(fast)
===-------------------------------------------------------------------------===
Instruction Selection and Scheduling
===-------------------------------------------------------------------------===
Total Execution Time: 1.9975 seconds (1.9931 wall clock)
......(fast)
===-------------------------------------------------------------------------===
DWARF Emission
===-------------------------------------------------------------------------===
Total Execution Time: 0.0012 seconds (0.0012 wall clock)
......(fast)
===-------------------------------------------------------------------------===
... Pass execution timing report ...
===-------------------------------------------------------------------------===
Total Execution Time: 36.5249 seconds (36.5250 wall clock)
---User Time--- --System Time-- --User+System-- ---Wall Time--- --- Name ---
28.0486 ( 77.7%) 0.0018 ( 0.4%) 28.0504 ( 76.8%) 28.0517 ( 76.8%) Simple Register Coalescing (!!PROBLEM!!)
2.6494 ( 7.3%) 0.3306 ( 75.9%) 2.9800 ( 8.2%) 2.9801 ( 8.2%) X86 DAG->DAG Instruction Selection
0.6952 ( 1.9%) 0.0118 ( 2.7%) 0.7070 ( 1.9%) 0.7070 ( 1.9%) Greedy Register Allocator
0.6730 ( 1.9%) 0.0011 ( 0.3%) 0.6741 ( 1.8%) 0.6742 ( 1.8%) Simplify the CFG
0.4630 ( 1.3%) 0.0026 ( 0.6%) 0.4657 ( 1.3%) 0.4656 ( 1.3%) Combine redundant instructions
.....
36.0891 (100.0%) 0.4358 (100.0%) 36.5249 (100.0%) 36.5250 (100.0%) Total
===-------------------------------------------------------------------------===
Miscellaneous Ungrouped Timers
===-------------------------------------------------------------------------===
---User Time--- --System Time-- --User+System-- ---Wall Time--- --- Name ---
36.7598 ( 50.2%) 0.5495 ( 52.1%) 37.3093 ( 50.3%) 37.3111 ( 50.3%) Clang front-end timer
36.2154 ( 49.5%) 0.4579 ( 43.4%) 36.6733 ( 49.4%) 36.6750 ( 49.4%) Code Generation Time
0.1997 ( 0.3%) 0.0475 ( 4.5%) 0.2472 ( 0.3%) 0.2474 ( 0.3%) LLVM IR Generation Time
73.1750 (100.0%) 1.0549 (100.0%) 74.2298 (100.0%) 74.2335 (100.0%) Total
clang -ftime-report -DHC4 -DSAFETY -DNOREDUCE -DNFAIR=3 -O1 -o files pan.c
===-------------------------------------------------------------------------===
Register Allocation
===-------------------------------------------------------------------------===
Total Execution Time: 0.1130 seconds (0.1127 wall clock)
---User Time--- --System Time-- --User+System-- ---Wall Time--- --- Name ---
0.0514 ( 51.9%) 0.0070 ( 50.2%) 0.0584 ( 51.7%) 0.0580 ( 51.5%) Global Splitting
0.0195 ( 19.6%) 0.0025 ( 17.8%) 0.0219 ( 19.4%) 0.0217 ( 19.2%) Spiller
0.0147 ( 14.8%) 0.0041 ( 29.4%) 0.0187 ( 16.6%) 0.0191 ( 17.0%) Evict
0.0134 ( 13.6%) 0.0001 ( 0.6%) 0.0135 ( 12.0%) 0.0135 ( 12.0%) Seed Live Regs
0.0001 ( 0.1%) 0.0003 ( 2.0%) 0.0004 ( 0.4%) 0.0004 ( 0.4%) Local Splitting
0.0991 (100.0%) 0.0139 (100.0%) 0.1130 (100.0%) 0.1127 (100.0%) Total
===-------------------------------------------------------------------------===
Instruction Selection and Scheduling
===-------------------------------------------------------------------------===
Total Execution Time: 3.0434 seconds (3.0434 wall clock)
......(fast)
===-------------------------------------------------------------------------===
DWARF Emission
===-------------------------------------------------------------------------===
Total Execution Time: 0.0014 seconds (0.0015 wall clock)
......(fast)
===-------------------------------------------------------------------------===
... Pass execution timing report ...
===-------------------------------------------------------------------------===
Total Execution Time: 26.9566 seconds (26.9558 wall clock)
---User Time--- --System Time-- --User+System-- ---Wall Time--- --- Name ---
5.6477 ( 21.6%) 0.0010 ( 0.1%) 5.6487 ( 21.0%) 5.6489 ( 21.0%) Simple Register Coalescing (when using -O1, this takes 28 sec !!!)
4.7472 ( 18.2%) 0.6485 ( 76.5%) 5.3957 ( 20.0%) 5.3959 ( 20.0%) X86 DAG->DAG Instruction Selection
3.4128 ( 13.1%) 0.0288 ( 3.4%) 3.4416 ( 12.8%) 3.4417 ( 12.8%) Global Value Numbering
1.7108 ( 6.6%) 0.0003 ( 0.0%) 1.7112 ( 6.3%) 1.7112 ( 6.3%) Eliminate PHI nodes for register allocation
0.7738 ( 3.0%) 0.0004 ( 0.0%) 0.7742 ( 2.9%) 0.7742 ( 2.9%) Control Flow Optimizer
0.7101 ( 2.7%) 0.0005 ( 0.1%) 0.7106 ( 2.6%) 0.7106 ( 2.6%) Merge disjoint stack slots
0.6681 ( 2.6%) 0.0002 ( 0.0%) 0.6683 ( 2.5%) 0.6683 ( 2.5%) Simplify the CFG
...
===-------------------------------------------------------------------------===
Miscellaneous Ungrouped Timers
===-------------------------------------------------------------------------===
---User Time--- --System Time-- --User+System-- ---Wall Time--- --- Name ---
26.8342 ( 50.3%) 0.9470 ( 50.7%) 27.7812 ( 50.4%) 27.7824 ( 50.4%) Clang front-end timer
26.2796 ( 49.3%) 0.8709 ( 46.6%) 27.1505 ( 49.2%) 27.1517 ( 49.2%) Code Generation Time
0.1917 ( 0.4%) 0.0501 ( 2.7%) 0.2418 ( 0.4%) 0.2421 ( 0.4%) LLVM IR Generation Time
53.3055 (100.0%) 1.8680 (100.0%) 55.1735 (100.0%) 55.1762 (100.0%) Total
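To compare the two reports above without reading the tables by eye, the timing rows can be pulled out with a small script. This is only a rough sketch: `parse_rows` and `wall_time` are hypothetical helper names, not part of clang or LLVM, and the regular expression assumes the four-column "seconds ( percent%)" row layout shown in the pasted output. The sample strings are rows copied from the two reports.

```python
import re

# One timing row: four "seconds ( pct%)" columns followed by the pass name.
# Assumption: rows look like the ones pasted in this report.
ROW = re.compile(r"^\s*((?:\d+\.\d+\s+\(\s*\d+\.\d+%\)\s+){4})(.+?)\s*$")
SECONDS = re.compile(r"(\d+\.\d+)\s+\(")


def parse_rows(report):
    """Return {pass name: wall-clock seconds} for every timing row found."""
    rows = {}
    for line in report.splitlines():
        match = ROW.match(line)
        if match:
            # Column order: user, system, user+system, wall.
            user, system, total, wall = (
                float(x) for x in SECONDS.findall(match.group(1))
            )
            rows[match.group(2)] = wall
    return rows


def wall_time(report, prefix):
    """Wall-clock seconds of the first pass whose name starts with prefix."""
    for name, wall in parse_rows(report).items():
        if name.startswith(prefix):
            return wall
    raise KeyError(prefix)


# Rows copied from the first and second reports in this bug.
FIRST_RUN = """\
28.0486 ( 77.7%) 0.0018 ( 0.4%) 28.0504 ( 76.8%) 28.0517 ( 76.8%) Simple Register Coalescing (!!PROBLEM!!)
2.6494 ( 7.3%) 0.3306 ( 75.9%) 2.9800 ( 8.2%) 2.9801 ( 8.2%) X86 DAG->DAG Instruction Selection
"""
SECOND_RUN = """\
5.6477 ( 21.6%) 0.0010 ( 0.1%) 5.6487 ( 21.0%) 5.6489 ( 21.0%) Simple Register Coalescing (when using -O1, this use 28sec !!!)
4.7472 ( 18.2%) 0.6485 ( 76.5%) 5.3957 ( 20.0%) 5.3959 ( 20.0%) X86 DAG->DAG Instruction Selection
"""

first = wall_time(FIRST_RUN, "Simple Register Coalescing")
second = wall_time(SECOND_RUN, "Simple Register Coalescing")
print(f"coalescing: {first:.4f}s vs {second:.4f}s, ratio {first / second:.2f}x")
```

Matching on a name prefix keeps the lookup working even though the reporter appended notes to the pass name in the pasted rows. On these numbers the coalescer is roughly 5x slower in the first run than in the second.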
Working on it.