The-OpenROAD-Project / OpenROAD

OpenROAD's unified application implementing an RTL-to-GDS Flow. Documentation at https://openroad.readthedocs.io/en/latest/
https://theopenroadproject.org/
BSD 3-Clause "New" or "Revised" License

Detailed routing increases dramatically in running times with area, surprisingly #3484

Closed: oharboe closed this issue 1 year ago

oharboe commented 1 year ago

Create settings.mk in flow/ and use the updated mock-array-big from https://github.com/The-OpenROAD-Project/OpenROAD-flow-scripts/pull/1097:

export MOCK_ARRAY_PITCH_SCALE=20
export MOCK_ARRAY_DATAWIDTH=64
export DESIGN_CONFIG=designs/asap7/mock-array-big/config.mk

Regenerate the Verilog for mock-array-big to get a 64-bit datapath:

make verilog

Now run:

rm -rf results/ objects/ logs/ && make

Output:

[deleted]
(/usr/bin/time -f 'Elapsed time: %E[h:]min:sec. CPU time: user %U sys %S (%P). Peak memory: %MKB.' /home/oyvind/OpenROAD-flow-scripts/tools/install/OpenROAD/bin/openroad -exit -no_init  ./scripts/detail_route.tcl -metrics ./logs/asap7/mock-array-big_Element/base/5_2_TritonRoute.json) 2>&1 | tee ./logs/asap7/mock-array-big_Element/base/5_2_TritonRoute.log
[deleted]
[INFO DRT-0199]   Number of violations = 76.
Viol/Layer          M3     M4     M5
EOL                  0      3      5
Metal Spacing        1     22     22
Recheck              0      1      0
Short                0     13      0
eolKeepOut           0      9      0
[INFO DRT-0267] cpu time = 00:00:15, elapsed time = 00:00:07, memory = 5626.13 (MB), peak = 6145.72 (MB)
[deleted]

Detailed routing converges to 0 violations by the 3rd optimization iteration, which is quick. However, the detailed routing step as a whole takes about 3h30m on my machine, which I believe is mainly a function of the design area rather than the complexity of the routing.

[INFO DRT-0195] Start 3rd optimization iteration.
    Completing 10% with 186 violations.
    elapsed time = 00:00:02, memory = 25160.54 (MB).
    Completing 20% with 157 violations.
    elapsed time = 00:00:06, memory = 25160.54 (MB).
    Completing 30% with 133 violations.
    elapsed time = 00:00:07, memory = 25073.98 (MB).
    Completing 40% with 119 violations.
    elapsed time = 00:00:10, memory = 25073.98 (MB).
    Completing 50% with 103 violations.
    elapsed time = 00:00:11, memory = 25073.98 (MB).
    Completing 60% with 74 violations.
    elapsed time = 00:00:13, memory = 25073.98 (MB).
    Completing 70% with 50 violations.
    elapsed time = 00:00:16, memory = 25073.98 (MB).
    Completing 80% with 36 violations.
    elapsed time = 00:00:18, memory = 25073.98 (MB).
    Completing 90% with 14 violations.
    elapsed time = 00:00:19, memory = 25073.98 (MB).
    Completing 100% with 0 violations.
    elapsed time = 00:00:21, memory = 25073.98 (MB).
[INFO DRT-0199]   Number of violations = 0.
[INFO DRT-0267] cpu time = 00:01:15, elapsed time = 00:00:21, memory = 25073.98 (MB), peak = 26370.69 (MB)
Total wire length = 702685 um.
Total wire length on LAYER M1 = 0 um.
Total wire length on LAYER M2 = 156754 um.
Total wire length on LAYER M3 = 177891 um.
Total wire length on LAYER M4 = 186636 um.
Total wire length on LAYER M5 = 136662 um.
Total wire length on LAYER M6 = 15047 um.
Total wire length on LAYER M7 = 29693 um.
Total wire length on LAYER M8 = 0 um.
Total wire length on LAYER M9 = 0 um.
Total wire length on LAYER Pad = 0 um.
Total number of vias = 128911.
Up-via summary (total 128911):.

-----------------
 Active         0
     M1     19584
     M2     38565
     M3     39480
     M4     24357
     M5      4386
     M6      2539
     M7         0
     M8         0
     M9         0
-----------------
           128911

[INFO DRT-0198] Complete detail routing.
Total wire length = 702685 um.
Total wire length on LAYER M1 = 0 um.
Total wire length on LAYER M2 = 156754 um.
Total wire length on LAYER M3 = 177891 um.
Total wire length on LAYER M4 = 186636 um.
Total wire length on LAYER M5 = 136662 um.
Total wire length on LAYER M6 = 15047 um.
Total wire length on LAYER M7 = 29693 um.
Total wire length on LAYER M8 = 0 um.
Total wire length on LAYER M9 = 0 um.
Total wire length on LAYER Pad = 0 um.
Total number of vias = 128911.
Up-via summary (total 128911):.

-----------------
 Active         0
     M1     19584
     M2     38565
     M3     39480
     M4     24357
     M5      4386
     M6      2539
     M7         0
     M8         0
     M9         0
-----------------
           128911

[INFO DRT-0267] cpu time = 54:13:20, elapsed time = 03:32:01, memory = 25073.98 (MB), peak = 26370.69 (MB)

[INFO DRT-0180] Post processing.
Elapsed time: 3:34:35[h:]min:sec. CPU time: user 195583.66 sys 95.61 (1519%). Peak memory: 27003588KB.
$ cat /proc/cpuinfo | grep processor  | wc -l
16
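
One way to read the final DRT-0267 line above is to divide CPU time by elapsed time, which gives the average number of busy cores. The sketch below does that arithmetic in Python; the regular expression is an assumption based only on the log line format shown here, and the helper names are made up for illustration.

```python
import re

# Pattern for the router's timing summary line, inferred from the log above.
DRT_0267 = re.compile(
    r"cpu time = (\d+):(\d+):(\d+), elapsed time = (\d+):(\d+):(\d+)"
)


def to_seconds(hours, minutes, seconds):
    """Convert hh, mm, ss strings to a number of seconds."""
    return int(hours) * 3600 + int(minutes) * 60 + int(seconds)


def parallelism(line):
    """Return (cpu_seconds, elapsed_seconds, cpu/elapsed) for a DRT-0267 line."""
    match = DRT_0267.search(line)
    if match is None:
        raise ValueError("not a DRT-0267 timing line")
    fields = match.groups()
    cpu = to_seconds(*fields[:3])
    elapsed = to_seconds(*fields[3:])
    return cpu, elapsed, cpu / elapsed


line = (
    "[INFO DRT-0267] cpu time = 54:13:20, elapsed time = 03:32:01, "
    "memory = 25073.98 (MB), peak = 26370.69 (MB)"
)
cpu, elapsed, ratio = parallelism(line)
print(f"cpu={cpu}s elapsed={elapsed}s average busy cores={ratio:.1f}")
```

For this run the ratio comes out at roughly 15.3 on a 16-processor machine, in line with the 1519% reported by /usr/bin/time. That suggests the threads are kept busy for most of the run, so the long wall time likely reflects the total amount of work rather than idle cores.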

Originally posted by @oharboe in https://github.com/The-OpenROAD-Project/OpenROAD/issues/3468#issuecomment-1593775371

oharboe commented 1 year ago

I tried this on my laptop. I want to record a run here so that I have something to compare against later:

$ cat /proc/cpuinfo | grep processor  | wc -l
12
Elapsed time: 0:49.45[h:]min:sec. CPU time: user 47.18 sys 2.23 (99%). Peak memory: 6078204KB.
cp results/asap7/mock-array-big/base/6_1_merged.gds results/asap7/mock-array-big/base/6_final.gds
Log                       Elapsed seconds
1_1_yosys                          2
2_1_floorplan                      1
2_2_floorplan_io                   1
2_3_tdms_place                     1
2_4_mplace                         1
2_5_tapcell                        1
2_6_pdn                          407
3_1_place_gp_skip_io             451
3_2_place_iop                      4
3_3_place_gp                     455
3_4_resizer                       17
3_5_opendp                        24
4_1_cts                           46
4_2_cts_fillcell                  25
5_1_fastroute                     48
5_2_TritonRoute                14151
6_1_merge                         49
6_report                        5669
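
To put the per-stage numbers above in proportion, here is a small Python sketch that ranks the stages by their share of the summed elapsed seconds. The values are copied from the table; the variable names are only for illustration.

```python
# Elapsed seconds per stage, copied from the table above.
stage_seconds = {
    "1_1_yosys": 2,
    "2_1_floorplan": 1,
    "2_2_floorplan_io": 1,
    "2_3_tdms_place": 1,
    "2_4_mplace": 1,
    "2_5_tapcell": 1,
    "2_6_pdn": 407,
    "3_1_place_gp_skip_io": 451,
    "3_2_place_iop": 4,
    "3_3_place_gp": 455,
    "3_4_resizer": 17,
    "3_5_opendp": 24,
    "4_1_cts": 46,
    "4_2_cts_fillcell": 25,
    "5_1_fastroute": 48,
    "5_2_TritonRoute": 14151,
    "6_1_merge": 49,
    "6_report": 5669,
}

total = sum(stage_seconds.values())

# Sort stages from slowest to fastest.
ranked = sorted(stage_seconds.items(), key=lambda item: item[1], reverse=True)

for name, seconds in ranked[:3]:
    print(f"{name:<22} {seconds:>6} s  {100 * seconds / total:5.1f}%")
```

In this run 5_2_TritonRoute accounts for about two thirds of the summed stage time, with 6_report the next largest contributor.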

oharboe commented 1 year ago

@osamahammad21 FYI, there's an update to mock-array-big https://github.com/The-OpenROAD-Project/OpenROAD-flow-scripts/pull/1153

With this fix, you can use the settings.mk below, which only increases the area. It doesn't increase the data width in mock-array-big, so there is no need to run make verilog.

I haven't run it to completion to see how long it takes, but if it takes substantially longer than with the default MOCK_ARRAY_PITCH_SCALE, then I think it is worth looking at detailed route times on this smaller example first.

export MOCK_ARRAY_PITCH_SCALE=20
export DESIGN_CONFIG=designs/asap7/mock-array-big/config.mk

I didn't wait for it to complete, but the datapath width doesn't appear to be a factor in the detailed routing speed.

The detailed routing speed appears to be dominated by the area of the design. From memory, detailed routing runs at roughly the same speed for 8-bit and 64-bit datapaths at the same mock-array-big die size:

[INFO DRT-0195] Start 0th optimization iteration.
    Completing 10% with 44 violations.
    elapsed time = 00:12:06, memory = 23299.84 (MB).
    Completing 20% with 75 violations.
    elapsed time = 00:24:06, memory = 23285.89 (MB).
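
As a rough illustration of what these two checkpoints imply, the Python sketch below pairs each "Completing N%" line with the elapsed time that follows it and extrapolates linearly to 100%. The constant-rate assumption is mine, not something the router guarantees, and the function names are hypothetical.

```python
import re

PROGRESS = re.compile(r"Completing (\d+)% with (\d+) violations\.")
ELAPSED = re.compile(r"elapsed time = (\d+):(\d+):(\d+)")


def parse_progress(log_text):
    """Pair each 'Completing N%' line with the elapsed time that follows it."""
    points = []
    percent = None
    for line in log_text.splitlines():
        progress = PROGRESS.search(line)
        if progress:
            percent = int(progress.group(1))
            continue
        elapsed = ELAPSED.search(line)
        if elapsed and percent is not None:
            hours, minutes, seconds = map(int, elapsed.groups())
            points.append((percent, hours * 3600 + minutes * 60 + seconds))
            percent = None
    return points


def estimate_total_seconds(points):
    """Extrapolate from the last checkpoint, assuming a constant rate."""
    percent, seconds = points[-1]
    return seconds * 100 / percent


log_text = """\
[INFO DRT-0195] Start 0th optimization iteration.
    Completing 10% with 44 violations.
    elapsed time = 00:12:06, memory = 23299.84 (MB).
    Completing 20% with 75 violations.
    elapsed time = 00:24:06, memory = 23285.89 (MB).
"""
points = parse_progress(log_text)
print(points, estimate_total_seconds(points))
```

Under that assumption the 0th iteration alone would take about two hours (7230 seconds), at about 12 minutes per 10% of progress.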

osamahammad21 commented 1 year ago

I am running the test case and I can see that the array element is not DRC-free, which I guess causes problems for the router later on.

oharboe commented 1 year ago

@osamahammad21 I see.

Do you need any further input from me to investigate the running time for the detailed router on this example?

oharboe commented 1 year ago

@osamahammad21 Could you provide some before and after numbers based upon the changes you are planning/have pull requests for?

osamahammad21 commented 1 year ago

@oharboe on a 96 CPU machine:

before: [INFO DRT-0267] cpu time = 87:08:00, elapsed time = 02:48:43, memory = 10662.26 (MB), peak = 28213.59 (MB)
after: [INFO DRT-0267] cpu time = 06:10:19, elapsed time = 00:09:20, memory = 12216.96 (MB), peak = 15553.88 (MB)
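
For a quick sense of scale, the before and after figures above can be turned into ratios. This is a minimal Python sketch using only the numbers quoted in the comment; the helper name is hypothetical.

```python
def hms_to_seconds(text):
    """Convert an 'hh:mm:ss' string, as printed by DRT-0267, to seconds."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


# Values copied from the two DRT-0267 lines quoted above (96 CPU machine).
before = {"cpu": "87:08:00", "elapsed": "02:48:43"}
after = {"cpu": "06:10:19", "elapsed": "00:09:20"}

wall_speedup = hms_to_seconds(before["elapsed"]) / hms_to_seconds(after["elapsed"])
cpu_reduction = hms_to_seconds(before["cpu"]) / hms_to_seconds(after["cpu"])

print(f"wall-clock speedup: {wall_speedup:.1f}x")
print(f"CPU time reduction: {cpu_reduction:.1f}x")
```

On these numbers the elapsed time drops by roughly 18x and the CPU time by roughly 14x, while the reported peak memory falls from 28213.59 MB to 15553.88 MB.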