Open litghost opened 5 years ago
I've started a route run, but the results don't look good from a router congestion standpoint. The first run starts as follows:
---- ------ ------- ---- ------- ------- ------- ----------------- --------------- -------- ---------- ---------- ---------- ---------- --------
Iter Time pres BBs Heap Re-Rtd Re-Rtd Overused RR Nodes Wirelength CPD sTNS sWNS hTNS hWNS Est Succ
(sec) fac Updt push Nets Conns (ns) (ns) (ns) (ns) (ns) Iter
---- ------ ------- ---- ------- ------- ------- ----------------- --------------- -------- ---------- ---------- ---------- ---------- --------
Warning 109: No routing path for connection to sink_rr 1816053, retrying with full device bounding box
Warning 110: 90 timing endpoints were not constrained during timing analysis
1 5.4 0.0 0 3.5e+07 5458 16533 18592 ( 0.545%) 617038 ( 7.3%) 17.930 -1.826e+04 -17.930 -0.07147 -0.036 N/A
2 70.0 0.5 69 3.4e+08 4281 13447 15423 ( 0.452%) 441122 ( 5.2%) 18.333 -1.998e+04 -18.333 -0.2645 -0.132 N/A
3 210.6 0.6 47 7.9e+08 3961 12841 14195 ( 0.416%) 449635 ( 5.3%) 17.925 -2.075e+04 -17.925 -0.2913 -0.146 N/A
4 343.8 0.8 66 1.1e+09 3722 12422 12441 ( 0.365%) 457006 ( 5.4%) 17.704 -2.123e+04 -17.704 0.000 0.000 N/A
The number of overused RR nodes actually increased compared with the run without LUT rotation support! It will take more iterations to see how this develops.
FYI, this is using the maximum site pin delay in the lookahead.
For comparison, here is the router log for the graph without LUT rotation support:
---- ------ ------- ---- ------- ------- ------- ----------------- --------------- -------- ---------- ---------- ---------- ---------- --------
Iter Time pres BBs Heap Re-Rtd Re-Rtd Overused RR Nodes Wirelength CPD sTNS sWNS hTNS hWNS Est Succ
(sec) fac Updt push Nets Conns (ns) (ns) (ns) (ns) (ns) Iter
---- ------ ------- ---- ------- ------- ------- ----------------- --------------- -------- ---------- ---------- ---------- ---------- --------
Warning 109: 90 timing endpoints were not constrained during timing analysis
1 13.7 0.0 0 7.1e+07 5615 16933 12195 ( 0.345%) 772474 ( 9.1%) 17.971 -1.959e+04 -17.971 0.000 0.000 N/A
2 117.8 0.5 108 5.3e+08 4558 13917 10500 ( 0.297%) 506006 ( 6.0%) 18.251 -2.156e+04 -18.251 0.000 0.000 N/A
3 266.2 0.6 41 1.1e+09 4335 13450 9616 ( 0.272%) 514247 ( 6.1%) 18.095 -2.240e+04 -18.095 0.000 0.000 N/A
4 360.9 0.8 71 1.5e+09 4102 12972 8808 ( 0.249%) 522693 ( 6.2%) 18.208 -2.302e+04 -18.208 0.000 0.000 N/A
5 372.9 1.1 72 1.6e+09 3810 12351 7008 ( 0.198%) 529215 ( 6.2%) 18.118 -2.333e+04 -18.118 0.000 0.000 N/A
6 390.6 1.4 52 1.7e+09 3491 11444 5575 ( 0.158%) 538942 ( 6.3%) 18.113 -2.361e+04 -18.113 0.000 0.000 N/A
7 375.3 1.9 43 1.6e+09 3037 10209 4317 ( 0.122%) 545791 ( 6.4%) 18.121 -2.397e+04 -18.121 0.000 0.000 N/A
8 318.0 2.4 25 1.4e+09 2605 8787 3099 ( 0.088%) 553873 ( 6.5%) 18.125 -2.414e+04 -18.125 0.000 0.000 N/A
9 261.2 3.1 18 1.1e+09 2143 7317 2100 ( 0.059%) 560288 ( 6.6%) 18.091 -2.423e+04 -18.091 0.000 0.000 N/A
10 185.3 4.1 20 8.3e+08 1690 5650 1367 ( 0.039%) 565327 ( 6.7%) 18.054 -2.432e+04 -18.054 0.000 0.000 35
11 127.1 5.3 11 5.6e+08 1260 4214 885 ( 0.025%) 568271 ( 6.7%) 18.073 -2.444e+04 -18.073 0.000 0.000 31
12 74.4 6.9 7 3.3e+08 898 2913 605 ( 0.017%) 570946 ( 6.7%) 18.137 -2.447e+04 -18.137 0.000 0.000 29
13 51.2 9.0 7 2.3e+08 678 2219 395 ( 0.011%) 572867 ( 6.7%) 18.135 -2.450e+04 -18.135 0.000 0.000 28
14 36.4 11.6 6 1.5e+08 500 1645 260 ( 0.007%) 574420 ( 6.8%) 18.135 -2.451e+04 -18.135 0.000 0.000 28
15 29.0 15.1 3 1.1e+08 380 1289 162 ( 0.005%) 575277 ( 6.8%) 18.135 -2.454e+04 -18.135 0.000 0.000 27
16 21.2 19.7 3 7.8e+07 296 1034 118 ( 0.003%) 575974 ( 6.8%) 18.135 -2.454e+04 -18.135 0.000 0.000 27
17 13.5 25.6 3 5.2e+07 253 865 71 ( 0.002%) 576680 ( 6.8%) 18.121 -2.455e+04 -18.121 0.000 0.000 27
18 13.0 33.3 3 4.7e+07 208 762 56 ( 0.002%) 577071 ( 6.8%) 18.121 -2.455e+04 -18.121 0.000 0.000 27
19 12.7 43.3 1 4.0e+07 196 711 34 ( 0.001%) 577711 ( 6.8%) 18.121 -2.456e+04 -18.121 0.000 0.000 28
20 7.4 56.2 1 2.3e+07 172 641 22 ( 0.001%) 577900 ( 6.8%) 18.121 -2.457e+04 -18.121 0.000 0.000 28
21 5.2 73.1 1 1.8e+07 167 613 18 ( 0.001%) 577818 ( 6.8%) 18.121 -2.457e+04 -18.121 0.000 0.000 28
22 5.2 95.0 2 1.6e+07 158 584 11 ( 0.000%) 578063 ( 6.8%) 18.121 -2.457e+04 -18.121 0.000 0.000 28
23 4.9 123.5 1 1.7e+07 157 585 4 ( 0.000%) 578290 ( 6.8%) 18.183 -2.462e+04 -18.183 0.000 0.000 28
24 2.3 160.6 0 8605946 150 556 5 ( 0.000%) 578313 ( 6.8%) 18.121 -2.457e+04 -18.121 0.000 0.000 27
25 3.3 208.8 0 1.1e+07 152 571 2 ( 0.000%) 578396 ( 6.8%) 18.183 -2.462e+04 -18.183 0.000 0.000 28
26 1.5 271.4 0 6300157 149 556 3 ( 0.000%) 578496 ( 6.8%) 18.121 -2.457e+04 -18.121 0.000 0.000 27
27 2.5 352.8 1 9001826 152 564 2 ( 0.000%) 578430 ( 6.8%) 18.121 -2.457e+04 -18.121 0.000 0.000 28
28 1.2 458.7 0 5443741 151 563 1 ( 0.000%) 578520 ( 6.8%) 18.121 -2.459e+04 -18.121 0.000 0.000 28
29 6.5 596.3 1 1.6e+07 155 581 2 ( 0.000%) 578549 ( 6.8%) 18.121 -2.457e+04 -18.121 0.000 0.000 28
30 1.2 775.1 0 5478818 151 563 0 ( 0.000%) 578623 ( 6.8%) 18.686 -2.479e+04 -18.686 0.000 0.000 29
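
To put the two logs side by side, here is a minimal sketch that pulls the "Overused RR Nodes" count out of the routing table rows. It assumes the whitespace-separated layout shown above, where the count is the eighth field of each data row; the function name `overused_per_iteration` is made up for illustration and is not part of VPR. Only the first four iterations of each run are included, since that is all that is available for the LUT rotation run.

```python
# Rows copied from the first four iterations of each log in this issue.
WITH_LUT_ROTATION = """\
Warning 110: 90 timing endpoints were not constrained during timing analysis
1 5.4 0.0 0 3.5e+07 5458 16533 18592 ( 0.545%) 617038 ( 7.3%) 17.930 -1.826e+04 -17.930 -0.07147 -0.036 N/A
2 70.0 0.5 69 3.4e+08 4281 13447 15423 ( 0.452%) 441122 ( 5.2%) 18.333 -1.998e+04 -18.333 -0.2645 -0.132 N/A
3 210.6 0.6 47 7.9e+08 3961 12841 14195 ( 0.416%) 449635 ( 5.3%) 17.925 -2.075e+04 -17.925 -0.2913 -0.146 N/A
4 343.8 0.8 66 1.1e+09 3722 12422 12441 ( 0.365%) 457006 ( 5.4%) 17.704 -2.123e+04 -17.704 0.000 0.000 N/A
"""

WITHOUT_LUT_ROTATION = """\
Warning 109: 90 timing endpoints were not constrained during timing analysis
1 13.7 0.0 0 7.1e+07 5615 16933 12195 ( 0.345%) 772474 ( 9.1%) 17.971 -1.959e+04 -17.971 0.000 0.000 N/A
2 117.8 0.5 108 5.3e+08 4558 13917 10500 ( 0.297%) 506006 ( 6.0%) 18.251 -2.156e+04 -18.251 0.000 0.000 N/A
3 266.2 0.6 41 1.1e+09 4335 13450 9616 ( 0.272%) 514247 ( 6.1%) 18.095 -2.240e+04 -18.095 0.000 0.000 N/A
4 360.9 0.8 71 1.5e+09 4102 12972 8808 ( 0.249%) 522693 ( 6.2%) 18.208 -2.302e+04 -18.208 0.000 0.000 N/A
"""


def overused_per_iteration(log_text):
    """Map iteration number -> overused RR node count for each data row."""
    result = {}
    for line in log_text.splitlines():
        tokens = line.split()
        # Data rows start with an integer iteration number; header,
        # separator, and warning lines do not, so they are skipped.
        if len(tokens) < 8 or not tokens[0].isdigit():
            continue
        # Assumed layout: Iter, Time, pres_fac, BBs Updt, Heap push,
        # Re-Rtd Nets, Re-Rtd Conns, Overused RR Nodes, ...
        result[int(tokens[0])] = int(tokens[7])
    return result


with_rot = overused_per_iteration(WITH_LUT_ROTATION)
without_rot = overused_per_iteration(WITHOUT_LUT_ROTATION)

for it in sorted(with_rot):
    ratio = with_rot[it] / without_rot[it]
    print(f"iter {it}: {with_rot[it]} vs {without_rot[it]} overused (ratio {ratio:.2f})")
```

In every one of the four iterations the run with LUT rotation has more overused nodes than the baseline at the same pres_fac.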
Having more congestion initially might be due to everything wanting the fastest LUT input. Hopefully it will still converge faster. You may also want to try a run with max_criticality clipped to something small (e.g. 0.1, or 0) so the router doesn't care that one input is faster.
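
To illustrate why clipping max_criticality should stop nets from piling onto the fastest LUT input, here is a toy sketch. It assumes a simplified timing-driven cost of the form `crit * delay + (1 - crit) * congestion_cost`, which is only a rough model of what the router does. The function names and the delay and congestion numbers are invented for illustration and are not taken from VPR or from this run.

```python
def connection_cost(delay, congestion_cost, criticality, max_criticality):
    """Simplified cost blend: criticality is clipped, then used as the weight."""
    crit = min(criticality, max_criticality)
    return crit * delay + (1.0 - crit) * congestion_cost


def pick_input(candidates, criticality, max_criticality):
    """Return the name of the candidate pin with the lowest blended cost."""
    return min(
        candidates,
        key=lambda name: connection_cost(
            candidates[name][0], candidates[name][1], criticality, max_criticality
        ),
    )


# Hypothetical LUT inputs as (delay, congestion cost) in arbitrary units:
# the fast pin is contested by many nets, the slow pin is mostly free.
candidates = {"fast_pin": (0.1, 5.0), "slow_pin": (0.3, 1.0)}

for max_crit in (0.99, 0.1, 0.0):
    choice = pick_input(candidates, criticality=1.0, max_criticality=max_crit)
    print(f"max_criticality={max_crit}: most critical connection picks {choice}")
```

Under this model, a fully critical connection ignores almost all congestion at max_criticality = 0.99 and takes the fast pin, while at 0.1 or 0 the congestion term dominates and the connection moves to the less contested pin.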
Some results are in.
LUT equivalence | pres_fac_mult | acc_fac | max_criticality | CPD (ns) | Runtime (sec) | A* factor | BB factor | first_iter_pres_fac | initial_pres_fac | Reconvergence count | Iterations |
---|---|---|---|---|---|---|---|---|---|---|---|
On | 1.3 | 1 | 0.99 | 18.4141 | 2884.13 | 1.2 | 10 | 0 | 0.5 | 1 | 34 |
On | 2 | 1 | 0.99 | 71.9663 | 1944.88 | 1.2 | 10 | 0 | 0.5 | 1 | 22 |
On | 1.3 | 1 | 0.1 | 33.5886 | 2998.58 | 1.2 | 10 | 0 | 0.5 | 1 | 17 |
On | 2 | 2 | 0.99 | 18.1179 | 1747.27 | 1.2 | 10 | 0 | 0.5 | 1 | 12 |
Off | 1.3 | 1 | 0.99 | 20.1673 | 2310.6 | 1.2 | 10 | 0 | 0.5 | 1 | 30 |
Off | 2 | 1 | 0.99 | 22.397 | 1403.21 | 1.2 | 10 | 0 | 0.5 | 1 | 16 |
Off | 2 | 1 | 0.1 | 35.1067 | 1939.35 | 1.2 | 10 | 0 | 0.5 | 1 | 18 |
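
For reference when reading the pres_fac_mult column, here is a small sketch of the present-congestion factor schedule. The schedule is inferred from the "pres fac" column of the logs above (0.0 in iteration 1, 0.5 in iteration 2, then multiplied by 1.3 each iteration) rather than taken from the VPR source, and `pres_fac_schedule` is a made-up helper name.

```python
def pres_fac_schedule(iterations, first_iter_pres_fac=0.0,
                      initial_pres_fac=0.5, pres_fac_mult=1.3):
    """Return the pres_fac used in each routing iteration (1-based order)."""
    schedule = []
    pres_fac = first_iter_pres_fac
    for iteration in range(1, iterations + 1):
        if iteration == 1:
            pres_fac = first_iter_pres_fac
        elif iteration == 2:
            pres_fac = initial_pres_fac
        else:
            # Inferred from the log: geometric growth from iteration 3 onward.
            pres_fac *= pres_fac_mult
        schedule.append(pres_fac)
    return schedule


default = pres_fac_schedule(30, pres_fac_mult=1.3)
doubled = pres_fac_schedule(30, pres_fac_mult=2.0)

for iteration in (2, 10, 30):
    print(f"iter {iteration}: mult=1.3 -> {default[iteration - 1]:.1f}, "
          f"mult=2 -> {doubled[iteration - 1]:.1f}")
```

With a multiplier of 1.3 this gives roughly 775 at iteration 30, in line with the last row of the baseline log. A multiplier of 2 reaches the same penalty level in far fewer iterations, which is consistent with the lower iteration counts in the pres_fac_mult = 2 rows of the table.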
Thanks. The fact that max_crit = 0.1 takes longer than max_crit = 0.99 is very strange. It implies something weird is happening; maybe the lookahead is not predicting the total base_cost of the resources expected on the path well?
Adding @YFWang97 @xuqinziyue @cindyhou to the discussion as they're working on SymbiFlow quality.
@vaughnbetz suggested testing whether adding LUT equivalence would enable faster congestion avoidance on the 7-series graph. This issue is for tracking the results of the router behavior in this case.