The-OpenROAD-Project / OpenROAD

OpenROAD's unified application implementing an RTL-to-GDS Flow. Documentation at https://openroad.readthedocs.io/en/latest/
https://theopenroadproject.org/
BSD 3-Clause "New" or "Revised" License

CTS WNS drops in timing repair before ending up roughly at where it started #6034

Status: Open. oharboe opened this issue 4 weeks ago

oharboe commented 4 weeks ago

Describe the bug

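During repair_timing in the CTS stage, setup WNS gets much worse (from -3489.281 at iteration 0 to -6925.218 at iteration 50) before bouncing back to roughly where it started (-3282.132 at the end).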

[image]

Download and untar the test case from https://drive.google.com/file/d/1ICisYfJdyxyJbMKcgs6JkNTB2IT05afA/view?usp=sharing and then run:

./run-me-BoomTile-asap7-4.sh
OpenROAD v2.0-16787-gcd519bb5e
Features included (+) or not (-): +Charts +GPU +GUI +Python
This program is licensed under the BSD-3 license. See the LICENSE file for details.
Components of this program may be licensed under more restrictive licenses which must be honored.
[INFO ORD-0030] Using 44 thread(s).
clock_tree_synthesis -sink_clustering_enable -balance_levels
[INFO CTS-0050] Root buffer is BUFx24_ASAP7_75t_R.
[INFO CTS-0051] Sink buffer is BUFx24_ASAP7_75t_R.
[INFO CTS-0052] The following clock buffers will be used for CTS:
                    BUFx24_ASAP7_75t_R
[INFO CTS-0049] Characterization buffer is BUFx24_ASAP7_75t_R.
[INFO CTS-0007] Net "clock" found for clock "clock".
[INFO CTS-0011]  Clock net "clock" for macros has 123 sinks.
[INFO CTS-0011]  Clock net "clock_regs" for registers has 118471 sinks.
[INFO CTS-0008] TritonCTS found 2 clock nets.
[INFO CTS-0097] Characterization used 1 buffer(s) types.
[INFO CTS-0200] 40 placement blockages have been identified.
[INFO CTS-0027] Generating H-Tree topology for net clock.
[INFO CTS-0028]  Total number of sinks: 123.
[INFO CTS-0090]  Sinks will be clustered based on buffer max cap.
[INFO CTS-0030]  Number of static layers: 0.
[INFO CTS-0020]  Wire segment unit: 1350  dbu (1 um).
[INFO CTS-0023]  Original sink region: [(25141, 88188), (1974911, 1973484)].
[INFO CTS-0024]  Normalized sink region: [(18.623, 65.3244), (1462.9, 1461.84)].
[INFO CTS-0025]     Width:  1444.2741.
[INFO CTS-0026]     Height: 1396.5156.
 Level 1
    Direction: Horizontal
    Sinks per sub-region: 62
    Sub-region size: 722.1370 X 1396.5156
[INFO CTS-0034]     Segment length (rounded): 362.
 Level 2
    Direction: Vertical
    Sinks per sub-region: 31
    Sub-region size: 722.1370 X 698.2578
[INFO CTS-0034]     Segment length (rounded): 350.
 Level 3
    Direction: Horizontal
    Sinks per sub-region: 16
    Sub-region size: 361.0685 X 698.2578
[INFO CTS-0034]     Segment length (rounded): 180.
 Level 4
    Direction: Vertical
    Sinks per sub-region: 8
    Sub-region size: 361.0685 X 349.1289
[INFO CTS-0034]     Segment length (rounded): 174.
[INFO CTS-0032]  Stop criterion found. Max number of sinks is 15.
[INFO CTS-0035]  Number of sinks covered: 123.
[INFO CTS-0200] 40 placement blockages have been identified.
[INFO CTS-0027] Generating H-Tree topology for net clock_regs.
[INFO CTS-0028]  Total number of sinks: 118471.
[INFO CTS-0090]  Sinks will be clustered based on buffer max cap.
[INFO CTS-0030]  Number of static layers: 0.
[INFO CTS-0020]  Wire segment unit: 1350  dbu (1 um).
[INFO CTS-0206] Best clustering solution was found from clustering size of 30 and clustering diameter of 50.
[INFO CTS-0019]  Total number of sinks after clustering: 7317.
[INFO CTS-0024]  Normalized sink region: [(7.89957, 8.1), (1439.28, 1464.44)].
[INFO CTS-0025]     Width:  1431.3801.
[INFO CTS-0026]     Height: 1456.3355.
 Level 1
    Direction: Vertical
    Sinks per sub-region: 3659
    Sub-region size: 1431.3801 X 728.1678
[INFO CTS-0034]     Segment length (rounded): 364.
 Level 2
    Direction: Horizontal
    Sinks per sub-region: 1830
    Sub-region size: 715.6900 X 728.1678
[INFO CTS-0034]     Segment length (rounded): 358.
 Level 3
    Direction: Vertical
    Sinks per sub-region: 915
    Sub-region size: 715.6900 X 364.0839
[INFO CTS-0034]     Segment length (rounded): 182.
 Level 4
    Direction: Horizontal
    Sinks per sub-region: 458
    Sub-region size: 357.8450 X 364.0839
[INFO CTS-0034]     Segment length (rounded): 178.
 Level 5
    Direction: Vertical
    Sinks per sub-region: 229
    Sub-region size: 357.8450 X 182.0419
[INFO CTS-0034]     Segment length (rounded): 92.
 Level 6
    Direction: Horizontal
    Sinks per sub-region: 115
    Sub-region size: 178.9225 X 182.0419
[INFO CTS-0034]     Segment length (rounded): 90.
 Level 7
    Direction: Vertical
    Sinks per sub-region: 58
    Sub-region size: 178.9225 X 91.0210
[INFO CTS-0034]     Segment length (rounded): 46.
 Level 8
    Direction: Horizontal
    Sinks per sub-region: 29
    Sub-region size: 89.4613 X 91.0210
[INFO CTS-0034]     Segment length (rounded): 44.
 Level 9
    Direction: Vertical
    Sinks per sub-region: 15
    Sub-region size: 89.4613 X 45.5105
[INFO CTS-0034]     Segment length (rounded): 22.
 Level 10
    Direction: Horizontal
    Sinks per sub-region: 8
    Sub-region size: 44.7306 X 45.5105
[INFO CTS-0034]     Segment length (rounded): 22.
[INFO CTS-0032]  Stop criterion found. Max number of sinks is 15.
[INFO CTS-0035]  Number of sinks covered: 7317.
[INFO CTS-0018]     Created 195 clock buffers.
[INFO CTS-0012]     Minimum number of buffers in the clock path: 34.
[INFO CTS-0013]     Maximum number of buffers in the clock path: 34.
[INFO CTS-0015]     Created 195 clock nets.
[INFO CTS-0016]     Fanout distribution for the current clock = 4:1, 5:2, 6:4, 7:3, 8:1, 9:1, 10:2, 11:1, 16:1..
[INFO CTS-0017]     Max level of the clock tree: 4.
[INFO CTS-0018]     Created 9960 clock buffers.
[INFO CTS-0012]     Minimum number of buffers in the clock path: 46.
[INFO CTS-0013]     Maximum number of buffers in the clock path: 47.
[INFO CTS-0015]     Created 9960 clock nets.
[INFO CTS-0016]     Fanout distribution for the current clock = 1:72, 2:114, 3:112, 4:132, 5:126, 6:166, 7:168, 8:184, 9:175, 10:145, 11:181, 12:254, 13:290, 14:447, 15:680, 16:925, 17:1061, 18:1125, 19:930, 20:573, 21:279, 22:117, 23:41, 24:9, 25:5, 26:1, 28:2, 30:1..
[INFO CTS-0017]     Max level of the clock tree: 10.
[INFO CTS-0098] Clock net "clock"
[INFO CTS-0099]  Sinks 123
[INFO CTS-0100]  Leaf buffers 0
[INFO CTS-0101]  Average sink wire length 2932.19 um
[INFO CTS-0102]  Path depth 34 - 34
[INFO CTS-0207]  Leaf load cells 6483
[INFO CTS-0098] Clock net "clock_regs"
[INFO CTS-0099]  Sinks 124954
[INFO CTS-0100]  Leaf buffers 7305
[INFO CTS-0101]  Average sink wire length 1938.20 um
[INFO CTS-0102]  Path depth 42 - 47
[INFO CTS-0207]  Leaf load cells 6483
[INFO RSZ-0058] Using max wire length 162um.
[INFO RSZ-0047] Found 148 long wires.
[INFO RSZ-0048] Inserted 533 buffers in 148 nets.
Placement Analysis
---------------------------------
total displacement      14929.6 u
average displacement        0.0 u
max displacement          200.9 u
original HPWL        19189657.6 u
legalized HPWL       19327208.8 u
delta HPWL                    1 %

repair_timing -verbose -setup_margin 0 -hold_margin -200 -repair_tns 0 -skip_last_gasp
[INFO RSZ-0094] Found 75145 endpoints with setup violations.
[INFO RSZ-0099] Repairing 1 out of 75145 (0.00%) violating endpoints...
   Iter   | Removed | Resized | Inserted | Cloned |  Pin  |    WNS   |   TNS      |  Viol  | Worst
          | Buffers |  Gates  | Buffers  |  Gates | Swaps |          |            | Endpts | Endpt
---------------------------------------------------------------------------------------------------
        0 |       0 |       0 |        0 |      0 |     0 | -3489.281 | -107827944.0 |  75145 | frontend.bpd.banked_predictors_0.btb.ebtb_ext/R0_addr[1]
       10 |       5 |       0 |        5 |      0 |     2 | -4192.852 | -107095248.0 |  75145 | frontend.bpd.banked_predictors_0.btb.ebtb_ext/R0_addr[1]
       20 |       7 |       3 |        7 |      0 |     6 | -4865.464 | -109202720.0 |  75145 | frontend.bpd.banked_predictors_0.btb.ebtb_ext/R0_addr[0]
       30 |       8 |       6 |       10 |      0 |    11 | -4906.467 | -109337392.0 |  75145 | frontend.bpd.banked_predictors_0.btb.ebtb_ext/R0_addr[1]
       40 |      15 |       8 |       10 |      0 |    12 | -6793.429 | -114285952.0 |  75145 | frontend.bpd.banked_predictors_0.btb.ebtb_ext/R0_addr[1]
       50 |      17 |       9 |       12 |      0 |    18 | -6925.218 | -114462032.0 |  75145 | frontend.bpd.banked_predictors_0.btb.ebtb_ext/R0_addr[1]
       56 |       1 |       5 |       10 |      0 |     2 | -3282.132 | -104430144.0 |  75145 | frontend.bpd.banked_predictors_0.btb.ebtb_ext/R0_addr[1]
    final |       1 |       5 |       10 |      0 |     2 | -3282.132 | -104430144.0 |  75145 | frontend.bpd.banked_predictors_0.btb.ebtb_ext/R0_addr[1]
---------------------------------------------------------------------------------------------------
[INFO RSZ-0059] Removed 1 buffers.
[INFO RSZ-0040] Inserted 3 buffers.
[INFO RSZ-0041] Resized 5 instances.
[INFO RSZ-0043] Swapped pins on 2 instances.
[WARNING RSZ-0062] Unable to repair all setup violations.
[INFO RSZ-0046] Found 927 endpoints with hold violations.
Iteration | Resized | Buffers | Cloned Gates |   WNS   |   TNS   | Endpoint
---------------------------------------------------------------------------
        0 |       0 |       0 |            0 | -148.180 | -347026.844 | core.fp_rename_stage.freelist.br_alloc_lists_0\[4\]$_DFF_P_/D
       10 |       0 |    9619 |            0 | -325.298 | -253532.938 | core.iregfile.regfile_ext/W5_data[46]
       20 |       0 |    9666 |            0 | -325.298 | -253508.125 | core.iregfile.regfile_ext/W5_data[46]
       30 |       0 |    9689 |            0 | -325.298 | -253493.969 | core.iregfile.regfile_ext/W5_data[46]
    final |       0 |    9697 |            0 | -325.298 | -253489.422 | core.iregfile.regfile_ext/W5_data[46]
---------------------------------------------------------------------------
[WARNING RSZ-0064] Unable to repair all hold checks within margin.
[INFO RSZ-0032] Inserted 9697 hold buffers.
Placement Analysis
---------------------------------
total displacement       9079.6 u
average displacement        0.0 u
max displacement           23.7 u
original HPWL        19338479.8 u
legalized HPWL       19341222.7 u
delta HPWL                    0 %

Elapsed time: 52:04.85[h:]min:sec. CPU time: user 3108.61 sys 16.21 (99%). Peak memory: 41517024KB.
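
To put numbers on the dip, the setup-repair table in the log above can be parsed with a short script. This is only a sketch: the rows are copied from the log with the endpoint column left out, the column order is assumed to match the table header, and the function names `parse_rows` and `summarize` are made up for illustration. The log does not state the WNS units, so none are assumed here.

```python
import re

# Rows copied from the setup-repair table in the log (endpoint column omitted).
LOG = """\
   Iter   | Removed | Resized | Inserted | Cloned |  Pin  |    WNS   |   TNS      |  Viol  |
        0 |       0 |       0 |        0 |      0 |     0 | -3489.281 | -107827944.0 |  75145 |
       10 |       5 |       0 |        5 |      0 |     2 | -4192.852 | -107095248.0 |  75145 |
       20 |       7 |       3 |        7 |      0 |     6 | -4865.464 | -109202720.0 |  75145 |
       30 |       8 |       6 |       10 |      0 |    11 | -4906.467 | -109337392.0 |  75145 |
       40 |      15 |       8 |       10 |      0 |    12 | -6793.429 | -114285952.0 |  75145 |
       50 |      17 |       9 |       12 |      0 |    18 | -6925.218 | -114462032.0 |  75145 |
       56 |       1 |       5 |       10 |      0 |     2 | -3282.132 | -104430144.0 |  75145 |
    final |       1 |       5 |       10 |      0 |     2 | -3282.132 | -104430144.0 |  75145 |
"""

# A data row starts with an iteration number or the word "final".
ROW = re.compile(r"^\s*(\d+|final)\s*\|(.+)$")


def parse_rows(text):
    """Return a list of (iteration_label, wns, tns) tuples from the table text."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if not m:
            continue  # header or separator line
        fields = [f.strip() for f in m.group(2).split("|")]
        # Assumed order: removed, resized, inserted, cloned, swaps, WNS, TNS, ...
        rows.append((m.group(1), float(fields[5]), float(fields[6])))
    return rows


def summarize(rows):
    """Compare the starting, worst, and final WNS values."""
    start, final = rows[0], rows[-1]
    worst = min(rows, key=lambda r: r[1])  # most negative WNS
    return {
        "start_wns": start[1],
        "worst_wns": worst[1],
        "worst_iter": worst[0],
        "final_wns": final[1],
        "dip": start[1] - worst[1],        # how far WNS fell below the start
        "net_gain": final[1] - start[1],   # positive means the run ended better
    }


summary = summarize(parse_rows(LOG))
for key, value in summary.items():
    print(f"{key}: {value}")
```

On these rows the worst point is iteration 50, where WNS is roughly twice as negative as at iteration 0, while the final value is about 207 better than the start.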

Expected Behavior

I don't know what the expected behavior is here, but I thought it looked interesting enough to file a standalone test case.

Environment

OpenROAD v2.0-16787-gcd519bb5e

To Reproduce

See above

maliberty commented 4 weeks ago

We do allow some degradation in order to avoid local minima but this seems a bit extreme.
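
For readers unfamiliar with the idea, here is a generic sketch of an optimizer that tolerates temporary degradation and then rolls back to the best state it has seen. It is not taken from the OpenROAD source. The function name, the `max_worsening_passes` parameter, and the representation of moves as plain WNS deltas are all hypothetical and only illustrate the general technique.

```python
def repair_with_tolerance(initial_wns, moves, max_worsening_passes=5):
    """Apply moves in order, tolerating temporary worsening, then roll back.

    `moves` is a list of WNS deltas, hypothetical stand-ins for real
    transforms such as buffer insertion, resizing, or pin swaps.
    Returns (final_wns, kept_moves, trace).
    """
    current = best = initial_wns
    applied = []    # every move tried so far
    best_len = 0    # number of applied moves included in the best checkpoint
    stale = 0       # consecutive moves without a new best
    trace = [current]
    for delta in moves:
        current += delta
        applied.append(delta)
        trace.append(current)
        if current > best:  # less negative slack is better
            best, best_len, stale = current, len(applied), 0
        else:
            stale += 1
            if stale >= max_worsening_passes:
                break       # give up exploring past the local minimum
    # Roll back to the best checkpoint; later moves are discarded.
    return best, applied[:best_len], trace


final_wns, kept, trace = repair_with_tolerance(
    -100, [5, -20, -10, 40, -15, -15, -15], max_worsening_passes=3
)
print(final_wns, kept, min(trace))
```

In this toy run the trace dips well below the starting value while moves are being explored, yet the returned result is better than the start because the tail of unhelpful moves is discarded. A larger tolerance lets the search wander further before giving up, which makes deeper intermediate dips possible.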

precisionmoon commented 4 weeks ago

Can you clarify the issue a bit? Are you concerned about setup WNS degradation during hold fixing? Or is this about setup WNS QoR trajectory from floorplanning to CTS?

oharboe commented 4 weeks ago

I don't have any concerns or expectations, as I don't know the repair setup algorithm. I just thought the WNS dip during setup fixing might be an interesting test case that merits further study: do we understand what is happening here?

precisionmoon commented 4 weeks ago

OK. Understood. We'll look into WNS fluctuation during setup fixing.

precisionmoon commented 4 weeks ago

I'm getting an incompatible db schema error after re-building OR with the latest version: Error: cts.tcl, 4 incompatible database schema revision 0.96 > 0.90

maliberty commented 4 weeks ago

That would suggest you have an older version rather than a newer one. What commit id does OR show?

precisionmoon commented 4 weeks ago

OpenROAD v2.0-15833-ge3b4fc0c9
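
The two version strings in this thread can be compared mechanically. The sketch below assumes they follow the usual `git describe` layout (tag, number of commits since the tag, then `g` plus a short hash); the helper names `parse_describe` and `is_older` are made up for illustration. Commit counts are only meaningful when both builds come from the same branch history.

```python
import re


def parse_describe(version):
    """Split 'v2.0-16787-gcd519bb5e' into (tag, commits_since_tag, short_hash)."""
    m = re.fullmatch(
        r"(?:OpenROAD\s+)?(v[\d.]+)-(\d+)-g([0-9a-f]+)", version.strip()
    )
    if m is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return m.group(1), int(m.group(2)), m.group(3)


def is_older(a, b):
    """Return True if build `a` has fewer commits since the shared tag than `b`."""
    tag_a, count_a, _ = parse_describe(a)
    tag_b, count_b, _ = parse_describe(b)
    if tag_a != tag_b:
        raise ValueError("different base tags; commit counts are not comparable")
    return count_a < count_b


local_build = "OpenROAD v2.0-15833-ge3b4fc0c9"   # reported in this comment
issue_build = "OpenROAD v2.0-16787-gcd519bb5e"   # from the issue's log
print(is_older(local_build, issue_build))
```

By this measure the local build is 954 commits behind the build that produced the test case, which is consistent with the schema error meaning the binary was older than the database it was asked to read.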

precisionmoon commented 4 weeks ago

Sorry, I had an issue with git pull which caused the db schema error. Now I can reproduce the issue.

oharboe commented 3 weeks ago

Here is another plot from various runs. It looks like the test case in this issue is indeed a bit peculiar and merits further study:

[image]