The-OpenROAD-Project / OpenROAD

OpenROAD's unified application implementing an RTL-to-GDS Flow. Documentation at https://openroad.readthedocs.io/en/latest/
https://theopenroadproject.org/
BSD 3-Clause "New" or "Revised" License

global route fails #5085

Closed oharboe closed 3 weeks ago

oharboe commented 3 weeks ago

Describe the bug

It is unclear why global routing fails on this design.

untar https://drive.google.com/file/d/1ISEUv6mBhmc2FQAYpRdz4Hj35kae0odG/view?usp=sharing

Run:

$ ./run-me-L1MetadataArray-asap7-test.sh 
OpenROAD v2.0-13652-g6fc686431 
[deleted]
Repair setup and hold violations...
[INFO RSZ-0094] Found 2 endpoints with setup violations.
Iteration | Resized | Buffers | Cloned Gates | Pin Swaps |   WNS   |   TNS   | Endpoint
---------------------------------------------------------------------------------------
        0 |       0 |       0 |            0 |         0 | -30.120 | -56.365 | io_read_ready
        4 |       0 |       3 |            0 |         2 |   2.896 |   0.000 | tag_array_ext/R0_en
    final |       0 |       3 |            0 |         2 |   2.896 |   0.000 | tag_array_ext/R0_en
---------------------------------------------------------------------------------------
[INFO RSZ-0040] Inserted 2 buffers.
[INFO RSZ-0043] Swapped pins on 2 instances.
[INFO RSZ-0046] Found 199 endpoints with hold violations.
Iteration | Resized | Buffers | Cloned Gates |   WNS   |   TNS   | Endpoint
---------------------------------------------------------------------------
        0 |       0 |       0 |            0 | -376.251 | -37698.145 | tag_array_ext/W0_addr[4]
[ERROR GRT-0232] Routing congestion too high. Check the congestion heatmap in the GUI.
Error: global_route.tcl, 85 GRT-0232
openroad> 

Without a congestion.rpt, it is not clear where the problem is.

Expected Behavior

Global routing should work.

Environment

OpenROAD v2.0-13652-g6fc686431

To Reproduce

See above

Relevant log output

No response

Screenshots

No response

Additional Context

Increasing the MACRO_PLACE_HALO from 20 to 30 fixes the global route issue.

oharboe commented 3 weeks ago

@eder-matheus I saw your #5086 fix 👍

Other than that, is this an interesting global route test case?

eder-matheus commented 3 weeks ago

It actually is. The congestion happens during the incremental grt, so only a subset of the nets is being routed. The congested region is pretty small, with only two gcells near the macro.
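To make "congested gcells" concrete, here is a minimal sketch of how overflow can be located on a routing grid. It assumes a simplified per-gcell model in which each gcell has one capacity and one demand value; a real global router tracks these per layer and per gcell edge, so the `GCell` class, the `congested_gcells` helper, and all numbers below are hypothetical and are not taken from OpenROAD or from the failing design.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GCell:
    x: int         # gcell column index
    y: int         # gcell row index
    capacity: int  # routing tracks available (hypothetical units)
    demand: int    # routing tracks requested by the routed nets

    @property
    def overflow(self) -> int:
        # Overflow is the demand that exceeds capacity; zero if the gcell fits.
        return max(0, self.demand - self.capacity)


def congested_gcells(gcells):
    """Return the gcells whose demand exceeds capacity, worst first."""
    hot = [g for g in gcells if g.overflow > 0]
    return sorted(hot, key=lambda g: g.overflow, reverse=True)


# Illustrative numbers only: two adjacent gcells are over capacity.
grid = [
    GCell(0, 0, capacity=10, demand=4),
    GCell(1, 0, capacity=10, demand=12),
    GCell(2, 0, capacity=10, demand=11),
    GCell(3, 0, capacity=10, demand=10),
]
hot = congested_gcells(grid)
total_overflow = sum(g.overflow for g in hot)
print([(g.x, g.y, g.overflow) for g in hot], total_overflow)
```

Even a small total overflow confined to a couple of gcells is enough for routing to be reported as congested, which matches the situation described here.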

Also, there are a few cells illegally placed, so I wonder if there's something wrong with our grt script. I'm still looking at it.

oharboe commented 3 weeks ago

Ref. @eder-matheus, there are two issues here.

  1. missing congestion.rpt. fixed.
  2. global routing should have worked.

Reopening to make sure the second problem does not slip between the cracks.

oharboe commented 3 weeks ago

@eder-matheus Is this a global route or a placement problem?

eder-matheus commented 3 weeks ago

The problem is that repair_timing inserts lots of buffers between the macro and the die boundary. The buffers close to the macro are especially difficult for grt.

Most of these buffers would be legalized after repair_timing, but we don't call DPL during repair_timing, to avoid runtime issues. A possible fix would be to update repair_timing to avoid placing buffers in the macro halos.

Pre repair_timing image

Post repair_timing image
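The idea of keeping buffers out of the macro halos can be sketched as a simple geometric filter. This is only an illustration of the concept under stated assumptions, not how repair_timing is implemented: `Rect`, `filter_buffer_sites`, the coordinates, and the halo value are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    xlo: float
    ylo: float
    xhi: float
    yhi: float

    def inflate(self, halo: float) -> "Rect":
        # Grow the rectangle by `halo` on every side.
        return Rect(self.xlo - halo, self.ylo - halo,
                    self.xhi + halo, self.yhi + halo)

    def contains(self, x: float, y: float) -> bool:
        # Boundary points count as inside.
        return self.xlo <= x <= self.xhi and self.ylo <= y <= self.yhi


def filter_buffer_sites(candidates, macros, halo):
    """Keep only candidate (x, y) sites outside every macro and its halo."""
    keep_out = [m.inflate(halo) for m in macros]
    return [(x, y) for (x, y) in candidates
            if not any(r.contains(x, y) for r in keep_out)]


# Hypothetical macro and candidate buffer locations (arbitrary units).
macro = Rect(10, 10, 50, 50)
candidates = [(5, 30), (60, 60), (80, 30), (30, 90)]
allowed = filter_buffer_sites(candidates, [macro], halo=20)
print(allowed)
```

With a halo of 20, the two candidates next to the macro are rejected and only the two farther away remain, so a buffer would not be dropped into the narrow channel beside the macro.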

oharboe commented 3 weeks ago

@eder-matheus Sounds like you have this under control. I suppose this might be tracked elsewhere?

This issue is not affecting me at the moment. For my part, feel free to close it, or keep it open if that is helpful.

eder-matheus commented 3 weeks ago

Sounds good, I'll close this issue and create a new one specifically for the repair_timing issue. I'll let you know when it's fixed.