The-OpenROAD-Project / OpenROAD

OpenROAD's unified application implementing an RTL-to-GDS Flow. Documentation at https://openroad.readthedocs.io/en/latest/
https://theopenroadproject.org/
BSD 3-Clause "New" or "Revised" License

Disproportionate runtime in repair_antennas GRT update after DRT #5668

Closed: gatecat closed this issue 2 months ago

gatecat commented 2 months ago

Describe the bug

PR https://github.com/The-OpenROAD-Project/OpenROAD/pull/5657 fixed the crash in DRT after antenna fixing in gf130. Unfortunately, on subsequent iterations repair_antennas now gets stuck updating the global routes: I gave up after about an hour and a half, whereas the whole build for this design usually takes about 20 minutes. A typical backtrace from where it is stuck:

#3  grt::FastRouteCore::mazeRouteMSMD (this=this@entry=0x63f7803b0250, iter=iter@entry=28, expand=expand@entry=144, cost_height=195, 
    ripup_threshold=ripup_threshold@entry=-1, maze_edge_threshold=0, ordering=ordering@entry=false, cost_type=1, 
    logis_cof=logis_cof@entry=0.573903441, via=0, slope=3, L=1, slack_th=@0x7ffdddbf9f78: -3.40282347e+38)
    at /home/gatecat/OpenROAD-upstream/src/grt/src/fastroute/src/maze.cpp:1473
#4  0x000063f758d1ffb7 in grt::FastRouteCore::run (this=0x63f7803b0250)
    at /home/gatecat/OpenROAD-upstream/src/grt/src/fastroute/src/FastRoute.cpp:1199
#5  0x000063f758cf2c10 in grt::GlobalRouter::findRouting (this=this@entry=0x63f7801d8f60, 
    nets=std::vector of length 1376, capacity 2048 = {...}, min_routing_layer=2, max_routing_layer=6)
    at /home/gatecat/OpenROAD-upstream/src/grt/src/GlobalRouter.cpp:446
#6  0x000063f758cff320 in grt::GlobalRouter::updateDirtyRoutes (this=0x63f7801d8f60, save_guides=save_guides@entry=false)
    at /home/gatecat/OpenROAD-upstream/src/grt/src/GlobalRouter.cpp:4493
#7  0x000063f758d018a7 in grt::IncrementalGRoute::updateRoutes (this=0x7ffdddbfa5a0, save_guides=false)
    at /home/gatecat/OpenROAD-upstream/src/grt/src/GlobalRouter.cpp:4441
#8  grt::GlobalRouter::repairAntennas (this=0x63f7801d8f60, diode_mterm=0x63f7807d5358, diode_mterm@entry=0x0, iterations=iterations@entry=5, 
    ratio_margin=ratio_margin@entry=0, num_threads=num_threads@entry=16) at /home/gatecat/OpenROAD-upstream/src/grt/src/GlobalRouter.cpp:403

Expected Behavior

GRT updates in a reasonable amount of time (say, 5 minutes max for antenna repair on this design).

Environment

Git commit: 23035d06ee55e05fd53cc6ba21834551006836bd
kernel: Linux 6.10.6-arch1-1
os: Arch Linux 
cmake version 3.30.2
CMake Warning at CMakeLists.txt:106 (message):
  OpenROAD git describe failed, using sha1 instead

-- The CXX compiler identification is GNU 14.2.1

To Reproduce

This issue can be partially reproduced on aes_sky130 with the same flow.tcl patch as in https://github.com/The-OpenROAD-Project/OpenROAD/issues/5565. It is not as dramatic as getting stuck for hours, but it still takes over 10 minutes on a design whose entire build usually takes less than 5 minutes, and it gets stuck in the same place. Hopefully this is enough to find where the performance problem lies; if not, I can look at creating another testcase for it.

eder-matheus commented 2 months ago

From what I saw in your last issue, you're running the repair_antennas command with 5 iterations. This works for GRT repair, but for post-DRT repair, you might try something like this:

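# Repeat repair and re-route until check_antennas reports no violations.
while {[check_antennas]} {
  # Remove filler cells before repairing.
  remove_fillers
  # Mark every instance as FIRM.
  foreach inst [[ord::get_db_block] getInsts] {
    $inst setPlacementStatus "FIRM"
  }
  repair_antennas
  # $all_args: the detailed_route arguments used by the flow (not defined here).
  detailed_route {*}$all_args
}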

The problem is that once the design has been detail-routed, GRT has fewer resources to work with, which makes it difficult for maze routing to find a congestion-free solution.

I will try it with the test case from issue https://github.com/The-OpenROAD-Project/OpenROAD/issues/5565 and see how it goes.

eder-matheus commented 2 months ago

@gatecat In your private design with this issue, are you running it with OpenROAD-flow-scripts? Or do you run in a similar way to the ibex_sky130hd design, using the flow.tcl file?

eder-matheus commented 2 months ago

@gatecat FYI, this is the procedure I used to run repair_antennas post-DRT. Together with PR https://github.com/The-OpenROAD-Project/OpenROAD/pull/5671, I was able to finish DRT with zero antenna violations in less than 12 minutes (total design flow).

Could you give it a try? Let me know if you're using ORFS in your design. I can adapt this for the detail_route.tcl script from this repository.

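set repair_antennas_iters 0
# Remove fillers once, before the repair loop.
remove_fillers
# Repeat while violations remain, for at most 5 iterations.
while {[check_antennas] && $repair_antennas_iters < 5} {
  puts "Iter $repair_antennas_iters"
  # Mark every instance as FIRM.
  foreach inst [[ord::get_db_block] getInsts] {
    $inst setPlacementStatus "FIRM"
  }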

  repair_antennas

  detailed_route -output_drc [make_result_file "${design}_${platform}_ant_fix_route_drc.rpt"] \
                 -output_maze [make_result_file "${design}_${platform}_ant_fix_maze.log"] \
                 -save_guide_updates \
                 -bottom_routing_layer $min_routing_layer \
                 -top_routing_layer $max_routing_layer \
                 -verbose 0

  incr repair_antennas_iters
}

filler_placement $filler_cells

set repaired_db [make_result_file ${design}_${platform}_repaired_drt.odb]
write_db $repaired_db

gatecat commented 2 months ago

Thanks. I was using ORFS, but I managed to adapt the detail_route.tcl script with an equivalent patch. Combined with #5671, this now works perfectly: repair_antennas doesn't take more than 30 seconds, which is negligible compared to the total DRT time (about 30 minutes, which is acceptable for the scale of the design).

After 5 iterations, over 600 antenna violations are reduced to just 3 very marginal ones, which is well within what is acceptable. A fully clean solution could perhaps be obtained with more iterations or by tweaking the layer derating so that post-GRT fixup is more effective, but this is more than good enough for now.

Thanks for all your development work and advice on this issue, it's been much appreciated!
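
For anyone who wants to experiment with more iterations, the bounded repair loop from this thread could be wrapped in a proc so the cap is a parameter. This is only a sketch, not code from the thread or from OpenROAD: the proc name repair_antennas_post_drt and its two parameters are hypothetical, and it assumes, as the loops above do, that check_antennas returns a non-zero value while violations remain. It only runs inside an OpenROAD Tcl session after detailed routing.

```tcl
# Hypothetical helper (not part of OpenROAD): bounded post-DRT antenna repair.
#   max_iters  upper bound on repair/re-route cycles
#   drt_args   list of options forwarded to detailed_route
# Returns the number of cycles actually run.
proc repair_antennas_post_drt {max_iters drt_args} {
  set iter 0
  # Remove fillers once, before the loop, as in the procedure above.
  remove_fillers
  while {[check_antennas] && $iter < $max_iters} {
    puts "Antenna repair iteration $iter"
    # Mark every instance as FIRM, as in the procedure above.
    foreach inst [[ord::get_db_block] getInsts] {
      $inst setPlacementStatus "FIRM"
    }
    repair_antennas
    # Re-route with the caller's detailed_route options.
    detailed_route {*}$drt_args
    incr iter
  }
  return $iter
}

# Example call; the variables are the ones used in the script above.
set drt_args [list -bottom_routing_layer $min_routing_layer \
                   -top_routing_layer $max_routing_layer \
                   -verbose 0]
set iters_used [repair_antennas_post_drt 10 $drt_args]
filler_placement $filler_cells
```

The cap of 10 is an arbitrary illustration. Whether extra iterations actually clear the last marginal violations was not tested in this thread.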