The-OpenROAD-Project / OpenROAD

OpenROAD's unified application implementing an RTL-to-GDS Flow. Documentation at https://openroad.readthedocs.io/en/latest/
https://theopenroadproject.org/
BSD 3-Clause "New" or "Revised" License

Crash in GlobalRouter repairAntennas #4878

Closed MichaelBell closed 4 months ago

MichaelBell commented 6 months ago

Describe the bug

While running the Global Routing step, on certain designs a crash occurs in repair antennas. This only seems to happen if more than one iteration is required. This seems very similar to #4471, but I have confirmed it still happens with a very recent version of OpenROAD.

Expected Behavior

The global router should not crash

OpenROAD Environment

I'm using OpenROAD through an OpenLane docker image efabless/openlane-tools:openroad_app-9741bc497a3d5f434b0dacdc3a05af4ebe78846c-centos-7-amd64 using OpenROAD 

But this happens with various versions of OpenROAD, including 6ff71c8822a72dc0c33dd5a8a10076d8691b5a4f and 0889970d1790a2617e69f253221b8bd7626e51dc (used in OpenLane tag 2024.03.07 and 2024.03.12).

It does not happen with 6f9b2bb8b808b1bb5831d4525d868212ae50517a (used in OpenLane tag 2023.11.23).

OpenLane Environment

open_pdks cd1748bb197f9b7af62a54507de6624e30363943
Kernel: Linux v5.15.146.1-microsoft-standard-WSL2
Distribution: ubuntu 22.04
Python: v3.10.12 (OK)
Container Engine: docker v20.10.25 (OK)
OpenLane Git Version: fix-antenna-crash-dev
python-venv: INSTALLED
---
PDK Version Verification Status: OK
---
Git Log (Last 3 Commits)

1648ffd 2024-03-30T19:27:25+00:00 Update OpenROAD again - Mike Bell -  (HEAD -> fix-antenna-crash)
60f6256 2024-03-29T23:08:42+00:00 Update OpenROAD - Mike Bell -  ()
a663df2 2024-03-06T15:47:26+02:00 Update OpenROAD (#2093) - Mohamed Gaber -  (grafted)
---
Git Remotes

mike    git@github.com:MichaelBell/OpenLane.git (fetch)
mike    git@github.com:MichaelBell/OpenLane.git (push)
origin  https://github.com/The-OpenROAD-Project/OpenLane.git (fetch)
origin  https://github.com/The-OpenROAD-Project/OpenLane.git (push)

To Reproduce

Clone the openroad-antenna-bug branch of https://github.com/MichaelBell/tt06-tinyQV recursively.

Follow https://docs.google.com/document/d/1aUUZ1jthRpg4QURIIyzlOaPWlmQzr-jBn3wZipVUPt4/edit#heading=h.wwc5ldl01nl5 to set up the OpenLane flow for TinyTapeout, with the following changes:

Run:

tt/tt_tool.py --create-user-config
tt/tt_tool.py --harden

I've also attached a zip of the issue reproducible folder. issue_reproducible.zip

Relevant log output

From 19-global.log:

[INFO GRT-0018] Total wirelength: 268085 um
[INFO GRT-0014] Routed nets: 5770
[INFO GRT-0006] Repairing antennas, iteration 1.
[INFO GRT-0043] No OR_DEFAULT vias defined.
[INFO GRT-0012] Found 32 antenna violations.
[INFO GRT-0015] Inserted 36 diodes.
[INFO GRT-0009] rerouting 4422 nets.
[INFO GRT-0001] Minimum degree: 2
[INFO GRT-0002] Maximum degree: 26
[INFO GRT-0006] Repairing antennas, iteration 2.
[INFO GRT-0043] No OR_DEFAULT vias defined.
[INFO GRT-0012] Found 17 antenna violations.
[INFO GRT-0015] Inserted 21 diodes.
[INFO GRT-0009] rerouting 550 nets.
[INFO GRT-0001] Minimum degree: 2
[INFO GRT-0002] Maximum degree: 16
[INFO GRT-0006] Repairing antennas, iteration 3.
[INFO GRT-0043] No OR_DEFAULT vias defined.
Signal 11 received
Stack trace:
 0# 0x0000000000D8E2B7 in openroad
 1# 0x00007F62E3B66400 in /lib64/libc.so.6
 2# odb::_dbBox::isOct() const in openroad
 3# odb::dbBox::getBox() in openroad
 4# odb::dbWirePathItr::getNextShape(odb::dbWirePathShape&) in openroad
 5# odb::tmg_conn::loadWire(odb::dbWire*) in openroad
 6# odb::tmg_conn::analyzeNet(odb::dbNet*) in openroad
 7# odb::orderWires(utl::Logger*, odb::dbNet*) in openroad
 8# grt::RepairAntennas::makeNetWire(odb::dbNet*, std::vector<grt::GSegment, std::allocator<grt::GSegment> >&, std::map<int, odb::dbTechVia*, std::less<int>, std::allocator<std::pair<int const, odb::dbTechVia*> > >&) in openroad
 9# grt::RepairAntennas::makeNetWires(std::map<odb::dbNet*, std::vector<grt::GSegment, std::allocator<grt::GSegment> >, grt::cmpById, std::allocator<std::pair<odb::dbNet* const, std::vector<grt::GSegment, std::allocator<grt::GSegment> > > > >&, int) in openroad
10# grt::RepairAntennas::checkAntennaViolations(std::map<odb::dbNet*, std::vector<grt::GSegment, std::allocator<grt::GSegment> >, grt::cmpById, std::allocator<std::pair<odb::dbNet* const, std::vector<grt::GSegment, std::allocator<grt::GSegment> > > > >&, int, odb::dbMTerm*, float) in openroad
11# grt::GlobalRouter::repairAntennas(odb::dbMTerm*, int, float) in openroad
12# 0x00000000012E4276 in openroad
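
For comparing runs like the one above across OpenROAD versions, a minimal sketch of a log summarizer may help. It is not part of OpenROAD or OpenLane; the function name `summarize_antenna_repair` is hypothetical, and the only assumption about the log format is the message IDs GRT-0006, GRT-0012 and GRT-0015 and the `Signal N received` line exactly as they appear in 19-global.log.

```python
import re

# Patterns copied from the message formats seen in 19-global.log.
ITER_RE = re.compile(r"\[INFO GRT-0006\] Repairing antennas, iteration (\d+)\.")
VIOL_RE = re.compile(r"\[INFO GRT-0012\] Found (\d+) antenna violations\.")
DIODE_RE = re.compile(r"\[INFO GRT-0015\] Inserted (\d+) diodes\.")
SIGNAL_RE = re.compile(r"Signal (\d+) received")


def summarize_antenna_repair(log_text):
    """Return (iterations, crash_signal) parsed from a global-routing log.

    iterations is a list of dicts with keys 'iteration', 'violations' and
    'diodes'; the last two stay None if the log stops before reporting them.
    crash_signal is the signal number, or None if no crash line was found.
    """
    iterations = []
    crash_signal = None
    for line in log_text.splitlines():
        m = ITER_RE.search(line)
        if m:
            iterations.append(
                {"iteration": int(m.group(1)), "violations": None, "diodes": None}
            )
            continue
        m = SIGNAL_RE.search(line)
        if m:
            crash_signal = int(m.group(1))
            break  # everything after this line is the stack trace
        if not iterations:
            continue  # ignore counts reported before the first iteration
        m = VIOL_RE.search(line)
        if m:
            iterations[-1]["violations"] = int(m.group(1))
            continue
        m = DIODE_RE.search(line)
        if m:
            iterations[-1]["diodes"] = int(m.group(1))
    return iterations, crash_signal


if __name__ == "__main__":
    sample = "\n".join([
        "[INFO GRT-0006] Repairing antennas, iteration 1.",
        "[INFO GRT-0012] Found 32 antenna violations.",
        "[INFO GRT-0015] Inserted 36 diodes.",
        "[INFO GRT-0006] Repairing antennas, iteration 2.",
        "Signal 11 received",
    ])
    print(summarize_antenna_repair(sample))
```

An iteration whose `violations` field is still `None` when a signal is reported is the one in which the crash occurred, which matches iteration 3 in the log above.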

OpenLane console log:
(venv) ~/tt/tt06-tinyQV$ tt/tt_tool.py --harden
OpenLane 1648ffd17a9430240eb753e4bb85b67ff99dc568
All rights reserved. (c) 2020-2023 Efabless Corporation and contributors.
Available under the Apache License, version 2.0. See the LICENSE file for more details.

[INFO]: Using configuration in '../work/src/config.tcl'...
[INFO]: PDK Root: /home/mdb36/tt/pdk
[INFO]: Process Design Kit: sky130A
[INFO]: Standard Cell Library: sky130_fd_sc_hd
[INFO]: Optimization Standard Cell Library: sky130_fd_sc_hd
[INFO]: Run Directory: /work/runs/wokwi
[INFO]: Removing existing /work/runs/wokwi...
[WARNING]: The variable name DESIGN_IS_CORE was renamed to FP_PDN_MULTILAYER. Update your configuration file.
[INFO]: Saving runtime environment...
[INFO]: Preparing LEF files for the nom corner...
[INFO]: Preparing LEF files for the min corner...
[INFO]: Preparing LEF files for the max corner...
[INFO]: Running linter (Verilator) (log: ../work/runs/wokwi/logs/synthesis/linter.log)...
[INFO]: 0 errors found by linter
[INFO]: 0 warnings found by linter
[STEP 1]
[INFO]: Running Synthesis (log: ../work/runs/wokwi/logs/synthesis/1-synthesis.log)...
[STEP 2]
[INFO]: Running Single-Corner Static Timing Analysis (log: ../work/runs/wokwi/logs/synthesis/2-sta.log)...
[STEP 3]
[INFO]: Running Initial Floorplanning (log: ../work/runs/wokwi/logs/floorplan/3-initial_fp.log)...
[INFO]: Floorplanned with width 329.36 and height 220.32.
[STEP 4]
[INFO]: Running IO Placement (log: ../work/runs/wokwi/logs/floorplan/4-io.log)...
[INFO]: Applying DEF template...
[STEP 5]
[INFO]: Running Tap/Decap Insertion (log: ../work/runs/wokwi/logs/floorplan/5-tap.log)...
[INFO]: Power planning with power {VPWR} and ground {VGND}...
[STEP 6]
[INFO]: Generating PDN (log: ../work/runs/wokwi/logs/floorplan/6-pdn.log)...
[STEP 7]
[INFO]: Running Global Placement (log: ../work/runs/wokwi/logs/placement/6-global.log)...
[STEP 8]
[INFO]: Running Single-Corner Static Timing Analysis (log: ../work/runs/wokwi/logs/placement/8-gpl_sta.log)...
[STEP 9]
[INFO]: Running Placement Resizer Design Optimizations (log: ../work/runs/wokwi/logs/placement/9-resizer.log)...
[STEP 10]
[INFO]: Running Detailed Placement (log: ../work/runs/wokwi/logs/placement/10-detailed.log)...
[STEP 11]
[INFO]: Running Single-Corner Static Timing Analysis (log: ../work/runs/wokwi/logs/placement/11-dpl_sta.log)...
[STEP 12]
[INFO]: Running Clock Tree Synthesis (log: ../work/runs/wokwi/logs/cts/12-cts.log)...
[STEP 13]
[INFO]: Running Single-Corner Static Timing Analysis (log: ../work/runs/wokwi/logs/cts/13-cts_sta.log)...
[STEP 14]
[INFO]: Running Placement Resizer Timing Optimizations (log: ../work/runs/wokwi/logs/cts/14-resizer.log)...
[STEP 15]
[INFO]: Running Global Routing Resizer Design Optimizations (log: ../work/runs/wokwi/logs/routing/15-resizer_design.log)...
[STEP 16]
[INFO]: Running Single-Corner Static Timing Analysis (log: ../work/runs/wokwi/logs/routing/16-rsz_design_sta.log)...
[STEP 17]
[INFO]: Running Global Routing Resizer Timing Optimizations (log: ../work/runs/wokwi/logs/routing/17-resizer_timing.log)...
[STEP 18]
[INFO]: Running Single-Corner Static Timing Analysis (log: ../work/runs/wokwi/logs/routing/18-rsz_timing_sta.log)...
[STEP 19]
[INFO]: Running Global Routing (log: ../work/runs/wokwi/logs/routing/19-global.log)...
[ERROR]: during executing openroad script /openlane/scripts/openroad/repair_antennas.tcl
[ERROR]: Log: ../work/runs/wokwi/logs/routing/19-global.log
[ERROR]: Last 10 lines:
38# 0x00007F62E869BF1E in /lib64/libtcl8.5.so
39# Tcl_EvalEx in /lib64/libtcl8.5.so
40# Tcl_Eval in /lib64/libtcl8.5.so
41# sta::sourceTclFile(char const*, bool, bool, Tcl_Interp*) in openroad
42# ord::tclAppInit(Tcl_Interp*) in openroad
43# Tcl_Main in /lib64/libtcl8.5.so
44# main in openroad
45# __libc_start_main in /lib64/libc.so.6
46# 0x0000000000D85757 in openroad
child killed: segmentation violation

[ERROR]: Creating issue reproducible...
[INFO]: Saving runtime environment...
OpenLane TCL Issue Packager

EFABLESS CORPORATION AND ALL AUTHORS OF THE OPENLANE PROJECT SHALL NOT BE HELD
LIABLE FOR ANY LEAKS THAT MAY OCCUR TO ANY PROPRIETARY DATA AS A RESULT OF USING
THIS SCRIPT. THIS SCRIPT IS PROVIDED ON AN "AS IS" BASIS, WITHOUT WARRANTIES OR
CONDITIONS OF ANY KIND.

BY USING THIS SCRIPT, YOU ACKNOWLEDGE THAT YOU FULLY UNDERSTAND THIS DISCLAIMER
AND ALL IT ENTAILS.

Parsing config file(s)…
Setting up /work/runs/wokwi/issue_reproducible…
Done.
[INFO]: Reproducible packaged at '../work/runs/wokwi/issue_reproducible'.
Traceback (most recent call last):
  File "/home/mdb36/tt/tt06-tinyQV/tt/tt_tool.py", line 150, in <module>
    project.harden()
  File "/home/mdb36/tt/tt06-tinyQV/tt/project.py", line 481, in harden
    with open(os.path.join(self.local_dir, commit_id_json_path), "w") as f:
FileNotFoundError: [Errno 2] No such file or directory: './runs/wokwi/results/final/commit_id.json'

Screenshots

No response

Additional Context

This issue also caused failures for certain Tiny Tapeout 05 projects (that previously succeeded) when testing the 2024.03.12 tag of OpenLane: https://github.com/TinyTapeout/tinytapeout-05-reharden/actions/runs/8264211005/job/22609263738 https://github.com/TinyTapeout/tinytapeout-05-reharden/actions/runs/8264211005/job/22609256508

maliberty commented 6 months ago

@luis201420 is working on rewriting this chunk of code. I believe that will solve your problem, but it will be a while until the new code is ready.

maliberty commented 6 months ago

The general goal is to no longer call orderWires during antenna repair.

dlmiles commented 6 months ago

When investigating this, the following assertion is triggered.

Is the frame posted earlier (containing the odb::orderWires(...) call) a symptom, while this frame is closer to the cause?

@maliberty There is no sign of odb::orderWires() here. Please re-evaluate whether removing the call will address this problem, or whether I should continue to investigate. Thanks.

FWIW, a crash with this frame can be demonstrated in many versions of OR going back 6 to 9 months. High congestion may be a contributing factor.

[INFO GRT-0006] Repairing antennas, iteration 3.
[INFO GRT-0043] No OR_DEFAULT vias defined.
openroad: /openroad/src/odb/src/db/dbWireCodec.cpp:393: int odb::dbWireEncoder::addTechVia(odb::dbTechVia*): Assertion `0' failed.
Signal 6 received
Stack trace:
 0# 0x0000000004065175 in openroad
 1# 0x00007F5413F2B400 in /lib64/libc.so.6
 2# gsignal in /lib64/libc.so.6
 3# abort in /lib64/libc.so.6
 4# 0x00007F5413F241A6 in /lib64/libc.so.6
 5# 0x00007F5413F24252 in /lib64/libc.so.6
 6# odb::dbWireEncoder::addTechVia(odb::dbTechVia*) in openroad
 7# grt::RepairAntennas::addWireTerms(grt::Net*, std::vector<grt::GSegment, std::allocator<grt::GSegment> >&, int, int, int, odb::dbTechLayer*, std::map<grt::RoutePt, grt::RoutePtPins, std::less<grt::RoutePt>, std::allocator<std::pair<grt::RoutePt const, grt::RoutePtPins> > >&, odb::dbWireEncoder&, std::map<int, odb::dbTechVia*, std::less<int>, std::allocator<std::pair<int const, odb::dbTechVia*> > >&, bool) in openroad
 8# grt::RepairAntennas::makeNetWire(odb::dbNet*, std::vector<grt::GSegment, std::allocator<grt::GSegment> >&, std::map<int, odb::dbTechVia*, std::less<int>, std::allocator<std::pair<int const, odb::dbTechVia*> > >&) in openroad
 9# grt::RepairAntennas::makeNetWires(std::map<odb::dbNet*, std::vector<grt::GSegment, std::allocator<grt::GSegment> >, grt::cmpById, std::allocator<std::pair<odb::dbNet* const, std::vector<grt::GSegment, std::allocator<grt::GSegment> > > > >&, int) in openroad
10# grt::RepairAntennas::checkAntennaViolations(std::map<odb::dbNet*, std::vector<grt::GSegment, std::allocator<grt::GSegment> >, grt::cmpById, std::allocator<std::pair<odb::dbNet* const, std::vector<grt::GSegment, std::allocator<grt::GSegment> > > > >&, int, odb::dbMTerm*, float) in openroad
11# grt::GlobalRouter::repairAntennas(odb::dbMTerm*, int, float) in openroad
12# grt::repair_antennas(odb::dbMTerm*, int, float) in openroad
...

maliberty commented 6 months ago

The problem is orderWires, and it is understood.

MichaelBell commented 6 months ago

This is causing problems with a number of Tiny Tapeout 6 designs, now that Tiny Tapeout has moved to an OpenLane version with the bug present.

This patch seems to work (or at least resolve the crash) on at least a couple of designs, suggesting the problem might be happening in makeNetWires, as dlmiles suggested above. But none of us in the Tiny Tapeout community fully understand the implications. Any comment on whether this is a good idea would be welcome!

https://github.com/obriensp/OpenROAD/commit/d1f0b60d7b31393ec0afb8f96e1ca02bb465b3fc

maliberty commented 6 months ago

We are in the midst of fixing another bug in the same spot of code so results will probably change again. In general such things are prone to fix one design and break another due to the brittle nature of order_wires. The real solution is still in progress but you should use whatever local methods you can until then.

dlmiles commented 5 months ago

obriensp@d1f0b60

The fix that recently landed in OR looks the same to me as the patch referenced above:

https://github.com/The-OpenROAD-Project/OpenROAD/commit/b98aca833f9827e7f20a8fa3c6defa56fa92e506#diff-02f10642344ceceb7764f552b99c58c78482dbd86eff6a8c6c17aae6407b7d70R304

eder-matheus commented 4 months ago

@MichaelBell Could you try the latest version of OpenROAD? We've merged the new antenna checker code that doesn't use orderWires, so this crash should no longer happen.

I've tried to run your attached testcase, but it fails with a missing SDC file.

MichaelBell commented 4 months ago

Thanks, I'll take a look.

At a guess of what's going wrong with the SDC file - it should be using base.sdc from my src directory, but that does read_sdc $::env(OPENLANE_ROOT)/scripts/base.sdc, maybe OPENLANE_ROOT isn't set in your environment?
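
One way to check this guess before re-running the flow is sketched below. It only mirrors the path that `read_sdc $::env(OPENLANE_ROOT)/scripts/base.sdc` would resolve to; the helper name `resolve_base_sdc` is hypothetical and is not part of OpenLane or OpenROAD.

```python
import os
from pathlib import Path


def resolve_base_sdc(env=None):
    """Return the path `$::env(OPENLANE_ROOT)/scripts/base.sdc` resolves to.

    Raises RuntimeError if OPENLANE_ROOT is unset or empty, and
    FileNotFoundError if the variable is set but the file is missing.
    """
    env = os.environ if env is None else env
    root = env.get("OPENLANE_ROOT")
    if not root:
        raise RuntimeError("OPENLANE_ROOT is not set, so base.sdc cannot be located")
    sdc = Path(root) / "scripts" / "base.sdc"
    if not sdc.is_file():
        raise FileNotFoundError(f"expected SDC file not found: {sdc}")
    return sdc


if __name__ == "__main__":
    try:
        print(f"base.sdc found at {resolve_base_sdc()}")
    except (RuntimeError, FileNotFoundError) as err:
        print(f"SDC check failed: {err}")
```

The two error types separate the two failure modes discussed here: the variable not being set at all, versus the variable being set while the file is absent from the scripts directory.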

eder-matheus commented 4 months ago

> Thanks, I'll take a look.
>
> At a guess of what's going wrong with the SDC file - it should be using base.sdc from my src directory, but that does read_sdc $::env(OPENLANE_ROOT)/scripts/base.sdc, maybe OPENLANE_ROOT isn't set in your environment?

Yeah, that's the case. I tried to find the file under openlane/scripts but couldn't see it.

MichaelBell commented 4 months ago

Weird, it should be here: https://github.com/The-OpenROAD-Project/OpenLane/blob/master/scripts/base.sdc

eder-matheus commented 4 months ago

> Weird, it should be here: https://github.com/The-OpenROAD-Project/OpenLane/blob/master/scripts/base.sdc

@MichaelBell It wasn't in the test case you attached. Either way, I'll close this issue for now, but if you have any other crashes, feel free to reopen it or create a new issue.