The-OpenROAD-Project / OpenROAD-flow-scripts

OpenROAD's scripts implementing an RTL-to-GDS Flow. Documentation at https://openroad-flow-scripts.readthedocs.io/en/latest/
https://theopenroadproject.org/

'Routing Congestion too high' error at Global Routing step #173

Closed lakshmi-sathi closed 1 year ago

lakshmi-sathi commented 2 years ago

Describe the bug The flow fails for this BlackParrot front-end design (bp_fe) at the Global Routing step with "Routing congestion too high" error. This issue is despite providing a large core area of 11020x8120um. It is using fakerams generated using bsg_fakeram generator as placeholders for its memories. The same routing congestion issue is noted on another BlackParrot design too which uses fakeram macros like this one. The logs of the run for this design are placed here: https://github.com/bsg-idea/bsg_sky130_designs/tree/test_sky130/designs/bp_fe/logs

Expected behavior Routing succeeding without routing congestion or violations.

Environment

  • OS: Ubuntu 20.04
  • OpenROAD-flow v2.0-880-gd1c7001ad

File Uploads Just have to place this folder in the 'designs/sky130hd/' directory and run it as usual for reproducing the issue: https://github.com/bsg-idea/bsg_sky130_designs/tree/test_sky130/designs/bp_fe

Additional context The congestion issue does not seem to appear on designs without SRAM macros. Different pin widths, pin spacings, and sizes were tried (both large and small) for the fakerams but still, the issue persists. bsg_cache_dma is a design with a fakeram for which the OpenROAD flow was successful.

@tspyrou @taylor-bsg

maliberty commented 2 years ago

When you have macros you can still have congestion issues in the channels no matter the core size. Have you tested increasing MACRO_PLACE_HALO and MACRO_PLACE_CHANNEL?

tspyrou commented 2 years ago

@lakshmi-sathi please try Matt's suggestion and close if it works or assign to @eder-matheus if it doesn't.

lakshmi-sathi commented 2 years ago

@maliberty Thanks for that pointer. I tried it out. Originally the values were:

export MACRO_PLACE_HALO ?= 1 1
export MACRO_PLACE_CHANNEL ?= 80 80

I changed them to the following (I am editing platforms/sky130hd/config.mk directly):

export MACRO_PLACE_HALO ?= 3 3
export MACRO_PLACE_CHANNEL ?= 100 100

Now for some reason it appears to do worse (the overflow values seem larger) and then it terminates with this error:

image

It doesn't seem to be a memory issue as 30GB of RAM is available on the system.

Attaching the logs for this latest run for reference log.tar.gz

vijayank88 commented 2 years ago

@lakshmi-sathi It seems you just copied the nangate45 tech example and are trying to use it with sky130 technology for GDSII generation. Have you modified the configuration file accordingly?

lakshmi-sathi commented 2 years ago

@vijayank88 Yes, I am modifying the configuration file, but I could use some pointers on which variables/options I need to pay attention to. Do you know of any options that I could try modifying to tackle this congestion issue? (I have tried density, area, cell pad sites, macro place halo and macro place channel.)

taylor-bsg commented 2 years ago

Hi Lakshmi,

If you have any fixes, can you be sure to update in the repo so others can work off of the latest?

M


lakshmi-sathi commented 2 years ago

Yes Prof., I will keep the designs in the repo up to date with the changes that I make, and if I am able to get an improved result, I will update the issue with the same.


lakshmi-sathi commented 2 years ago

@eder-matheus Hi. Two other designs from issue #174 got through the flow after adjusting the MACRO_PLACE_CHANNEL variable, so I tried adjusting MACRO_PLACE_HALO and MACRO_PLACE_CHANNEL some more, and I am trying to make sense of what I observed.

Increasing MACRO_PLACE_HALO or MACRO_PLACE_CHANNEL reduces the congestion, but increasing them beyond a point strangely worsens it (increasing them shouldn't cause congestion, am I right?). At a sweet spot of MACRO_PLACE_HALO = 100 100 and MACRO_PLACE_CHANNEL = 200 200, I get the lowest congestion, with a final overflow of 85 (image below). I have updated this in the repo linked in the main comment.

congestion_bp_fe

This is however with a very low place density of 0.03. Any place density larger than that is increasing the congestion. Could you please take a look and give me an idea what I could be missing here. I am attaching the packed results: global_route_issue.tar.gz
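A parameter sweep like the one described above can be scripted rather than done by hand-editing config.mk. The following is a minimal sketch, assuming the flow is driven by `make` and that variables assigned with `?=` can be overridden on the command line. The config path, the `PLACE_DENSITY` variable name, and the helper `sweep_commands` are assumptions for illustration, not something confirmed in this thread. It only prints the commands; it does not run them.

```python
from itertools import product


def sweep_commands(design_config, halos, channels, densities):
    """Return one `make` command line per (halo, channel, density) combination.

    Each halo/channel value is applied to both X and Y, matching the
    "N N" form used in the config.mk snippets quoted in this thread.
    """
    cmds = []
    for halo, channel, density in product(halos, channels, densities):
        cmds.append(
            f"make DESIGN_CONFIG={design_config} "
            f'MACRO_PLACE_HALO="{halo} {halo}" '
            f'MACRO_PLACE_CHANNEL="{channel} {channel}" '
            f"PLACE_DENSITY={density}"
        )
    return cmds


if __name__ == "__main__":
    # Hypothetical config path; the values are the ones mentioned in this thread.
    for cmd in sweep_commands(
        "designs/sky130hd/bp_fe/config.mk",
        halos=[1, 3, 100],
        channels=[80, 100, 200],
        densities=[0.03, 0.1],
    ):
        print(cmd)
```

Recording the final overflow for each combination would show whether the "sweet spot" is a real minimum or just noise from the placer.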

eder-matheus commented 2 years ago

Hi, @lakshmi-sathi. Here is the congestion map of your design after the congestion error: image The red regions are the ones with overflow, and they are all over the macros. Is it possible that you have unreachable pins on these macros?

maliberty commented 2 years ago

@eder-matheus the macros themselves will likely have blockages producing the red color. The red around the edges of the macros is more interesting.

rovinski commented 2 years ago

Looks to me like one of three cases:

1) The macros are strongly connected to the core and not the I/O pins. In this case, it seems like the macros are simply oriented the wrong way and should have their pins facing inwards.

2) The macros are connected strongly to both the I/O and core, but there is room to route over the macros. In this case, the router needs to be allowed to route over the macro.

3) The macros are connected strongly to both the I/O and core, and there is no room to route over the macros. In this case, there is not much that can be done. The macros are simply going to need a lot of space between them to allow for routing. Consider making design changes so that this isn't necessary.
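To tell which of the three cases applies, one could count how many of a macro's nets reach I/O pins versus core cells. The sketch below is only an illustration under stated assumptions: the netlist is a hypothetical dictionary mapping net names to sets of terminal names, and the `core_bias` threshold for "strongly connected to the core" is arbitrary. Neither comes from OpenROAD.

```python
def classify_macro(nets, macro, io_pins):
    """Count the macro's nets that touch an I/O pin and those that touch core cells.

    nets:    dict of net name -> set of terminal names (hypothetical format)
    macro:   name of the macro instance
    io_pins: set of top-level I/O pin names
    """
    to_io = to_core = 0
    for terminals in nets.values():
        if macro not in terminals:
            continue
        others = terminals - {macro}
        if others & io_pins:
            to_io += 1
        if others - io_pins:
            to_core += 1
    return to_io, to_core


def suggest(to_io, to_core, can_route_over, core_bias=10):
    """Map the counts onto the three cases; core_bias is an arbitrary threshold."""
    if to_core > core_bias * to_io:
        return "case 1: orient the macro so its pins face the core"
    if can_route_over:
        return "case 2: allow routing over the macro"
    return "case 3: leave more space between macros or change the design"
```

For example, a macro with 2 core nets and 1 I/O net that can be routed over falls into case 2, while one with no I/O nets at all falls into case 1.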

lakshmi-sathi commented 2 years ago

@eder-matheus Thanks for generating that congestion map!

And, thanks all for the responses.

About the macros: all their pins are on the left side, each pin is 900nm wide, and the spacing between pins is 1900nm or more. They also have vertical power stripes on metal 4.

The macro LEFs (fakerams) are made of metal layers 1,2,3 and 4 so the router should be able to route over the macros by default (using metal 5), am I right? And if routing over the macro is possible then the macro orientation should not be that much an issue, isn't it? Is there some option that I need to set to allow routing over the macros?

rovinski commented 2 years ago

The macro LEFs (fakerams) are made of metal layers 1,2,3 and 4 so the router should be able to route over the macros by default (using metal 5), am I right?

Only if the macro cell power grid does not block a significant number of routes on M5, and M5 is a horizontal layer. If not, then that's a good question.

And if routing over the macro is possible then the macro orientation should not be that much an issue, isn't it?

Regardless of if it's possible, if the connectivity is much stronger to the core than the I/O, it is much more beneficial to have the macro flipped in order to reduce wirelength.

lakshmi-sathi commented 2 years ago

Only if the macro cell power grid does not block a significant number of routes on M5, and M5 is a horizontal layer. If not, then that's a good question.

The macro just has vertical metal 4 power stripes for both VDD and VSS (alternate vertical metal 4 stripes), the macro LEF does not use metal 5. Do you think there might be something else that could be preventing the routing from happening over the macros?

Regarding specifying the orientation for the macro. Is this how to specify it in the macro placement config? <macro instance name> R<rotation> <x location> <y location>

rovinski commented 2 years ago

The macro cell grid is the power grid structure created by PDNgen. The macro has stripes on M4, of course, but the PDN has to connect to it by creating a grid on top of that in M5 and connecting with vias. Those straps on M5 will block some routing resources, although a grid of reasonable density would not.

lakshmi-sathi commented 2 years ago

Thanks for pointing that out, that makes sense. I will then try rotating the macros on the left so that the pins are all inwards.

muhammadusman7 commented 2 years ago

@eder-matheus Hello.. I was reading your chat here because i am also getting error of Congestion... I would like to know that how did you open the congestion map? As i am only seeing LEF files in directory but not any MAG file.

eder-matheus commented 2 years ago

@eder-matheus Hello.. I was reading your chat here because i am also getting error of Congestion... I would like to know that how did you open the congestion map? As i am only seeing LEF files in directory but not any MAG file.

Hi, @muhammadusman7. You can run OpenROAD with the GUI enabled. For the .tar.gz file sent by @lakshmi-sathi, add the -gui flag when running openroad. In the GUI, you'll see the "Congestion Map" option, as well as the Congestion Setup. Check the figure below: (image: Screen Shot 2021-10-13 at 09 02 46)

maliberty commented 2 years ago

@eder-matheus it remains odd that increasing the halo worsens the congestion. It should be at most neutral. Any thoughts?

eder-matheus commented 2 years ago

@eder-matheus it remains odd that increasing the halo worsens the congestion. It should be at most neutral. Any thoughts?

Maybe the placement gets worse when increasing the halo, resulting in more congestion. AFAIK, we don't have many designs with macros in sky130 platforms, so I don't know how our placement behaves with different macro configurations.

eder-matheus commented 2 years ago

Another comment, not related to the macro halo, but even a commercial tool cannot generate a global routing result without overflow in the test case attached by @lakshmi-sathi.

taylor-bsg commented 2 years ago

@eder-matheus What is the issue in the commercial tool? Are there too many pins on the periphery of the design, too many pins on the RAM, .. or? If M5 is free, then it seems like the ram itself should not present a significant blockage.

eder-matheus commented 2 years ago

@eder-matheus What is the issue with the commercial tool? Are there too many pins on the periphery of the design, too many pins on the RAM, .. or? If M5 is free, then it seems like the ram itself should not present a significant blockage.

@taylor-bsg The global routing ends with congestion, and the detailed routing ends with almost 200 violations (shorts and min area). All the violations occur in the macros, specifically the ones at the bottom of the core area.

lakshmi-sathi commented 2 years ago

@eder-matheus So by min area violations does it hint that there isn't enough spacing between the pins of the macro? (Currently, the spacing between the pins is 1900nm or more, does that sound fine?)

lakshmi-sathi commented 2 years ago

@eder-matheus @rovinski I tried manually specifying the macro locations and orientations via a macro placement config file with the following lines:

bp_fe_pc_gen_1.genblk1_branch_prediction_1.btb_1.btb_mem.macro_mem.mem R180 600 3800
icache_1.data_mem_banks_0__data_mem_bank.macro_mem.mem R180 600 2550
icache_1.data_mem_banks_1__data_mem_bank.macro_mem.mem R180 600 1300
icache_1.data_mem_banks_2__data_mem_bank.macro_mem.mem R180 1150 1300
icache_1.data_mem_banks_3__data_mem_bank.macro_mem.mem R180 1700 1300
icache_1.data_mem_banks_4__data_mem_bank.macro_mem.mem R0 2550 400
icache_1.data_mem_banks_5__data_mem_bank.macro_mem.mem R0 3100 400
icache_1.data_mem_banks_6__data_mem_bank.macro_mem.mem R0 3650 400
icache_1.data_mem_banks_7__data_mem_bank.macro_mem.mem R0 3650 1600
icache_1.metadata_mem.macro_mem.mem R0 3100 2800
icache_1.tag_mem.macro_mem.mem R0 3650 2800

Since the macro pins are all on the left side of the macros, I'm orienting the macros such that the pins are all always facing inwards of the die (macros on the left side are rotated 180 degrees).
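The orientation rule described above (pins are on the left edge at R0, so macros in the left half of the die are rotated to R180) can be generated rather than typed by hand. This is a minimal sketch, assuming the `<instance> R<rotation> <x> <y>` line format used in the config above; the function name and the `die_center_x` parameter are hypothetical, and comparing only the macro origin against the die centre ignores the macro's own width.

```python
def placement_line(instance, x, y, die_center_x):
    """Return one macro placement line with the pins facing the die centre.

    Assumes the macro's pins are on its left edge in R0, so a macro left of
    the centre is rotated by 180 degrees and a macro right of it is left as is.
    """
    orient = "R180" if x < die_center_x else "R0"
    return f"{instance} {orient} {x} {y}"


if __name__ == "__main__":
    # die_center_x=2125 is an illustrative value, not taken from the design.
    macros = [
        ("icache_1.data_mem_banks_0__data_mem_bank.macro_mem.mem", 600, 2550),
        ("icache_1.tag_mem.macro_mem.mem", 3650, 2800),
    ]
    for name, x, y in macros:
        print(placement_line(name, x, y, die_center_x=2125))
```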

There is still an overflow (more than before) and the flow ends with 'Routing congestion too high' error.

[INFO GRT-0105] Maze routing finished.
Final 2D results:
[INFO GRT-0126] Overflow report:
[INFO GRT-0127] Total usage          : 1478467
[INFO GRT-0128] Max H overflow       : 6
[INFO GRT-0129] Max V overflow       : 4
[INFO GRT-0130] Max overflow         : 6
[INFO GRT-0131] Number overflow edges: 496
[INFO GRT-0132] H   overflow         : 140
[INFO GRT-0133] V   overflow         : 401
[INFO GRT-0134] Final overflow       : 541

[INFO GRT-0106] Layer assignment begins.
[INFO GRT-0107] Layer assignment finished.
[INFO GRT-0197] Via related to pin nodes: 166569
[INFO GRT-0198] Via related Steiner nodes: 1088
[INFO GRT-0199] Via filling finished.
[INFO GRT-0111] Final number of vias: 237784
[INFO GRT-0112] Final usage 3D: 2191819
[WARNING GRT-0211] dbGcellGrid already exists in db. Clearing existing dbGCellGrid.
[ERROR GRT-0118] Routing congestion too high.
Error: global_route.tcl, 42 GRT-0118
Command exited with non-zero status 1
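When comparing runs, it helps to pull the overflow numbers out of the log instead of reading them by eye. The following is a small sketch that parses `[INFO GRT-xxxx] name : value` lines like the ones above into a dictionary. The regular expression is written against this log excerpt only and is an assumption about the format, not a documented interface.

```python
import re

# Matches lines such as "[INFO GRT-0134] Final overflow       : 541".
LINE = re.compile(r"\[INFO GRT-\d+\]\s+(.+?)\s*:\s*(\d+)\s*$")

SAMPLE = """\
[INFO GRT-0126] Overflow report:
[INFO GRT-0127] Total usage          : 1478467
[INFO GRT-0131] Number overflow edges: 496
[INFO GRT-0132] H   overflow         : 140
[INFO GRT-0133] V   overflow         : 401
[INFO GRT-0134] Final overflow       : 541
[ERROR GRT-0118] Routing congestion too high.
"""


def parse_overflow(log_text):
    """Return {metric name: integer value} for every numeric GRT INFO line."""
    report = {}
    for line in log_text.splitlines():
        match = LINE.match(line.strip())
        if match:
            # Collapse runs of spaces so "H   overflow" becomes "H overflow".
            name = " ".join(match.group(1).split())
            report[name] = int(match.group(2))
    return report


if __name__ == "__main__":
    for name, value in parse_overflow(SAMPLE).items():
        print(f"{name}: {value}")
```

In this excerpt the H and V overflow values (140 and 401) add up to the reported final overflow of 541.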

I tried taking the congestion map - loaded the lefs and def using 'read_lef' & 'read_def' respectively in the OpenROAD GUI and did the global route using 'global_route' command (am I doing it right?). The congestion map obtained appears better than earlier since there are no red markings around the edge of the macros: image

However, as mentioned, the congestion issue is still present, with slightly greater overflow. If the congestion were caused by inaccessible pins, it should have been solved now that the macro orientations are corrected, am I right? What else might be causing this congestion?

I am attaching the packed results: global_route_bp_fe_sky130hd_base_2021-10-19_12-53.tar.gz

lakshmi-sathi commented 2 years ago

Attaching packed result of one more design facing the same error: global_route_bsg_manycore_tile_compute_mesh_real_sky130hd_base_2021-10-25_19-28.tar.gz

This design is using a real SRAM macro instead of fakerams. The real SRAM macro has pins on all sides, unlike the fakerams which have pins only on the left side: image

I took a congestion map of the design at two different densities:

1) Place density not set in the config file: (image: bsg_manycore_tile_compute_mesh)

2) Place density set to 0.1: (image: bsg_manycore_tile_compute_mesh_lower_density_higher_channel)

In both cases, it's facing the same 'Routing congestion too high' error. Since this is a standard SRAM macro, it can't be an issue of pin spacing, could it?

eder-matheus commented 2 years ago

@lakshmi-sathi Sorry for the delay. Could you try removing the line set_macro_extension 1 in the file platforms/sky130hd/fastroute.tcl? It creates extra blockages around the macros that make it impossible for the router to access the pins. This workaround works for issue https://github.com/The-OpenROAD-Project/OpenROAD-flow-scripts/issues/174 too.
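For anyone applying this workaround to several checkouts, the edit can be scripted. The sketch below comments the line out instead of deleting it, so the change is easy to revert. The helper name is hypothetical and the script assumes it is run from the directory that contains platforms/.

```python
import re
from pathlib import Path

# Matches a line that starts (after optional indentation) with set_macro_extension.
MACRO_EXT = re.compile(r"^([ \t]*)(set_macro_extension\b.*)$", re.MULTILINE)


def disable_macro_extension(tcl_text):
    """Comment out every `set_macro_extension ...` line; leave other lines untouched."""
    return MACRO_EXT.sub(r"\1# \2", tcl_text)


if __name__ == "__main__":
    path = Path("platforms/sky130hd/fastroute.tcl")
    if path.exists():
        path.write_text(disable_macro_extension(path.read_text()))
    else:
        print(f"{path} not found; run this from the flow directory")
```

Lines that are already commented out are not matched, so running it twice does not add a second `#`.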

eder-matheus commented 2 years ago

@lakshmi-sathi The PR https://github.com/The-OpenROAD-Project/OpenROAD-flow-scripts/pull/220 should fix the issue you're facing when running the flow. Could you try it?

maliberty commented 2 years ago

@lakshmi-sathi have you tried it?

lakshmi-sathi commented 2 years ago

Sorry about the delay. I missed the previous comment. I was collecting together the different designs with which I had faced this issue. I verified this with 4 different designs (includes designs with fakerams, realrams and synthesized rams) that were facing this issue previously and I can confirm that removing set_macro_extension 1 is solving this congestion issue. Could you share what exactly is the purpose of this parameter, is it used to create blockages around the macros for some purpose?

I had tried to push quite a few different designs with sky130 SRAMs through the flow but in almost all cases (all major cases) the flow failed with congestion too high error. Only some smaller designs like tinyRocket and Serv passed the flow before this fix (and that was by adjusting the MACRO_PLACE_CHANNEL). Before the fix, I had tried several different things, like:

I have tested this fix on the BlackParrot front end design too and it passes the flow.

maliberty commented 2 years ago

Macro extension is mostly used on more advanced nodes to avoid routing too close to macros, where restrictive rules can make routing difficult. It wasn't used on sky130 until we recently added it because of a poor RAM LEF in the Chameleon design. Once we realized the issue, we moved it from the platform level to that specific design. I'm glad things are much better now.

Is there anything further to resolve here or should we close this?

lakshmi-sathi commented 2 years ago

Nothing further to resolve in this issue. I have linked this issue with #240, so that it will get closed automatically.

mousaq92 commented 2 years ago

Describe the bug The flow fails for this BlackParrot front-end design (bp_fe) at the Global Routing step with "Routing congestion too high" error. This issue is despite providing a large core area of 11020x8120um. It is using fakerams generated using bsg_fakeram generator as placeholders for its memories. The same routing congestion issue is noted on another BlackParrot design too which uses fakeram macros like this one. The logs of the run for this design are placed here: https://github.com/bsg-idea/bsg_sky130_designs/tree/test_sky130/designs/bp_fe/logs

Expected behavior Routing succeeding without routing congestion or violations.

Environment

  • OS: Ubuntu 20.04
  • OpenROAD-flow v2.0-880-gd1c7001ad

File Uploads Just have to place this folder in the 'designs/sky130hd/' directory and run it as usual for reproducing the issue: https://github.com/bsg-idea/bsg_sky130_designs/tree/test_sky130/designs/bp_fe

Additional context The congestion issue does not seem to appear on designs without SRAM macros. Different pin widths, pin spacings, and sizes were tried (both large and small) for the fakerams but still, the issue persists. bsg_cache_dma is a design with a fakeram for which the OpenROAD flow was successful.

@tspyrou @taylor-bsg

I have the same issue, except I am not using any macros in my design. Routing congestion too high.

maliberty commented 2 years ago

@mousaq92 that sounds like an unrelated issue then and should have a separate issue with a reproducer.

Ali-Sabir2 commented 2 years ago

Hi! I am facing routing congestion issue. Can anyone help me ? How to solve this problem? Thanks

Ali-Sabir2 commented 2 years ago

Screenshot from 2022-08-16 10-50-58

vijayank88 commented 2 years ago

@Ali-Sabir2 Please open your issue here: https://github.com/The-OpenROAD-Project/OpenLane/issues with an issue_reproducible generated through the flow, or package a standalone test case as described in https://github.com/The-OpenROAD-Project/OpenLane/blob/master/docs/source/using_or_issue.md

From the screenshot, it looks like an out-of-memory kill. Please increase your swap space and run the flow again.

vijayank88 commented 1 year ago

@maliberty Is this still relevant?

maliberty commented 1 year ago

This seems done based on "Nothing further to resolve in this issue. ". The other comments seem to be separate issues that should be filed as such.