There is no call to init_floorplan, so why would it have worked earlier?
I was referring to the sedsm commands used to run this iccad04 testcase back in the day.
This design has 90+% placement density so it seems unlikely it was ever routable.
I don't know what sedsm commands are, but I don't think we've ever had this in our regressions.
sedsm stands for "silicon ensemble deep sub micron", the name of Cadence's P&R tool at the time. The iccad04 testcase contains a script that runs qplace and wroute.
This is a 6-layer testcase where the rams are only blocked up to M3, and there is no power routing in the die area.
Yes, density looks high by today's standards, but back then this must have been routable. It was part of the iccad04 testcases. I didn't make this thing up.
There is a huge discrepancy between the RUDY and global route heat maps. The ram placement could very likely be improved upon; macro_placement needs to use halos.
There is a general question whether heat maps should display the usage of "overall" routing resources or "available" routing resources. When you have rams blocking M1/M2/M3 but M4/M5/M6 available, there are two fewer layers of resources than in the stdcell areas, but if the M4/M5/M6 routing resources are only used sparingly, the ram areas shouldn't show up in dark red, no?
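For reference, RUDY in the usual formulation (Spindler & Johannes, DATE 2007) spreads each net's estimated wire volume uniformly over its bounding box and knows nothing about which layers are blocked. A rough sketch, with $B_n$ the bounding box of net $n$ of width $w_n$ and height $h_n$, and $p$ a wire pitch:

$$\mathrm{RUDY}(x,y) \;=\; \sum_{n:\,(x,y)\in B_n} \frac{p\,(w_n+h_n)}{w_n\,h_n}$$

If that matches OpenROAD's implementation, the ram areas would only dim out if the map were additionally normalized by the capacity actually left in each tile.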
Using the OpenROAD GUI to visualize the routing tracks, it seems as if the metal2/metal3 tracks are missing.
Looking at def/risc2.def.gz, they're there. Maybe OpenROAD doesn't read this "old" notation?
285 TRACKS Y -479400 DO 2399 STEP 400 LAYER metal1 metal2 ;
286 TRACKS Y -479400 DO 2399 STEP 400 LAYER metal2 metal3 metal4 ;
287 TRACKS Y -479400 DO 1200 STEP 800 LAYER metal4 metal5 metal6 ;
288 TRACKS X -479400 DO 2399 STEP 400 LAYER metal1 metal2 metal3 ;
289 TRACKS X -479400 DO 2399 STEP 400 LAYER metal3 metal4 metal5 ;
290 TRACKS X -479400 DO 1200 STEP 800 LAYER metal5 metal6 ;
291 GCELLGRID Y -479600 DO 240 STEP 4000 ;
292 GCELLGRID Y 480000 DO 1 STEP 0 ;
293 GCELLGRID X -479600 DO 240 STEP 4000 ;
294 GCELLGRID X 480000 DO 1 STEP 0 ;
I stand corrected.
a) OpenROAD reads the tracks just fine; I just didn't zoom in far enough to see them. The M2/M3 tracks are not visible unless I zoom in to less than the height of a single stdcell, whereas M1/M4/M5/M6 are brighter and visible at larger zoom factors.
b) The 4 large rams are blocked up to M5, so only vertical M6 routing is available over them. Which means the RUDY map looks as if it might correctly show the usage of available resources. But the discrepancy between the RUDY and groute maps is still suspicious to me.
And lastly, the sedsm script reads lef/def, does macro placement, stdcell placement, routing and defout.
global_route -allow_congestion
Yup, if global_route is correct (>150% on all layers), this won't route at all.
But then why does RUDY show the stdcell areas as being routable?
[WARNING GRT-0300] Timing is not available, setting critical nets percentage to 0.
[INFO GRT-0020] Min routing layer: metal1
[INFO GRT-0021] Max routing layer: metal6
[INFO GRT-0022] Global adjustment: 0%
[INFO GRT-0023] Grid origin: (0, 0)
[INFO GRT-0043] No OR_DEFAULT vias defined.
[INFO GRT-0088] Layer metal1 Track-Pitch = 0.4000 line-2-Via Pitch: 2.1200
[INFO GRT-0088] Layer metal2 Track-Pitch = 0.4000 line-2-Via Pitch: 2.1200
[INFO GRT-0088] Layer metal3 Track-Pitch = 0.4000 line-2-Via Pitch: 2.1200
[INFO GRT-0088] Layer metal4 Track-Pitch = 0.4000 line-2-Via Pitch: 2.2200
[INFO GRT-0088] Layer metal5 Track-Pitch = 0.8000 line-2-Via Pitch: 2.3200
[INFO GRT-0088] Layer metal6 Track-Pitch = 0.8000 line-2-Via Pitch: 2.3200
[INFO GRT-0019] Found 0 clock nets.
[INFO GRT-0001] Minimum degree: 2
[INFO GRT-0002] Maximum degree: 6963
[INFO GRT-0003] Macros: 7
[INFO GRT-0043] No OR_DEFAULT vias defined.
[INFO GRT-0004] Blockages: 776492
[INFO GRT-0053] Routing resources analysis:
Routing Original Derated Resource
Layer Direction Resources Resources Reduction (%)
---------------------------------------------------------------
metal1 Horizontal 54855 5081 90.74%
metal2 Vertical 54855 34481 37.14%
metal3 Horizontal 54855 34710 36.72%
metal4 Vertical 54855 35947 34.47%
metal5 Horizontal 52470 35388 32.56%
metal6 Vertical 52470 35170 32.97%
---------------------------------------------------------------
[INFO GRT-0101] Running extra iterations to remove overflow.
[WARNING GRT-0170] Net lx1/lbc1/LBC1/data/n85: Invalid index for position (-362600, 129400). Net degree: 104.
[WARNING GRT-0153] Net lx1/lbc1/LBC1/data/n85 has errors during updateRouteType2.
[INFO GRT-0103] Extra Run for hard benchmark.
[INFO GRT-0197] Via related to pin nodes: 331871
[INFO GRT-0198] Via related Steiner nodes: 17164
[INFO GRT-0199] Via filling finished.
[INFO GRT-0111] Final number of vias: 511783
[INFO GRT-0112] Final usage 3D: 1886470
[WARNING GRT-0115] Global routing finished with overflow.
[INFO GRT-0096] Final congestion report:
Layer Resource Demand Usage (%) Max H / Max V / Total Overflow
---------------------------------------------------------------------------------------
metal1 5081 53033 1043.75% 12 / 5 / 49443
metal2 34481 84001 243.62% 4 / 12 / 51518
metal3 34710 52834 152.22% 10 / 1 / 19799
metal4 35947 57522 160.02% 2 / 6 / 23561
metal5 35388 55738 157.51% 7 / 2 / 21266
metal6 35170 47993 136.46% 2 / 6 / 14417
---------------------------------------------------------------------------------------
Total 180777 351121 194.23% 37 / 32 / 180004
[INFO GRT-0018] Total wirelength: 3024996 um
[INFO GRT-0014] Routed nets: 33760
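For clarity, the Usage column in this report appears to be simply demand over derated resources; e.g. for metal1:

$$\mathrm{Usage} = \frac{\mathrm{Demand}}{\mathrm{Resource}} \times 100\% = \frac{53033}{5081} \times 100\% \approx 1043.75\%$$

so every layer is reported well above its derated capacity.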
Let's start at the beginning. If I just load the lef & def provided, I see things are already placed somewhat strangely:
Is this intended? If the design is already placed, why run placement again?
The web pages regarding the 2004 ICCAD contest seem to be lost to time.
All I can say is that this testcase came with lef/def and a sedsm command file. No result/log files, so I can only assume that the die area/pin placement worked.
More than 90% utilization for a 9-track, M1-rail stdcell library with 6 layers of metal is definitely pushing what's doable, or just not possible. In that case the def is bogus and one needs a different die size.
The def stdcell placement is obviously illegal, and the default macro placement is clearly not good either. An mpl placement puts the large M5-OBS rams on the right edge, but they need a halo so as not to block the M3 IO access and to keep stdcells out from underneath their power rings, plus a lot of spacing to allow access to the pins on their bottom edge.
My main question is why, even with bad macro placement, RUDY looks routable yet completely different from the groute map, which shows the placement as completely unroutable.
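Something along these lines might express the halo idea from above, assuming mpl's -halo/-channel options; the values are guesses, not tuned for this testcase:

# hedged sketch: keep-out halo and routing channel around the rams (microns, illustrative)
macro_placement -halo {10 10} -channel {20 20}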
The sedsm script does run:
read lef
read def
automatic macro placement
stdcell placement
routing
defout
Where did you get this data?
I agree the rudy diff needs looking at.
I probably downloaded the data at the time of the contest.
Hi Augusto,
I had to modify lef/risc2.lef.gz, changing the 3 rams' CLASS RING to CLASS BLOCK.
15531: CLASS BLOCK ;
19167: CLASS BLOCK ;
21165: CLASS BLOCK ;
The first pictures are from "macro_placement", which moved the rams to the right.
global_placement will move the rams too, so I've changed their status to + FIXED in def/risc2.def.gz to keep their original placement. This is shown in the second set of pictures.
32912:- ICACHE_INST0/SRAM tsyncram_512x32 + FIXED ( -304400 148000 ) N ;
32913:- ICACHE_TAG0/SRAM rf_128x22 + FIXED ( -119240 -42800 ) N ;
32914:- DCACHE_DATA/SRAM tsyncram_512x32 + FIXED ( -169000 -467600 ) N ;
32915:- DCACHE_TAG/SRAM rf_128x22 + FIXED ( -15720 -42800 ) N ;
32916:- DRAM_DATA/SRAM tsyncram_512x32 + FIXED ( 87800 148000 ) N ;
32917:- IRAM_DATA/SRAM tsyncram_512x32 + FIXED ( 87800 -39200 ) N ;
32918:- IRAM_VALID/SRAM rf_8x32 + FIXED ( -201220 32800 ) N ;
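As an aside, a minimal sketch of doing the same from the OpenROAD shell instead of hand-editing the DEF, assuming the odb TCL bindings behave as I expect (ord::get_db_block, findInst, setPlacementStatus):

# mark the SRAM macros FIRM so global_placement leaves them alone
set block [ord::get_db_block]
foreach name {ICACHE_INST0/SRAM ICACHE_TAG0/SRAM DCACHE_DATA/SRAM DCACHE_TAG/SRAM DRAM_DATA/SRAM IRAM_DATA/SRAM IRAM_VALID/SRAM} {
  set inst [$block findInst $name]
  if {$inst != "NULL"} {
    $inst setPlacementStatus FIRM
  }
}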
read_lef lef/risc2.lef.gz
read_def def/risc2.def.gz
# the rams are + FIXED in the def above, so global_placement won't move them
#rtl_macro_placer
#macro_placement
global_placement -density 0.95
#detailed_placement
#global_route -verbose
#detailed_route
I started detailed route 10 hours ago with the + FIXED original ram placement, and it's now at the 34th iteration with 1024 mostly metal4 violations. It might be that RUDY is more indicative of the routability of this design than groute.
Is there any way to interrupt detailed route and keep the routing/error markers for inspection?
If not, this leads to two more feature requests; I'll file them if you agree with them ;-) 1) update the view after each detailed-route iteration with error markers to show the progress, and 2) add a big "stop" button in the GUI to stop detailed routing while keeping the progress made so far.
With the original + PLACED macros I've also tried a) rtl_macro_placer, which coredumps (likely due to the missing Verilog netlist), and b) macro_placement, which places all rams to the right (first two pics).
It looks to me as if global_placement might actually provide the best hint at macro placement, even though it placed two of the large rams at the bottom with the pins facing down; one should flip those rams to get the pins to the top. Well, that failed with: [ERROR DRT-0416] Term A[2] contains offgrid pin shape. Pin shape ( 118121 -468168 ) ( 118721 -467568 ) is not a multiple of the manufacturing grid 10. [ERROR GUI-0070] DRT-0416
detailed_route has -drc_report_iter_step to report periodically; ORFS sets it to 5 by default. Stop in the GUI is something I've planned to do for a long time, so feel free to open an issue.
Well, the groute map is completely wrong; the rudy map looks valid. To look at where detailed route will fail, I limited the number of iterations to 5.
Looking at the errors after the 5th iteration, groute rather goes through the M4 OBS, with both horizontal and vertical routing, than use the available M6 (the large rams have OBS from M1-M5).
If the global router sees everything as equally congested, it has limited choices of what to do.
1) The global placer must have some notion of routing congestion; is there any way to visualize that? 2) Is there a way to increase the macro OBS violation cost for groute?
Keep in mind that after ~10 hours of detailed route, the number of M4 shorts was down to ~1000.
global_route -allow_congestion
# write a DRC report every iteration and stop drt after 5 iterations
detailed_route -drc_report_iter_step 1 -droute_end_iter 5
[INFO GRT-0096] Final congestion report:
Layer Resource Demand Usage (%) Max H / Max V / Total Overflow
---------------------------------------------------------------------------------------
metal1 4954 61139 1234.13% 15 / 4 / 57518
metal2 34480 92459 268.15% 4 / 12 / 59581
metal3 34704 53684 154.69% 8 / 2 / 20594
metal4 35947 59531 165.61% 2 / 10 / 25335
metal5 35388 57378 162.14% 7 / 2 / 22762
metal6 35170 48978 139.26% 2 / 5 / 15318
---------------------------------------------------------------------------------------
Total 180643 373169 206.58% 38 / 35 / 201108
[INFO DRT-0195] Start 5th optimization iteration.
...
[INFO DRT-0199] Number of violations = 33443.
Viol/Layer metal1 V1 metal2 V2 metal3 V3 metal4 VL metal5 VQ metal6
Cut Spacing 0 223 0 38 0 13 0 22 0 0 0
Metal Spacing 15 0 443 0 35 0 71 0 15 0 13
NS Metal 0 0 0 0 1 0 4 0 0 0 0
Recheck 0 0 0 0 0 0 10 0 0 0 6
Short 106 1 1424 8 320 3 29499 98 329 10 729
SpacingRange 0 0 2 0 2 0 0 0 3 0 0
[INFO DRT-0267] cpu time = 00:26:09, elapsed time = 00:05:30, memory = 4918.47 (MB), peak = 4998.53 (MB)
Total wire length = 2334478 um.
Total wire length on LAYER metal1 = 33168 um.
Total wire length on LAYER metal2 = 546802 um.
Total wire length on LAYER metal3 = 671212 um.
Total wire length on LAYER metal4 = 592058 um.
Total wire length on LAYER metal5 = 269319 um.
Total wire length on LAYER metal6 = 221917 um.
Total number of vias = 367893.
This starts with a bad macro placement, which in turn gives the global placement problems, and there doesn't seem to be a way to visualize its view of congestion. The global router's congestion calculation seems off, especially with regard to the large rams obstructing M1-M5. And the detailed router then gives its all to fix the mess the previous steps caused.
Ok, ... let's take the macros out of the equation. (It's still a good mpl/global_placement testcase.)
This way it's mostly a "why is the groute map off by so much?" question.
Let's also move the metal2 pins to metal4, since the global placement puts stdcells right beside the pins and this can't be routed later. (I still would want gpl's view of congestion to be visualized, a "placement congestion routing map".)
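A hedged alternative to hand-editing the pins in the DEF would be to let ppl re-place the IOs on higher layers; the layer choice here is illustrative for this stack:

# move the IOs off metal2 (layer names assumed)
place_pins -hor_layers metal3 -ver_layers metal4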
[ERROR GRT-0118] Routing congestion too high. Check the congestion heatmap in the GUI.
[ERROR GUI-0070] Error: risc2.or, 9 GRT-0118
Rudy doesn't seem to correctly display congestion at the top three rows and on the right at a similar distance.
[INFO DRT-0195] Start 4th optimization iteration.
Completing 10% with 275 violations.
...
[INFO DRT-0199] Number of violations = 160.
Viol/Layer V1 metal2
Cut Spacing 159 0
Short 0 1
It took detailed route another 30 iterations to get rid of these V1 cut spacing errors, half of the time iterating on the remaining 8 violations. Better pin access should be able to avoid all these iterations.
So the groute map is wrong; without macros this should be easily routable.
Hi @stefanottili, I have been examining the internal rudy variables used during the calculations for your first test case, with macros, and I couldn't find anything unexpected.
These are the test cases I was having a look at:
What do you mean exactly by "placement congestion routing map"? Do you want the exact routing congestion during the placement stage? I understand that the idea is to have RUDY estimate it, so we do not have to run the costly grt (as we did previously), and not have an exact routing congestion. You should be able to get the exact congestion using the "routability_use_grt" parameter during global_placement; using grt instead of rudy for routability in gpl enables grt's exact routing congestion.
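If it helps, that suggestion would look something like this (flag names taken from the comment above; not verified against the current gpl help):

# routability-driven gpl using grt's exact congestion instead of RUDY
global_placement -density 0.8 -routability_driven -routability_use_grt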
On your new test case without macros you said: "Rudy doesn't seem to correctly display congestion at the top three rows and on the right at a similar distance." Do you mean the region close to the pins? If you could please circle the region with an image editor (gimp), to be clear, I would really appreciate it. Either way, this might be only due to RUDY's expected imprecision.
The difference between grt and rudy still looks awkward. @eder-matheus, do you think this might be because of the die coordinate origin with negative values, from #5284? I see nothing extraordinary during the rudy calculations.
The RUDY map has a gap of 2 "blocks" on the right/top, whereas on left/bottom it goes right up to the die area.
The last line/row of the grid is greater than the other ones for die areas that aren't multiples of the grid size. So the gap you see is because these regions have more resources than the others, leading to less congestion.
Hmm, something doesn't quite look right to me.
Next to the pins on the top there is a pretty uniform stdcell density, and you have the additional resource requirement of the metal4 pin connections, but the rudy map goes from rows of green, yellow, orange, and red congestion indicators to, by the looks of it, a two-row-high band of "no congestion to see here at all".
DIEAREA ( -479600 -479600 ) ( 480000 480000 ) ; GCELLGRID Y -479600 DO 240 STEP 4000 ;
240 * 4000 - 479600 = 480400, so the last Y GCELL is slightly too tall for the top die area. Is it correct to assume that the GCELLs on top are of size 4000x7600?
So this box is bigger than the one below, but it will have "the same density of routing requirements".
Is there a way to dump the rudy boxes + their density info into a text file, to be able to check the coordinates and requirements?
If anything should be red, it should be the area right below the pins at the top.
There is way more routing in that area than in the red areas right below.
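One possibility for the dump request above, assuming the GUI's heatmap dump command covers this map and that it is registered under the name "RUDY" (both assumptions):

# dump the RUDY heatmap values to a file for inspection
gui::dump_heatmap RUDY rudy.csv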
I understand your point, and I think RUDY might have a bug. You can check the density value in the GUI; double-click on the "Estimated Congestion (RUDY)" option and it will show some settings. "Show Numbers" will draw the density values in each grid tile, and in these larger tiles, the density is zero. Some further investigation of the RUDY code will be required to understand why this is happening.
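A scriptable version of those dialog steps might be (map name and option spelling are assumptions on my part):

# enable the per-tile density numbers from the shell
gui::set_heatmap RUDY ShowNumbers true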
Here's a figure with the density data. The zeros there are definitely wrong.
Thanks for confirming.
By the looks of it, groute suffers from something similar.
And now one more datapoint with regard to rudy/groute maps:
I've saved routed.def, started a new openroad session, ran read_lef/read_def/global_route on it, and then displayed both the groute and rudy maps. They look very different from the groute/rudy maps at the end of routing ...
and they also look very different when reading routed.db and displaying the rudy/groute maps.
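A minimal sketch of that comparison, for anyone reproducing it (paths assumed):

write_def routed.def
# ... then, in a fresh openroad session:
read_lef lef/risc2.lef.gz
read_def routed.def
global_route -allow_congestion
# compare the rudy and routing-congestion heatmaps with those from the
# session that produced routed.def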
And for the last experiment of the day, I've moved the centered floorplan to have 0,0 at the lower left: risc2.first_quadrant.noram.def.gz
1) The rudy map gaps on the right/top don't look so pronounced any more, but are still clearly visible. 2) The weird vertical rudy congestion below the top pins is gone. 3) Global route spreads out the routing much more, but its congestion map is still dark red. 4) Detailed route still wastes 30 iterations on fixing V1 cut spacing errors.
a) Both global route and rudy behave differently when the floorplan is centered vs. when the left/bottom is at 0,0. b) The rudy and global route maps violently disagree about the amount of congestion.
risc2.first_quadrant.noram.def.gz takes out the issue of macro placement + 95% utilization and the issue of OR not handling centered floorplans.
As of 10 Aug 2024, running the RISC2 FARADAY_ICCAD04Bench led to a couple of bug fixes, but the main problems remain.
This TSMC 180, 4+2-layer testcase routes in 10 iterations, despite GRT's resource computation being completely off and showing an unroutable design.
ispd24 with a metal stack using different layer pitches doesn't show this behavior, so why is OR grt struggling with this tech?
[INFO GRT-0088] Layer metal1 Track-Pitch = 0.4000 line-2-Via Pitch: 2.1200
[INFO GRT-0088] Layer metal2 Track-Pitch = 0.4000 line-2-Via Pitch: 2.1200
[INFO GRT-0088] Layer metal3 Track-Pitch = 0.4000 line-2-Via Pitch: 2.1200
[INFO GRT-0088] Layer metal4 Track-Pitch = 0.4000 line-2-Via Pitch: 2.2200
[INFO GRT-0088] Layer metal5 Track-Pitch = 0.8000 line-2-Via Pitch: 2.3200
[INFO GRT-0088] Layer metal6 Track-Pitch = 0.8000 line-2-Via Pitch: 2.3200
Using only 5 layers, a placement density of 0.8, and a 30% reduction of the routing resources routes this design much quicker, in just 6 drt iterations. The global route resource map remains off.
[INFO GRT-0096] Final congestion report:
Layer Resource Demand Usage (%) Max H / Max V / Total Overflow
---------------------------------------------------------------------------------------
metal1 10338 60369 583.95% 12 / 2 / 54780
metal2 28124 96723 343.92% 2 / 13 / 71862
metal3 28124 36713 130.54% 6 / 2 / 12886
metal4 28124 41323 146.93% 2 / 6 / 16726
metal5 26386 31055 117.69% 4 / 1 / 8855
---------------------------------------------------------------------------------------
Total 121096 266183 219.81% 26 / 24 / 165109
read_lef lef/risc2.lef.gz
#read_def def/risc2.def.gz
read_def def/risc2.first_quadrant.noram.def.gz
# route signals on 5 layers only and derate every layer's capacity by 30%
set_routing_layers -signal metal1-metal5
set_global_routing_layer_adjustment * 0.3
#macro_placement
#global_placement -density 0.95
global_placement -density 0.8
detailed_placement
global_route -allow_congestion -verbose
detailed_route -drc_report_iter_step 1
The centered die use case is a very low priority, as OR never generates one.
This is not about the centered case any more; the last pictures/groute overflow numbers are with the "...moved the centered floorplan to have 0,0 at the lower left" DIEAREA ( 0 0 ) ( 959600 959600 ) ;
It's hard to follow the new problem with so many comments here. Could you attach a new test case with all the files and scripts needed to reproduce the problem? If the problem is not about the centered case, perhaps creating a separate issue would make sense.
see https://github.com/The-OpenROAD-Project/OpenROAD/issues/5548 for the "first_quadrant.noram" testcase.
Closed and replaced by https://github.com/The-OpenROAD-Project/OpenROAD/issues/5557.
Describe the bug
More fallout from https://github.com/The-OpenROAD-Project/OpenROAD/issues/5284
The RUDY map looks different from the global_route map.
And then there are ERRORs from detailed placement and a manually started global_route ...
I'm assuming that this testcase was P&R-able in 2004, when init_floorplan would always center the die area.
Expected Behavior
Matching rudy and global route maps.
Environment
To Reproduce
Relevant log output
No response
Screenshots
No response
Additional Context
No response