The-OpenROAD-Project / OpenROAD

OpenROAD's unified application implementing an RTL-to-GDS Flow. Documentation at https://openroad.readthedocs.io/en/latest/
https://theopenroadproject.org/
BSD 3-Clause "New" or "Revised" License

[ERROR MPL-0040] Failed on cluster #5655

Open oharboe opened 2 weeks ago

oharboe commented 2 weeks ago

13 minutes to reproduce:

untar https://drive.google.com/file/d/18n0z4_Bk9Gscy3RRCiU6FiNvghIb6zIG/view?usp=drive_link

$ time ./run-me-BoomTile-asap7-base.sh
OpenROAD v2.0-15340-g7ebef4425
Features included (+) or not (-): +Charts +GPU +GUI +Python
This program is licensed under the BSD-3 license. See the LICENSE file for details.
Components of this program may be licensed under more restrictive licenses which must be honored.
HierRTLMP Flow enabled...
rtl_macro_placer -halo_width 20 -halo_height 20 -report_directory .//objects/asap7/BoomTile/base/rtlmp -target_util 0.60
Floorplan Outline: (0.0, 0.0) (2160.73, 2160.73),  Core Outline: (1.026, 1.08) (2159.73, 2159.73)
        Number of std cell instances: 1743858
        Area of std cell instances: 220985.73
        Number of macros: 72
        Area of macros: 691249.12
        Halo width: 20.00
        Halo height: 20.00
        Area of macros with halos: 1292620.62
        Area of std cell instances + Area of macros: 912234.88
        Core area: 4659886.50
        Design Utilization: 0.20
        Core Utilization: 0.06
        Manufacturing Grid: 1

[ERROR MPL-0040] Failed on cluster frontend/bpd/banked_predictors_1/btb
Error: macro_place_util.tcl, 143 MPL-0040

Originally posted by @oharboe in https://github.com/The-OpenROAD-Project/megaboom/issues/97#issuecomment-2308988697
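
As a quick sanity check on the report above, the two utilization figures can be reproduced from the printed areas. This is only a sketch: the formulas below are inferred from the numbers in the log and are an assumption, not taken from the OpenROAD source.

```python
# Values copied from the rtl_macro_placer report above.
std_cell_area = 220985.73
macro_area = 691249.12
macro_area_with_halos = 1292620.62
core_area = 4659886.50

# Assumed formulas, inferred from the printed numbers (not from the OpenROAD source).
design_util = (std_cell_area + macro_area) / core_area
core_util = std_cell_area / (core_area - macro_area)

# How much the 20 x 20 halos inflate the area reserved for macros.
halo_growth = macro_area_with_halos / macro_area

print(f"Design Utilization: {design_util:.2f}")
print(f"Core Utilization:   {core_util:.2f}")
print(f"Macro area growth from halos: {halo_growth:.2f}x")
```

Under these assumptions the values round to the 0.20 and 0.06 shown in the log, and the halos grow the macro footprint by a factor of roughly 1.87.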

Weather-OS commented 2 weeks ago

@tonywk What is this, wrong issue?

That's a bot. GitHub is undergoing a surge of bots hosted by certain people from the Russian LUMMA forums. Backed by the government, their goal is apparently to steal as much cryptocurrency as possible.

maliberty commented 2 weeks ago

Post deleted

maliberty commented 2 weeks ago

Should this be an OR issue or is it related to the setup of megaboom?

oharboe commented 2 weeks ago

> Should this be an OR issue or is it related to the setup of megaboom?

Unknown. I have the reproduction case this morning, but I don't know anything about what is going on.

Please advise.

maliberty commented 2 weeks ago

If you want OR developers to look at it, then it is best to file it with OR. We don't track megaboom issues.

oharboe commented 2 weeks ago

@maliberty Can you transfer this issue to OpenROAD? I don't have the access permissions


maliberty commented 2 weeks ago

@AcKoucher please give this high priority (a workaround or a solution)

oharboe commented 2 weeks ago

@AcKoucher @maliberty Found a workaround, tweak initial conditions

maliberty commented 2 weeks ago

> @AcKoucher @maliberty Found a workaround, tweak initial conditions

From the megaboom PR:

I would like to see an initial diagnosis from @AcKoucher first... but yes. I hope the problem is just some existing rare problem that presents itself with some unfortunate initial conditions and that it can be solved in due course but without urgency.

AcKoucher commented 2 weeks ago

It looks like there's a combination of things that make this somewhat peculiar.

  1. After clustering, we end up with multiple mixed clusters, which are made of a few std cells and a macro (3, 7, 32, 35, 16, 17, 29).
  2. The dead space filling that we apply to mixed clusters during hierarchical macro placement annealing has a meaningful effect in just three clusters (4 --> 7, 8 --> 11, 29 --> 32).
  3. There are way too many tiny std cell clusters.

Now, the actual problem seems to be that when we get to the point of placing the children of cluster 4 in the first image (cluster 7 in the second image, after dead space filling), SA can't fit the clusters in the outline, even with the target util variation. Apparently this happens because the outline penalty never wins the fight against the boundary penalty.

However, there's something going on with the wire length, because I see zero in the debug report for all the steps (perhaps it's too small; I have to check).

------ Penalty ------
Area                       1.0186
Outline Penalty            0.4646
Wirelength                 0.0000
Boundary Penalty         102.0848
Normalized Cost           55.1202

My first suggestion would be to try decreasing the halos, as @oharboe already did, or to decrease the boundary penalty. @maliberty It looks like there's a lot going on; do you have an idea of what to aim for first?
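
To put the penalty report above in perspective, here is a small sketch that compares the raw terms. It only sums the printed values without weights; the weights that produce the reported Normalized Cost are not shown in this thread, so the shares below are illustrative and not the placer's actual cost function.

```python
# Raw penalty terms copied from the debug report above.
penalties = {
    "Area": 1.0186,
    "Outline Penalty": 0.4646,
    "Wirelength": 0.0000,
    "Boundary Penalty": 102.0848,
}

# Unweighted total: an assumption for illustration only. The real
# Normalized Cost (55.1202) uses weights that are not given here.
total = sum(penalties.values())
shares = {name: value / total for name, value in penalties.items()}

dominant = max(penalties, key=penalties.get)
ratio = penalties["Boundary Penalty"] / penalties["Outline Penalty"]

for name, share in shares.items():
    print(f"{name:<18} {share:7.2%}")
print(f"Dominant term: {dominant}")
print(f"Boundary / Outline ratio: {ratio:.0f}x")
```

With these numbers the boundary term accounts for more than 98% of the unweighted sum and is about 220 times the outline term, which fits the observation that the outline penalty never wins against the boundary penalty.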

oharboe commented 2 weeks ago

Thanks! Sounds like this is in good hands and well understood. No longer urgent for my part as we have a workaround.

AcKoucher commented 2 weeks ago

@oharboe Ok :-) I'm investigating what is going on with the clustering so we can have a proper fix.

oharboe commented 2 weeks ago

Another workaround I'm trying out is to save a macro placement. With a saved macro placement, I should avoid rtlmp errors due to slight changes in initial conditions, like changed PLACE_DENSITY.

write_macro_placement macros.tcl

oharboe commented 2 weeks ago

@AcKoucher Please confirm that the fixes work on the full testcase of 1 hour

I included a faster, 13 minute, testcase here, that I produced from the full testcase with deltaDebug.py.

There is a risk that deltaDebug.py identified other bugs than the original bug...
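
The risk mentioned above can be illustrated with a generic reduction loop. This is a minimal sketch of a greedy delta-debugging-style reducer, not the implementation of deltaDebug.py; `fake_run`, the item names, and the `FAKE-0001` error code are all hypothetical stand-ins for a real tool run.

```python
from typing import Callable, List, Sequence

ITEMS = ["clock", "alu", "btb", "fpu", "big_halo", "lsu"]


def fake_run(items: Sequence[str]) -> str:
    """Hypothetical stand-in for a tool run; returns a log string."""
    present = set(items)
    if "clock" not in present:
        # A different failure that only appears once too much is removed.
        return "[ERROR FAKE-0001] no clock"
    if {"btb", "big_halo"} <= present:
        return "[ERROR MPL-0040] Failed on cluster"
    return "ok"


def reduce_items(items: Sequence[str],
                 still_fails: Callable[[List[str]], bool]) -> List[str]:
    """Greedily drop chunks (halving the chunk size) while the failure persists."""
    current = list(items)
    chunk = max(len(current) // 2, 1)
    while True:
        i = 0
        while i < len(current):
            candidate = current[:i] + current[i + chunk:]
            if still_fails(candidate):
                current = candidate   # chunk was not needed; retry same index
            else:
                i += chunk            # chunk is needed; move past it
        if chunk == 1:
            return current
        chunk //= 2


def any_error(items: List[str]) -> bool:
    """Loose check: accepts any error at all."""
    return "ERROR" in fake_run(items)


def original_error(items: List[str]) -> bool:
    """Strict check: accepts only the error from the original report."""
    return "MPL-0040" in fake_run(items)


loose = reduce_items(ITEMS, any_error)
strict = reduce_items(ITEMS, original_error)
print("loose :", loose, "->", fake_run(loose))
print("strict:", strict, "->", fake_run(strict))
```

In this toy setup the loose check drifts to the unrelated `FAKE-0001` failure, while the strict check keeps the reduced case on MPL-0040. Matching on the specific error message narrows the risk, although two distinct bugs can still share one error code.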

oharboe commented 2 weeks ago

> @AcKoucher Please confirm that the fixes work on the full testcase of 1 hour
>
> I included a faster, 13 minute, testcase here, that I produced from the full testcase with deltaDebug.py.
>
> There is a risk that deltaDebug.py identified other bugs than the original bug...

ah, the full test-case still fails...

maliberty commented 2 weeks ago

Can you re-delta?

AcKoucher commented 2 weeks ago

@oharboe As I said in #5666, there are other problems that need to be addressed in order to actually resolve the issue. I'm investigating.

oharboe commented 2 weeks ago

@maliberty New deltaDebug: this test case takes ca. 13 minutes and fails on master:

https://drive.google.com/file/d/1klYn7s2_uJBk2Wi-vfPKK_ol8Kwv02sY/view?usp=sharing