The-OpenROAD-Project / OpenROAD-flow-scripts

OpenROAD's scripts implementing an RTL-to-GDS Flow. Documentation at https://openroad-flow-scripts.readthedocs.io/en/latest/
https://theopenroadproject.org/

OOM in Global placement asap7/aes-block #2027

Closed luarss closed 2 weeks ago

luarss commented 1 month ago

Subject

[Stage]: Global Placement.

Describe the bug

I was running AutoTuner with the following parameters for the design aes-block, and it ran out of memory.

{
  "CELL_PAD_IN_SITES_DETAIL_PLACEMENT": 3,
  "CELL_PAD_IN_SITES_GLOBAL_PLACEMENT": 2,
  "CORE_ASPECT_RATIO": 0.8471446033249062,
  "CORE_MARGIN": 2,
  "CORE_UTILIZATION": 1,
  "CTS_CLUSTER_DIAMETER": 160,
  "CTS_CLUSTER_SIZE": 183,
  "PLACE_DENSITY_LB_ADDON": 0.0800363890832451,
  "_FR_FILE_PATH": "",
  "_FR_GR_OVERFLOW": 1,
  "_FR_LAYER_ADJUST": 0.05555296235777712,
  "_PINS_DISTANCE": 1,
  "_SDC_CLK_PERIOD": 573.0071762034466
}

Expected Behavior

The run should complete consistently with both the base flow and the AutoTuner flow.

Environment

Note this is a Docker build. OR Commit a515fc6cc97a7092efd51a28c1414e2fb4e53413

Unknown git commit, this is not a git repository.

Please make sure that you have the latest code changes and add the commit
hash in the description.

kernel: Linux 6.5.0-1020-gcp
os: Ubuntu 22.04.4 LTS (Jammy Jellyfish)
cmake version 3.24.2
CMake Warning at CMakeLists.txt:98 (message):
  OpenROAD git describe failed, using sha1 instead

-- The CXX compiler identification is GNU 11.4.0
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- OpenROAD version: a515fc6cc97a7092efd51a28c1414e2fb4e53413
-- System name: Linux
-- Compiler: GNU 11.4.0
-- Build type: RELEASE
-- Install prefix: /usr/local
-- C++ Standard: 17
-- C++ Standard Required: ON
-- C++ Extensions: OFF
-- The C compiler identification is GNU 11.4.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Found Python: /usr/bin/python3.10 (found version "3.10.12") found components: Interpreter 
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE  
-- Performing Test C_COMPILER_SUPPORTS__-Wall
-- Performing Test C_COMPILER_SUPPORTS__-Wall - Success
-- Performing Test CXX_COMPILER_SUPPORTS__-Wall
-- Performing Test CXX_COMPILER_SUPPORTS__-Wall - Success
-- Performing Test C_COMPILER_SUPPORTS__-Wno-array-bounds
-- Performing Test C_COMPILER_SUPPORTS__-Wno-array-bounds - Success
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-array-bounds
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-array-bounds - Success
-- Performing Test C_COMPILER_SUPPORTS__-Wno-nonnull
-- Performing Test C_COMPILER_SUPPORTS__-Wno-nonnull - Success
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-nonnull
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-nonnull - Success
-- Performing Test C_COMPILER_SUPPORTS__-Wno-maybe-uninitialized
-- Performing Test C_COMPILER_SUPPORTS__-Wno-maybe-uninitialized - Success
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-maybe-uninitialized
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-maybe-uninitialized - Success
-- Performing Test C_COMPILER_SUPPORTS__-Wno-format-overflow
-- Performing Test C_COMPILER_SUPPORTS__-Wno-format-overflow - Success
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-format-overflow
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-format-overflow - Success
-- Performing Test C_COMPILER_SUPPORTS__-Wno-unused-variable
-- Performing Test C_COMPILER_SUPPORTS__-Wno-unused-variable - Success
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-unused-variable
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-unused-variable - Success
-- Performing Test C_COMPILER_SUPPORTS__-Wno-unused-function
-- Performing Test C_COMPILER_SUPPORTS__-Wno-unused-function - Success
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-unused-function
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-unused-function - Success
-- Performing Test C_COMPILER_SUPPORTS__-Wno-write-strings
-- Performing Test C_COMPILER_SUPPORTS__-Wno-write-strings - Success
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-write-strings
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-write-strings - Success
-- Performing Test C_COMPILER_SUPPORTS__-Wno-sign-compare
-- Performing Test C_COMPILER_SUPPORTS__-Wno-sign-compare - Success
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-sign-compare
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-sign-compare - Success
-- Performing Test C_COMPILER_SUPPORTS__-Wno-deprecated
-- Performing Test C_COMPILER_SUPPORTS__-Wno-deprecated - Success
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-deprecated
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-deprecated - Success
-- Performing Test C_COMPILER_SUPPORTS__-Wno-c++11-narrowing
-- Performing Test C_COMPILER_SUPPORTS__-Wno-c++11-narrowing - Failed
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-c++11-narrowing
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-c++11-narrowing - Failed
-- Performing Test C_COMPILER_SUPPORTS__-Wno-register
-- Performing Test C_COMPILER_SUPPORTS__-Wno-register - Failed
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-register
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-register - Success
-- Performing Test C_COMPILER_SUPPORTS__-Wno-format
-- Performing Test C_COMPILER_SUPPORTS__-Wno-format - Success
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-format
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-format - Success
-- Performing Test C_COMPILER_SUPPORTS__-Wno-reserved-user-defined-literal
-- Performing Test C_COMPILER_SUPPORTS__-Wno-reserved-user-defined-literal - Failed
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-reserved-user-defined-literal
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-reserved-user-defined-literal - Failed
-- Performing Test C_COMPILER_SUPPORTS__-fpermissive
-- Performing Test C_COMPILER_SUPPORTS__-fpermissive - Failed
-- Performing Test CXX_COMPILER_SUPPORTS__-fpermissive
-- Performing Test CXX_COMPILER_SUPPORTS__-fpermissive - Success
-- Performing Test C_COMPILER_SUPPORTS__-x
-- Performing Test C_COMPILER_SUPPORTS__-x - Failed
-- Performing Test CXX_COMPILER_SUPPORTS__-x
-- Performing Test CXX_COMPILER_SUPPORTS__-x - Failed
-- Performing Test C_COMPILER_SUPPORTS__c++
-- Performing Test C_COMPILER_SUPPORTS__c++ - Failed
-- Performing Test CXX_COMPILER_SUPPORTS__c++
-- Performing Test CXX_COMPILER_SUPPORTS__c++ - Failed
-- Performing Test C_COMPILER_SUPPORTS__-std=c++17
-- Performing Test C_COMPILER_SUPPORTS__-std=c++17 - Failed
-- Performing Test CXX_COMPILER_SUPPORTS__-std=c++17
-- Performing Test CXX_COMPILER_SUPPORTS__-std=c++17 - Success
-- Performing Test C_COMPILER_SUPPORTS__-Wno-unused-but-set-variable
-- Performing Test C_COMPILER_SUPPORTS__-Wno-unused-but-set-variable - Success
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-unused-but-set-variable
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-unused-but-set-variable - Success
-- TCL library: /usr/lib/x86_64-linux-gnu/libtcl.so
-- TCL header: /usr/include/tcl/tcl.h
-- TCL readline library: /usr/lib/x86_64-linux-gnu/libtclreadline.so
-- TCL readline header: /usr/include/x86_64-linux-gnu
-- Found SWIG: /usr/local/bin/swig (found suitable version "4.1.0", minimum required is "4.0")  
-- Using SWIG >= 4.1.0 -flatstaticmethod flag for python
-- Found Boost: /usr/local/lib/cmake/Boost-1.80.0/BoostConfig.cmake (found version "1.80.0")  
-- boost: 1.80.0
-- Found Python3: /usr/include/python3.10 (found version "3.10.12") found components: Development Development.Module Development.Embed 
-- Found ZLIB: /usr/lib/x86_64-linux-gnu/libz.so (found version "1.2.11") 
-- spdlog: 1.8.1
-- Found BISON: /usr/bin/bison (found version "3.8.2") 
-- Could NOT find Doxygen (missing: DOXYGEN_EXECUTABLE) 
-- STA version: 2.5.0
-- STA git sha: ee8d3d0fa23bfbc69f3e936ff884c3d30f5bfb59
-- System name: Linux
-- Compiler: GNU 11.4.0
-- Build type: RELEASE
-- Build CXX_FLAGS: -O3 -DNDEBUG
-- Install prefix: /usr/local
-- Found FLEX: /usr/bin/flex (found version "2.6.4") 
-- TCL library: /usr/lib/x86_64-linux-gnu/libtcl.so
-- TCL header: /usr/include/tcl/tcl.h
-- SSTA: 0
-- Found SWIG: /usr/local/bin/swig (found suitable version "4.1.0", minimum required is "3.0")  
-- STA executable: /workspace/tools/OpenROAD/src/sta/app/sta
-- Found re2: /opt/or-tools/lib/cmake/re2/re2Config.cmake (found version "9.0.0") 
-- Found Clp: /opt/or-tools/lib/cmake/Clp/ClpConfig.cmake (found version "1.17.7") 
-- Found Cbc: /opt/or-tools/lib/cmake/Cbc/CbcConfig.cmake (found version "2.10.7") 
-- Found SCIP: /opt/or-tools/lib/cmake/scip/scip-config.cmake (found version "8.0.1") 
-- Found OpenMP_CXX: -fopenmp (found version "4.5") 
-- Found OpenMP: TRUE (found version "4.5")  
-- GPU is not enabled
-- TCL library: /usr/lib/x86_64-linux-gnu/libtcl.so
-- TCL header: /usr/include/tcl/tcl.h
-- GUI is enabled
-- Charts widget is enabled
-- Found Boost: /usr/local/lib/cmake/Boost-1.80.0/BoostConfig.cmake (found version "1.80.0") found components: serialization 
-- Could NOT find VTune (missing: VTune_LIBRARIES VTune_INCLUDE_DIRS) 
-- Found Boost: /usr/local/lib/cmake/Boost-1.80.0/BoostConfig.cmake (found suitable version "1.80.0", minimum required is "1.78")  
-- TCL library: /usr/lib/x86_64-linux-gnu/libtcl.so
-- TCL header: /usr/include/tcl/tcl.h
-- Found Boost: /usr/local/lib/cmake/Boost-1.80.0/BoostConfig.cmake (found version "1.80.0") found components: serialization system thread 
Number of processor cores: 16
-- Found Boost: /usr/local/lib/cmake/Boost-1.80.0/BoostConfig.cmake (found version "1.80.0")  
-- Found Eigen3: /usr/local/share/eigen3/cmake/Eigen3Config.cmake (found version "3.4.1") 
-- TCL readline enabled
-- Tcl Extended disabled
-- Python3 enabled
-- Configuring done
-- Generating done
-- Build files have been written to: /tmp/tmp.i5S6Dtqumu

To Reproduce

Method 1: Run with original flow

Create fastroute.tcl

set_global_routing_layer_adjustment $::env(MIN_ROUTING_LAYER)-$::env(MAX_ROUTING_LAYER) 0.05555296235777712

Replace clk period in constraint2.sdc

set clk_period 573.0071762034466
cd flow
make DESIGN_CONFIG=./designs/asap7/aes-block/config.mk CELL_PAD_IN_SITES_DETAIL_PLACEMENT=3 CELL_PAD_IN_SITES_GLOBAL_PLACEMENT=2 CORE_ASPECT_RATIO=0.8471446033249062 CORE_MARGIN=2 CORE_UTILIZATION=1 CTS_CLUSTER_DIAMETER=160 CTS_CLUSTER_SIZE=183 PLACE_DENSITY_LB_ADDON=0.0800363890832451 PLACE_PIN_ARGS="-min_distance 1"  SDC_FILE=./designs/asap7/aes-block/constraint2.sdc FASTROUTE_TCL=./designs/asap7/aes-block/fastroute.tcl 
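
The mapping from the sampled parameters to the make command above can be sketched in Python. This is only an illustration of what was done by hand in this report: `build_make_command` is a hypothetical helper, not part of the flow scripts, and it assumes that the underscore-prefixed keys are applied through PLACE_PIN_ARGS, the edited constraint2.sdc, and fastroute.tcl, as in Method 1.

```python
import shlex

# Sampled parameters copied from the report above.
params = {
    "CELL_PAD_IN_SITES_DETAIL_PLACEMENT": 3,
    "CELL_PAD_IN_SITES_GLOBAL_PLACEMENT": 2,
    "CORE_ASPECT_RATIO": 0.8471446033249062,
    "CORE_MARGIN": 2,
    "CORE_UTILIZATION": 1,
    "CTS_CLUSTER_DIAMETER": 160,
    "CTS_CLUSTER_SIZE": 183,
    "PLACE_DENSITY_LB_ADDON": 0.0800363890832451,
    "_FR_FILE_PATH": "",
    "_FR_GR_OVERFLOW": 1,
    "_FR_LAYER_ADJUST": 0.05555296235777712,
    "_PINS_DISTANCE": 1,
    "_SDC_CLK_PERIOD": 573.0071762034466,
}


def build_make_command(params, design_dir="./designs/asap7/aes-block"):
    """Hypothetical helper: turn the sampled parameters into make arguments."""
    args = ["make", f"DESIGN_CONFIG={design_dir}/config.mk"]
    for key, value in params.items():
        # Underscore-prefixed keys are not make variables; they are applied
        # through the hand-edited SDC and fastroute.tcl files instead.
        if key.startswith("_"):
            continue
        args.append(f"{key}={value}")
    args.append(f"PLACE_PIN_ARGS=-min_distance {params['_PINS_DISTANCE']}")
    args.append(f"SDC_FILE={design_dir}/constraint2.sdc")
    args.append(f"FASTROUTE_TCL={design_dir}/fastroute.tcl")
    return args


command = build_make_command(params)
# shlex.join quotes the argument that contains a space.
print(shlex.join(command))
```

The clock period and the layer adjustment do not appear on the command line, so constraint2.sdc and fastroute.tcl still have to be edited as described above.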

Method 2: Run with autotuner flow.

Edit ./flow/designs/asap7/aes-block/autotuner.json

{
    "_SDC_FILE_PATH": "constraint.sdc",
    "_SDC_CLK_PERIOD": {
        "type": "float",
        "minmax": [
            573.00718,
            573.00718
        ],
        "step": 0
    },
    "CORE_UTILIZATION": {
        "type": "int",
        "minmax": [
            1,
            1
        ],
        "step": 0
    },
    "CORE_ASPECT_RATIO": {
        "type": "float",
        "minmax": [
            0.84714,
            0.84714
        ],
        "step": 0
    },
    "CORE_MARGIN": {
        "type": "int",
        "minmax": [
            2,
            2
        ],
        "step": 0
    },
    "CELL_PAD_IN_SITES_GLOBAL_PLACEMENT": {
        "type": "int",
        "minmax": [
            3,
            3
        ],
        "step": 0
    },
    "CELL_PAD_IN_SITES_DETAIL_PLACEMENT": {
        "type": "int",
        "minmax": [
            2,
            2
        ],
        "step": 0
    },
    "_FR_LAYER_ADJUST": {
        "type": "float",
        "minmax": [
            0.055553,
            0.055553
        ],
        "step": 0
    },  
    "PLACE_DENSITY_LB_ADDON": {
        "type": "float",
        "minmax": [
            0.080036,
            0.080036
        ],
        "step": 0
    },
    "CTS_CLUSTER_SIZE": {
        "type": "int",
        "minmax": [
            160,
            160
        ],
        "step": 0
    },
    "CTS_CLUSTER_DIAMETER": {
        "type": "int",
        "minmax": [
            183,
            183
        ],
        "step": 0
    },
    "_PINS_DISTANCE": {
        "type": "int",
        "minmax": [
            1,
            1
        ],
        "step": 0
    },
    "_FR_FILE_PATH": ""
}
cd tools/AutoTuner/src/autotuner
python3 distributed.py --design aes-block --platform asap7 --config ../../../../flow/designs/asap7/aes-block/autotuner.json tune --samples 1
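
A fixed-value config like the one above can also be generated instead of edited by hand. The sketch below is an assumption-laden illustration: `pin_parameters` is a hypothetical helper, and the schema (type, minmax, step) is inferred only from the JSON shown in this report, not from the AutoTuner source.

```python
import json


def pin_parameters(params, sdc_file="constraint.sdc"):
    """Hypothetical helper: pin every numeric parameter to a single value."""
    config = {"_SDC_FILE_PATH": sdc_file}
    for key, value in params.items():
        if isinstance(value, str):
            # String entries such as "_FR_FILE_PATH" are copied unchanged.
            config[key] = value
            continue
        kind = "int" if isinstance(value, int) else "float"
        # Equal min and max with step 0 leaves the tuner a single choice.
        config[key] = {"type": kind, "minmax": [value, value], "step": 0}
    return config


example = {
    "CORE_UTILIZATION": 1,
    "CORE_ASPECT_RATIO": 0.84714,
    "_FR_FILE_PATH": "",
}
pinned = pin_parameters(example)
print(json.dumps(pinned, indent=4))
```

Generating the file from the sampled parameters keeps each value attached to its own key, which is easy to get wrong when copying numbers by hand.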

Relevant log output

With the base flow, the run completes through the final stage without errors.

With the AutoTuner flow, it runs out of memory at 3_3_place_gp.

Here is the log for GP:

OpenROAD a515fc6cc97a7092efd51a28c1414e2fb4e53413 
Features included (+) or not (-): +Charts +GPU +GUI +Python
This program is licensed under the BSD-3 license. See the LICENSE file for details.
Components of this program may be licensed under more restrictive licenses which must be honored.
[INFO GPL-0002] DBU: 1000
[INFO GPL-0003] SiteSize: (  0.054  0.270 ) um
[INFO GPL-0004] CoreBBox: (  2.052  2.160 ) ( 4087.368 3462.750 ) um
[INFO GPL-0006] NumInstances:           1096820
[INFO GPL-0007] NumPlaceInstances:         3773
[INFO GPL-0008] NumFixedInstances:      1071204
[INFO GPL-0009] NumDummyInstances:        21843
[INFO GPL-0010] NumNets:                   4224
[INFO GPL-0011] NumPins:                  11992
[INFO GPL-0012] DieBBox:  (  0.000  0.000 ) ( 4089.402 3464.907 ) um
[INFO GPL-0013] CoreBBox: (  2.052  2.160 ) ( 4087.368 3462.750 ) um
[INFO GPL-0016] CoreArea:            14137603.696 um^2
[INFO GPL-0017] NonPlaceInstsArea:   210016.050 um^2
[INFO GPL-0018] PlaceInstsArea:         808.461 um^2
[INFO GPL-0019] Util:                     0.006 %
[INFO GPL-0020] StdInstsArea:           808.461 um^2
[INFO GPL-0021] MacroInstsArea:           0.000 um^2
[INFO GPL-0031] FillerInit:NumGCells:  66473702
[INFO GPL-0032] FillerInit:NumGNets:       4224
[INFO GPL-0033] FillerInit:NumGPins:      11992
[INFO GPL-0023] TargetDensity:            1.000
[INFO GPL-0024] AvrgPlaceInstArea:        0.214 um^2
[INFO GPL-0025] IdealBinArea:             0.214 um^2
[INFO GPL-0026] IdealBinCnt:           65978782
[INFO GPL-0027] TotalBinArea:        14137603.696 um^2
[INFO GPL-0028] BinCnt:      2048   2048
[INFO GPL-0029] BinSize: (  1.995  1.690 )
[INFO GPL-0030] NumBins: 4194304
global_placement -density 0.09008940156576627 -pad_left 3 -pad_right 3 -routability_driven -timing_driven
[INFO GPL-0002] DBU: 1000
[INFO GPL-0003] SiteSize: (  0.054  0.270 ) um
[INFO GPL-0004] CoreBBox: (  2.052  2.160 ) ( 4087.368 3462.750 ) um
[INFO GPL-0006] NumInstances:           1096820
[INFO GPL-0007] NumPlaceInstances:         3773
[INFO GPL-0008] NumFixedInstances:      1071204
[INFO GPL-0009] NumDummyInstances:        21843
[INFO GPL-0010] NumNets:                   4224
[INFO GPL-0011] NumPins:                  11992
[INFO GPL-0012] DieBBox:  (  0.000  0.000 ) ( 4089.402 3464.907 ) um
[INFO GPL-0013] CoreBBox: (  2.052  2.160 ) ( 4087.368 3462.750 ) um
[INFO GPL-0016] CoreArea:            14137603.696 um^2
[INFO GPL-0017] NonPlaceInstsArea:   210016.050 um^2
[INFO GPL-0018] PlaceInstsArea:         808.461 um^2
[INFO GPL-0019] Util:                     0.006 %
[INFO GPL-0020] StdInstsArea:           808.461 um^2
[INFO GPL-0021] MacroInstsArea:           0.000 um^2
[InitialPlace]  Iter: 1 CG residual: 0.11591829 HPWL: 560276183
[InitialPlace]  Iter: 2 CG residual: 0.00597767 HPWL: 231782666
[InitialPlace]  Iter: 3 CG residual: 0.00044529 HPWL: 224291019
[InitialPlace]  Iter: 4 CG residual: 0.00002039 HPWL: 220613673
[InitialPlace]  Iter: 5 CG residual: 0.00002072 HPWL: 218262358
[InitialPlace]  Iter: 6 CG residual: 0.00001796 HPWL: 216149878
[InitialPlace]  Iter: 7 CG residual: 0.00000793 HPWL: 214449738
[INFO GPL-0031] FillerInit:NumGCells:   5988498
[INFO GPL-0032] FillerInit:NumGNets:       4224
[INFO GPL-0033] FillerInit:NumGPins:      11992
[INFO GPL-0023] TargetDensity:            0.090
[INFO GPL-0024] AvrgPlaceInstArea:        0.214 um^2
[INFO GPL-0025] IdealBinArea:             2.378 um^2
[INFO GPL-0026] IdealBinCnt:            5943988
[INFO GPL-0027] TotalBinArea:        14137603.696 um^2
[INFO GPL-0028] BinCnt:      2048   2048
[INFO GPL-0029] BinSize: (  1.995  1.690 )
[INFO GPL-0030] NumBins: 4194304
[NesterovSolve] Iter:    1 overflow: 0.999 HPWL: 193421321
[NesterovSolve] Iter:   10 overflow: 0.999 HPWL: 212522809
[NesterovSolve] Iter:   20 overflow: 0.999 HPWL: 218694372
[NesterovSolve] Iter:   30 overflow: 0.999 HPWL: 220191874
[NesterovSolve] Iter:   40 overflow: 0.999 HPWL: 219474369
[NesterovSolve] Iter:   50 overflow: 0.999 HPWL: 218568107
[NesterovSolve] Iter:   60 overflow: 0.999 HPWL: 218293189
[NesterovSolve] Iter:   70 overflow: 0.999 HPWL: 218690232
[NesterovSolve] Iter:   80 overflow: 0.999 HPWL: 218970727
[NesterovSolve] Iter:   90 overflow: 0.999 HPWL: 218808830
[NesterovSolve] Iter:  100 overflow: 0.999 HPWL: 218584859
[NesterovSolve] Iter:  110 overflow: 0.999 HPWL: 218619234
[NesterovSolve] Iter:  120 overflow: 0.999 HPWL: 218768844
[NesterovSolve] Iter:  130 overflow: 0.999 HPWL: 218803470
[NesterovSolve] Iter:  140 overflow: 0.999 HPWL: 218748247
[NesterovSolve] Iter:  150 overflow: 0.999 HPWL: 218781363
[NesterovSolve] Iter:  160 overflow: 0.999 HPWL: 218999061
[NesterovSolve] Iter:  170 overflow: 0.999 HPWL: 219403726
[NesterovSolve] Iter:  180 overflow: 0.999 HPWL: 220147808
[NesterovSolve] Iter:  190 overflow: 0.999 HPWL: 221882133
[NesterovSolve] Iter:  200 overflow: 0.999 HPWL: 225300135
[NesterovSolve] Iter:  210 overflow: 0.999 HPWL: 233334442
[NesterovSolve] Iter:  220 overflow: 0.999 HPWL: 257067856
[NesterovSolve] Iter:  230 overflow: 0.998 HPWL: 300504954
[NesterovSolve] Iter:  240 overflow: 0.997 HPWL: 358363564
[NesterovSolve] Iter:  250 overflow: 0.996 HPWL: 402778363
[NesterovSolve] Iter:  260 overflow: 0.996 HPWL: 373853388
[NesterovSolve] Iter:  270 overflow: 0.998 HPWL: 274183768
[NesterovSolve] Iter:  280 overflow: 0.996 HPWL: 372542300
[NesterovSolve] Iter:  290 overflow: 0.996 HPWL: 367397812
[NesterovSolve] Iter:  300 overflow: 0.995 HPWL: 362532811
[NesterovSolve] Iter:  310 overflow: 0.996 HPWL: 302621572
[NesterovSolve] Iter:  320 overflow: 0.995 HPWL: 297890094
[NesterovSolve] Iter:  330 overflow: 0.994 HPWL: 273507550
[NesterovSolve] Iter:  340 overflow: 0.976 HPWL: 266032946
[NesterovSolve] Iter:  350 overflow: 0.838 HPWL: 273777870
[NesterovSolve] Iter:  360 overflow: 0.731 HPWL: 284097411
[NesterovSolve] Iter:  370 overflow: 0.631 HPWL: 298892357
[NesterovSolve] Snapshot saved at iter = 371
[NesterovSolve] Iter:  380 overflow: 0.461 HPWL: 299098202
[NesterovSolve] Iter:  390 overflow: 0.389 HPWL: 291335464
[NesterovSolve] Iter:  400 overflow: 0.286 HPWL: 281111013
[INFO GPL-0100] worst slack -6.52e-10
[INFO GPL-0103] Weighted 422 nets.
[NesterovSolve] Iter:  410 overflow: 0.226 HPWL: 276579989
[INFO GPL-0075] Routability numCall: 1 inflationIterCnt: 1 bloatIterCnt: 0
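
The numbers in the GP log can be cross-checked with a few lines of arithmetic. The relations below are inferred from the logged values themselves (they reproduce the log to three decimals), not taken from the OpenROAD source, so treat them as a sanity check rather than a description of the placer's implementation.

```python
# Values copied from the GP log above (um and um^2).
core_llx, core_lly, core_urx, core_ury = 2.052, 2.160, 4087.368, 3462.750
core_area = 14137603.696
non_place_area = 210016.050
place_area = 808.461
num_place_insts = 3773
bins_x = bins_y = 2048
target_density = 0.09008940156576627  # from the global_placement command line

# Placeable area as a share of the free core area (GPL-0019).
util_pct = 100.0 * place_area / (core_area - non_place_area)
# Average placeable instance area (GPL-0024).
avg_inst_area = place_area / num_place_insts
# Ideal bin area at the target density (GPL-0025, second pass).
ideal_bin_area = avg_inst_area / target_density
# Bin size from the core box and the 2048 x 2048 grid (GPL-0029).
bin_w = (core_urx - core_llx) / bins_x
bin_h = (core_ury - core_lly) / bins_y

print(f"Util:           {util_pct:.3f} %")
print(f"AvrgInstArea:   {avg_inst_area:.3f} um^2")
print(f"IdealBinArea:   {ideal_bin_area:.3f} um^2")
print(f"BinSize:        ({bin_w:.3f} {bin_h:.3f})")
print(f"NumBins:        {bins_x * bins_y}")
```

The check makes the scale of the problem visible: roughly 808 um^2 of placeable cells are spread over a core of about 14.1 million um^2, and the density grid has more than four million bins.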

Screenshots

Here is the log for Ray

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/ray/air/execution/_internal/event_manager.py", line 110, in resolve_future
    result = ray.get(future)
  File "/usr/local/lib/python3.10/dist-packages/ray/_private/auto_init_hook.py", line 22, in auto_init_wrapper
    return fn(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/ray/_private/client_mode_hook.py", line 103, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/ray/_private/worker.py", line 2626, in get
    raise value
ray.exceptions.OutOfMemoryError: Task was killed due to the node running low on memory.
Memory on the node (IP: 172.17.0.2, ID: 08613a1644d280fea0efdb84f99710ee5d2bad64a1e6695d76856684) where the task (actor ID: 9dffce11f3bad85eb563d0f701000000, name=AutoTunerBase.__init__, pid=1104623, memory used=0.10GB) was running was 61.61GB / 62.78GB (0.981373), which exceeds the memory usage threshold of 0.95. Ray killed this worker (ID: 925b600069bbd123f1a9af55f3a3fa75842dca461ab17730c77e0f0f) because it was the most recently scheduled task; to see more information about memory usage on this node, use `ray logs raylet.out -ip 172.17.0.2`. To see the logs of the worker, use `ray logs worker-925b600069bbd123f1a9af55f3a3fa75842dca461ab17730c77e0f0f*out -ip 172.17.0.2. Top 10 memory users:
PID     MEM(GB) COMMAND
1124766 58.88
1103582 0.49    python3 distributed.py --design aes-block --platform asap7 --config ../../../../flow/designs/asap7/a...
1104623 0.10    ray::AutoTunerBase.train
1103614 0.09    /usr/local/lib/python3.10/dist-packages/ray/core/src/ray/gcs/gcs_server --log_dir=/tmp/ray/session_2...
1103685 0.07    /usr/bin/python3 /usr/local/lib/python3.10/dist-packages/ray/dashboard/dashboard.py --host=127.0.0.1...
1103814 0.06    /usr/bin/python3 -u /usr/local/lib/python3.10/dist-packages/ray/dashboard/agent.py --node-ip-address...
1103844 0.04    ray::IDLE
1103845 0.04    ray::IDLE
1103834 0.04    ray::IDLE
1103839 0.04    ray::IDLE
Refer to the documentation on how to address the out of memory issue: https://docs.ray.io/en/latest/ray-core/scheduling/ray-oom-prevention.html. Consider provisioning more memory on this node or reducing task parallelism by requesting more CPUs per task. Set max_restarts and max_task_retries to enable retry when the task crashes due to OOM. To adjust the kill threshold, set the environment variable `RAY_memory_usage_threshold` when starting Ray. To disable worker killing, set the environment variable `RAY_memory_monitor_refresh_ms` to zero.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/workspace/tools/AutoTuner/src/autotuner/distributed.py", line 953, in <module>
    analysis = tune.run(TrainClass, **tune_args)
  File "/usr/local/lib/python3.10/dist-packages/ray/tune/tune.py", line 1002, in run
    runner.step()
  File "/usr/local/lib/python3.10/dist-packages/ray/tune/execution/tune_controller.py", line 728, in step
    if not self._actor_manager.next(timeout=0.1):
  File "/usr/local/lib/python3.10/dist-packages/ray/air/execution/_internal/actor_manager.py", line 224, in next
    self._actor_task_events.resolve_future(future)
  File "/usr/local/lib/python3.10/dist-packages/ray/air/execution/_internal/event_manager.py", line 113, in resolve_future
    on_error(e)
  File "/usr/local/lib/python3.10/dist-packages/ray/air/execution/_internal/actor_manager.py", line 770, in on_error
    self._actor_task_failed(
  File "/usr/local/lib/python3.10/dist-packages/ray/air/execution/_internal/actor_manager.py", line 291, in _actor_task_failed
    raise RuntimeError(
RuntimeError: Caught unexpected exception: Task was killed due to the node running low on memory.
Memory on the node (IP: 172.17.0.2, ID: 08613a1644d280fea0efdb84f99710ee5d2bad64a1e6695d76856684) where the task (actor ID: 9dffce11f3bad85eb563d0f701000000, name=AutoTunerBase.__init__, pid=1104623, memory used=0.10GB) was running was 61.61GB / 62.78GB (0.981373), which exceeds the memory usage threshold of 0.95. Ray killed this worker (ID: 925b600069bbd123f1a9af55f3a3fa75842dca461ab17730c77e0f0f) because it was the most recently scheduled task; to see more information about memory usage on this node, use `ray logs raylet.out -ip 172.17.0.2`. To see the logs of the worker, use `ray logs worker-925b600069bbd123f1a9af55f3a3fa75842dca461ab17730c77e0f0f*out -ip 172.17.0.2. Top 10 memory users:
PID     MEM(GB) COMMAND
1124766 58.88
1103582 0.49    python3 distributed.py --design aes-block --platform asap7 --config ../../../../flow/designs/asap7/a...
1104623 0.10    ray::AutoTunerBase.train
1103614 0.09    /usr/local/lib/python3.10/dist-packages/ray/core/src/ray/gcs/gcs_server --log_dir=/tmp/ray/session_2...
1103685 0.07    /usr/bin/python3 /usr/local/lib/python3.10/dist-packages/ray/dashboard/dashboard.py --host=127.0.0.1...
1103814 0.06    /usr/bin/python3 -u /usr/local/lib/python3.10/dist-packages/ray/dashboard/agent.py --node-ip-address...
1103844 0.04    ray::IDLE
1103845 0.04    ray::IDLE
1103834 0.04    ray::IDLE
1103839 0.04    ray::IDLE
Refer to the documentation on how to address the out of memory issue: https://docs.ray.io/en/latest/ray-core/scheduling/ray-oom-prevention.html. Consider provisioning more memory on this node or reducing task parallelism by requesting more CPUs per task. Set max_restarts and max_task_retries to enable retry when the task crashes due to OOM. To adjust the kill threshold, set the environment variable `RAY_memory_usage_threshold` when starting Ray. To disable worker killing, set the environment variable `RAY_memory_monitor_refresh_ms` to zero.
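
The "Top 10 memory users" table in the Ray message can be parsed to see which process actually holds the memory. This is a minimal sketch: `parse_memory_table` is a hypothetical helper, and the rows are copied from the log above (including the command that Ray itself truncated).

```python
# A few rows copied verbatim from the Ray message above.
report = """\
PID     MEM(GB) COMMAND
1124766 58.88
1103582 0.49    python3 distributed.py --design aes-block --platform asap7 --config ../../../../flow/designs/asap7/a...
1104623 0.10    ray::AutoTunerBase.train
1103614 0.09    /usr/local/lib/python3.10/dist-packages/ray/core/src/ray/gcs/gcs_server --log_dir=/tmp/ray/session_2...
"""


def parse_memory_table(text):
    """Hypothetical helper: return (pid, mem_gb, command) tuples."""
    rows = []
    for line in text.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 2)
        if len(fields) < 2:
            continue
        # The command column can be empty, as it is for the largest process.
        command = fields[2] if len(fields) == 3 else ""
        rows.append((int(fields[0]), float(fields[1]), command))
    return rows


rows = parse_memory_table(report)
top_pid, top_mem, top_cmd = max(rows, key=lambda row: row[1])

# Node totals and kill threshold as reported in the same message.
used_gb, total_gb, threshold = 61.61, 62.78, 0.95
print(f"largest process: pid={top_pid} mem={top_mem} GB command={top_cmd!r}")
print(f"node usage: {used_gb / total_gb:.3f} (threshold {threshold})")
```

The worker that Ray killed (pid 1104623) was using only 0.10 GB. Almost all of the memory belongs to pid 1124766, whose command column is empty in the log.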

Additional Context

No response

gudeh commented 2 weeks ago

Hi @luarss, could you run your issue again with the newer OR version? I believe your issue might have been solved.

luarss commented 2 weeks ago

@gudeh I can verify that it no longer crashes at the GPL stage, but it now crashes at the GRT stage instead. I will open a separate issue for this.

GRT crash logs:

OpenROAD 95fab1bf0354d1b0659023b217a20af990c93c20 
Features included (+) or not (-): +Charts +GPU +GUI +Python
This program is licensed under the BSD-3 license. See the LICENSE file for details.
Components of this program may be licensed under more restrictive licenses which must be honored.
[INFO ORD-0030] Using 16 thread(s).
global_route -guide_file ./results/asap7/aes-block/base/route.guide -congestion_report_file ./reports/asap7/aes-block/base/congestion.rpt -congestion_iterations 30 -congestion_report_iter_step 5 -verbose
[INFO GRT-0020] Min routing layer: M1
[INFO GRT-0021] Max routing layer: Pad
[INFO GRT-0022] Global adjustment: 0%
[INFO GRT-0023] Grid origin: (0, 0)
[INFO GRT-0043] No OR_DEFAULT vias defined.
[INFO GRT-0088] Layer M1      Track-Pitch = 0.0360  line-2-Via Pitch: 0.0360
[INFO GRT-0088] Layer M2      Track-Pitch = 0.0390  line-2-Via Pitch: 0.0360
[INFO GRT-0088] Layer M3      Track-Pitch = 0.0360  line-2-Via Pitch: 0.0360
[INFO GRT-0088] Layer M4      Track-Pitch = 0.0480  line-2-Via Pitch: 0.0480
[INFO GRT-0088] Layer M5      Track-Pitch = 0.0480  line-2-Via Pitch: 0.0480
[INFO GRT-0088] Layer M6      Track-Pitch = 0.0640  line-2-Via Pitch: 0.0640
[INFO GRT-0088] Layer M7      Track-Pitch = 0.0640  line-2-Via Pitch: 0.0640
[INFO GRT-0088] Layer M8      Track-Pitch = 0.0800  line-2-Via Pitch: 0.0800
[INFO GRT-0088] Layer M9      Track-Pitch = 0.0800  line-2-Via Pitch: 0.1100
[INFO GRT-0088] Layer Pad     Track-Pitch = 0.0800  line-2-Via Pitch: 2.0700
[INFO GRT-0019] Found 43 clock nets.
[INFO GRT-0001] Minimum degree: 2
[INFO GRT-0002] Maximum degree: 184
[INFO GRT-0003] Macros: 21
[INFO GRT-0043] No OR_DEFAULT vias defined.
[INFO GRT-0004] Blockages: 55216

[INFO GRT-0053] Routing resources analysis:
          Routing      Original      Derated      Resource
Layer     Direction    Resources     Resources    Reduction (%)
---------------------------------------------------------------
M1         Vertical     728801764       2111064          99.71%
M2         Horizontal   631633408      433237805          31.41%
M3         Vertical     728801764      637370533          12.55%
M4         Horizontal   534459620      468943963          12.26%
M5         Vertical     534452916      408729026          23.52%
M6         Horizontal   388695152      270434680          30.42%
M7         Vertical     388697696      338132004          13.01%
M8         Horizontal   291528936      242900393          16.68%
M9         Vertical     194380928      145774460          25.01%
Pad        Horizontal      75720         75710          0.01%
---------------------------------------------------------------

Command terminated by signal 9
Elapsed time: 7:45.76[h:]min:sec. CPU time: user 429.80 sys 35.83 (99%). Peak memory: 63591636KB.
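
The last two log lines can be decoded with a short script. This sketch assumes a POSIX system (for the signal lookup) and assumes that "KB" in the peak-memory figure means units of 1024 bytes, which the log itself does not state.

```python
import re
import signal

# The final lines of the GRT log above.
status = "Command terminated by signal 9"
timing = ("Elapsed time: 7:45.76[h:]min:sec. CPU time: user 429.80 sys 35.83 (99%). "
          "Peak memory: 63591636KB.")

# Signal 9 is SIGKILL on POSIX systems.
signal_number = int(status.rsplit(" ", 1)[1])
signal_name = signal.Signals(signal_number).name

# Peak memory, converted under the 1024-byte assumption.
peak_kb = int(re.search(r"Peak memory: (\d+)KB", timing).group(1))
peak_gib = peak_kb / 1024**2

# Elapsed time is [h:]min:sec; the hours field is optional.
hours, minutes, seconds = re.search(
    r"Elapsed time: (?:(\d+):)?(\d+):(\d+\.\d+)", timing
).groups()
elapsed_s = int(hours or 0) * 3600 + int(minutes) * 60 + float(seconds)

print(f"signal: {signal_name}")
print(f"peak memory: {peak_gib:.2f} GiB")
print(f"elapsed: {elapsed_s:.2f} s")
```

A SIGKILL together with a peak of about 60 GiB, close to the 62.78 GB node total that Ray reported earlier, is consistent with the process being killed for memory exhaustion, although the log does not say who sent the signal.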