The-OpenROAD-Project / OpenROAD

OpenROAD's unified application implementing an RTL-to-GDS Flow. Documentation at https://openroad.readthedocs.io/en/latest/
https://theopenroadproject.org/
BSD 3-Clause "New" or "Revised" License

Clock tree appears to be unbalanced for BoomTile running at 833 MHz #5970

Open jeffng-or opened 3 days ago

jeffng-or commented 3 days ago

Describe the bug

For megaboom v6, I upped the clock frequency to 833 MHz and the resulting clock tree appears to be unbalanced (see the branch of the tree on the far right in the viewer) and with significant skew:

BoomTile_ClockTree

>>> report_clock_skew
Clock clock
1742.25 source latency frontend/bpd/banked_predictors_0/loop/s1_update_bits_meta[32]$_DFF_P_/CLK ^
-1407.99 target latency frontend/bpd/banked_predictors_1/btb/meta_1_ext/R0_clk ^
 -17.80 CRPR
--------------
 316.47 setup skew
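
For readers checking the numbers: the setup skew line is the signed sum of the three rows above it. A minimal sketch in Python using only the values from this report; the variable names are illustrative, and the unit is assumed to be picoseconds (the report itself does not print one).

```python
# Values copied from the report_clock_skew output above.
# Assumption: all values are in picoseconds; the report does not state units.
source_latency = 1742.25   # latency to the source pin (the s1_update_bits_meta flop)
target_latency = -1407.99  # latency to the target pin (meta_1_ext/R0_clk), printed with a minus sign
crpr = -17.80              # CRPR row, also printed with a minus sign

# The report adds the three signed rows to get the setup skew.
setup_skew = source_latency + target_latency + crpr
print(f"setup skew = {setup_skew:.2f}")
```

The sum matches the reported 316.47 to within 0.01, which is consistent with each row having been rounded to two decimals before printing.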

>>> report_clock_latency
Clock clock
rise -> rise
    min     max
   0.00    0.00 source latency
1440.04         network latency dcache/data/array_3_1_ext/R0_clk
        1838.54 network latency dcache/data/io_resp_1_0_REG[75]$_DFF_P_/CLK
---------------
1440.04 1838.54 latency
         398.50 skew

fall -> fall
    min     max
   0.00    0.00 source latency
1527.54         network latency dcache/data/array_3_1_ext/R0_clk
        1954.27 network latency dcache/data/io_resp_1_0_REG[75]$_DFF_P_/CLK
---------------
1527.54 1954.27 latency
         426.73 skew
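
To put the latency numbers in context, the sketch below recomputes the skew from the min and max network latencies and compares it with the clock period at 833 MHz. It assumes the report values are in picoseconds, which the output does not state; the names `PERIOD_PS`, `LATENCIES`, and `latency_skew` are made up for this example.

```python
FREQ_MHZ = 833.0
PERIOD_PS = 1e6 / FREQ_MHZ  # 1 / 833 MHz, expressed in picoseconds

# (min, max) network latency copied from the report_clock_latency output above.
# Assumption: values are in picoseconds.
LATENCIES = {
    "rise": (1440.04, 1838.54),
    "fall": (1527.54, 1954.27),
}

def latency_skew(min_latency, max_latency):
    """Skew as shown in the report: max network latency minus min network latency."""
    return max_latency - min_latency

for edge, (lo, hi) in LATENCIES.items():
    skew = latency_skew(lo, hi)
    print(f"{edge}: skew {skew:.2f}, {100 * skew / PERIOD_PS:.1f}% of the period")
```

Under the picosecond assumption the period is about 1200 ps, so both the rise and fall skew come to roughly a third of the clock period.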

Tom mentioned that sometimes the clock tree will be unbalanced when connecting to macros, so I'd like to get some feedback on whether that's the case here. Note that prior versions of BoomTile with a slower clock generated a balanced clock tree with less skew.

Expected Behavior

Balanced clock tree with minimal skew

Environment

[WARNING] Your current OpenROAD version is outdated.
It is recommened to pull the latest changes.
If problem persists, file a github issue with the re-producible test case.
kernel: Linux 6.5.0-1025-gcp
os: Ubuntu 22.04.4 LTS (Jammy Jellyfish)
cmake version 3.24.2
-- The CXX compiler identification is GNU 11.4.0
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- OpenROAD version: v2.0-16316-gf9cfd9383
-- System name: Linux
-- Compiler: GNU 11.4.0
-- Build type: RELEASE
-- Install prefix: /usr/local
-- C++ Standard: 17
-- C++ Standard Required: ON
-- C++ Extensions: OFF
-- The C compiler identification is GNU 11.4.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Performing Test C_COMPILER_SUPPORTS__-Wall
-- Performing Test C_COMPILER_SUPPORTS__-Wall - Success
-- Performing Test CXX_COMPILER_SUPPORTS__-Wall
-- Performing Test CXX_COMPILER_SUPPORTS__-Wall - Success
-- Performing Test C_COMPILER_SUPPORTS__-Wno-array-bounds
-- Performing Test C_COMPILER_SUPPORTS__-Wno-array-bounds - Success
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-array-bounds
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-array-bounds - Success
-- Performing Test C_COMPILER_SUPPORTS__-Wno-nonnull
-- Performing Test C_COMPILER_SUPPORTS__-Wno-nonnull - Success
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-nonnull
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-nonnull - Success
-- Performing Test C_COMPILER_SUPPORTS__-Wno-maybe-uninitialized
-- Performing Test C_COMPILER_SUPPORTS__-Wno-maybe-uninitialized - Success
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-maybe-uninitialized
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-maybe-uninitialized - Success
-- Performing Test C_COMPILER_SUPPORTS__-Wno-format-overflow
-- Performing Test C_COMPILER_SUPPORTS__-Wno-format-overflow - Success
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-format-overflow
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-format-overflow - Success
-- Performing Test C_COMPILER_SUPPORTS__-Wno-unused-variable
-- Performing Test C_COMPILER_SUPPORTS__-Wno-unused-variable - Success
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-unused-variable
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-unused-variable - Success
-- Performing Test C_COMPILER_SUPPORTS__-Wno-unused-function
-- Performing Test C_COMPILER_SUPPORTS__-Wno-unused-function - Success
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-unused-function
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-unused-function - Success
-- Performing Test C_COMPILER_SUPPORTS__-Wno-write-strings
-- Performing Test C_COMPILER_SUPPORTS__-Wno-write-strings - Success
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-write-strings
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-write-strings - Success
-- Performing Test C_COMPILER_SUPPORTS__-Wno-sign-compare
-- Performing Test C_COMPILER_SUPPORTS__-Wno-sign-compare - Success
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-sign-compare
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-sign-compare - Success
-- Performing Test C_COMPILER_SUPPORTS__-Wno-deprecated
-- Performing Test C_COMPILER_SUPPORTS__-Wno-deprecated - Success
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-deprecated
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-deprecated - Success
-- Performing Test C_COMPILER_SUPPORTS__-Wno-c++11-narrowing
-- Performing Test C_COMPILER_SUPPORTS__-Wno-c++11-narrowing - Failed
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-c++11-narrowing
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-c++11-narrowing - Failed
-- Performing Test C_COMPILER_SUPPORTS__-Wno-register
-- Performing Test C_COMPILER_SUPPORTS__-Wno-register - Failed
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-register
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-register - Success
-- Performing Test C_COMPILER_SUPPORTS__-Wno-format
-- Performing Test C_COMPILER_SUPPORTS__-Wno-format - Success
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-format
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-format - Success
-- Performing Test C_COMPILER_SUPPORTS__-Wno-reserved-user-defined-literal
-- Performing Test C_COMPILER_SUPPORTS__-Wno-reserved-user-defined-literal - Failed
CMake Warning (dev) at src/sta/CMakeLists.txt:32 (option):
  Policy CMP0077 is not set: option() honors normal variables.  Run "cmake
  --help-policy CMP0077" for policy details.  Use the cmake_policy command to
  set the policy and suppress this warning.

  For compatibility with older versions of CMake, option is clearing the
  normal variable 'USE_TCL_READLINE'.
This warning is for project developers.  Use -Wno-dev to suppress it.

-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-reserved-user-defined-literal
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-reserved-user-defined-literal - Failed
-- Performing Test C_COMPILER_SUPPORTS__-fpermissive
-- Performing Test C_COMPILER_SUPPORTS__-fpermissive - Failed
-- Performing Test CXX_COMPILER_SUPPORTS__-fpermissive
-- Performing Test CXX_COMPILER_SUPPORTS__-fpermissive - Success
-- Performing Test C_COMPILER_SUPPORTS__-x
-- Performing Test C_COMPILER_SUPPORTS__-x - Failed
-- Performing Test CXX_COMPILER_SUPPORTS__-x
-- Performing Test CXX_COMPILER_SUPPORTS__-x - Failed
-- Performing Test C_COMPILER_SUPPORTS__c++
-- Performing Test C_COMPILER_SUPPORTS__c++ - Failed
-- Performing Test CXX_COMPILER_SUPPORTS__c++
-- Performing Test CXX_COMPILER_SUPPORTS__c++ - Failed
-- Performing Test C_COMPILER_SUPPORTS__-std=c++17
-- Performing Test C_COMPILER_SUPPORTS__-std=c++17 - Failed
-- Performing Test CXX_COMPILER_SUPPORTS__-std=c++17
-- Performing Test CXX_COMPILER_SUPPORTS__-std=c++17 - Success
-- Performing Test C_COMPILER_SUPPORTS__-Wno-unused-but-set-variable
-- Performing Test C_COMPILER_SUPPORTS__-Wno-unused-but-set-variable - Success
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-unused-but-set-variable
-- Performing Test CXX_COMPILER_SUPPORTS__-Wno-unused-but-set-variable - Success
-- TCL library: /usr/lib/x86_64-linux-gnu/libtcl.so
-- TCL header: /usr/include/tcl/tcl.h
-- TCL readline library: /usr/lib/x86_64-linux-gnu/libtclreadline.so
-- TCL readline header: /usr/include/x86_64-linux-gnu
-- Found SWIG: /home/jeffng/dev/main/OpenROAD-flow-scripts/dependencies/bin/swig (found suitable version "4.1.0", minimum required is "4.0")  
-- Using SWIG >= 4.1.0 -flatstaticmethod flag for python
-- Found Boost: /home/jeffng/dev/main/OpenROAD-flow-scripts/dependencies/lib/cmake/Boost-1.80.0/BoostConfig.cmake (found version "1.80.0")  
-- boost: 1.80.0
-- Found GTest: /home/jeffng/dev/main/OpenROAD-flow-scripts/dependencies/lib/cmake/GTest/GTestConfig.cmake (found version "1.13.0")  
-- GTest: 1.13.0
-- Found Python3: /usr/include/python3.10 (found version "3.10.12") found components: Development Development.Module Development.Embed 
-- Found ZLIB: /usr/lib/x86_64-linux-gnu/libz.so (found version "1.2.11") 
-- Found Threads: TRUE  
-- spdlog: 1.8.1
-- Found BISON: /usr/bin/bison (found version "3.8.2") 
-- Could NOT find Doxygen (missing: DOXYGEN_EXECUTABLE) 
-- STA version: 2.6.0
-- STA git sha: b5f3a02b33b8ae1739ace8a329fde94434711dd6
-- System name: Linux
-- Compiler: GNU 11.4.0
-- Build type: RELEASE
-- Build CXX_FLAGS: -O3 -DNDEBUG
-- Install prefix: /usr/local
-- Found FLEX: /usr/bin/flex (found version "2.6.4") 
-- TCL library: /usr/lib/x86_64-linux-gnu/libtcl.so
-- TCL header: /usr/include/tcl/tcl.h
-- TCL readline library: /usr/lib/x86_64-linux-gnu/libtclreadline.so
-- TCL readline header: /usr/include/x86_64-linux-gnu/tclreadline.h
-- CUDD library: /usr/local/lib/libcudd.a
-- CUDD header: /usr/local/include/cudd.h
-- SSTA: 0
-- Found SWIG: /home/jeffng/dev/main/OpenROAD-flow-scripts/dependencies/bin/swig (found suitable version "4.1.0", minimum required is "3.0")  
-- STA executable: /home/jeffng/dev/main/OpenROAD-flow-scripts/tools/OpenROAD/src/sta/app/sta
-- Found re2: /opt/or-tools/lib/cmake/re2/re2Config.cmake (found version "11.0.0") 
-- Found Clp: /opt/or-tools/lib/cmake/Clp/ClpConfig.cmake (found version "1.17.7") 
-- Found Cbc: /opt/or-tools/lib/cmake/Cbc/CbcConfig.cmake (found version "2.10.7") 
-- Found SCIP: /opt/or-tools/lib/cmake/scip/scip-config.cmake (found version "9.0.0") 
-- Found OpenMP_CXX: -fopenmp (found version "4.5") 
-- Found OpenMP: TRUE (found version "4.5")  
-- Found OR-Tools: /opt/or-tools/lib/cmake/ortools (version: 9.10.4067)
-- TCL library: /usr/lib/x86_64-linux-gnu/libtcl.so
-- TCL header: /usr/include/tcl/tcl.h
-- Found OpenMP_C: -fopenmp (found version "4.5") 
-- Found OpenMP: TRUE (found version "4.5")  
-- Found OpenMP: TRUE (found version "4.5")  
-- GUI is enabled
-- Charts widget is enabled
Number of processor cores: 32
-- Found Boost: /home/jeffng/dev/main/OpenROAD-flow-scripts/dependencies/lib/cmake/Boost-1.80.0/BoostConfig.cmake (found version "1.80.0") found components: serialization 
-- Could NOT find VTune (missing: VTune_LIBRARIES VTune_INCLUDE_DIRS) 
-- Found Boost: /home/jeffng/dev/main/OpenROAD-flow-scripts/dependencies/lib/cmake/Boost-1.80.0/BoostConfig.cmake (found suitable version "1.80.0", minimum required is "1.78")  
-- TCL library: /usr/lib/x86_64-linux-gnu/libtcl.so
-- TCL header: /usr/include/tcl/tcl.h
-- Found Boost: /home/jeffng/dev/main/OpenROAD-flow-scripts/dependencies/lib/cmake/Boost-1.80.0/BoostConfig.cmake (found version "1.80.0") found components: serialization system thread 
-- Found Boost: /home/jeffng/dev/main/OpenROAD-flow-scripts/dependencies/lib/cmake/Boost-1.80.0/BoostConfig.cmake (found version "1.80.0")  
-- Found Eigen3: /home/jeffng/dev/main/OpenROAD-flow-scripts/dependencies/share/eigen3/cmake/Eigen3Config.cmake (found version "3.4.0") 
-- TCL readline enabled
-- Tcl Extended disabled
-- Python3 enabled
-- Configuring done
-- Generating done
-- Build files have been written to: /tmp/tmp.Bt0W42tFUE

To Reproduce

  1. unpack the tarball https://drive.google.com/file/d/1eVCheqXsRstEgBQLqIi-H5i4qwO9Z6Dh/view?usp=sharing
  2. cd cts_BoomTile_asap7_base_2024-10-16_10-40
  3. ./run-me-BoomTile-asap7-base.sh gui <- I added the second argument to just pull up the GUI instead of running CTS
  4. type gui::show if the GUI doesn't come up (might be fixed)
  5. view clock tree in Clock Tree Viewer

I kept 3_place.odb in the tarball in case it's helpful.

Relevant log output

No response

Screenshots

No response

Additional Context

No response

maliberty commented 3 days ago

@jeffng-or if you zoom in on the leaves on the skewed side, you can select them to see which instances they are.

arthurjolo commented 3 days ago

@jeffng-or Do you know if there are any clock gaters on this design? Clock gaters have some issues currently.

jeffng-or commented 3 days ago

> @jeffng-or Do you know if there are any clock gaters on this design? Clock gaters have some issues currently.

AFAIK, no clock gaters. The only cells used in the clock tree look to be bufs or invs:

(BUFx10_ASAP7_75t_R) (BUFx12_ASAP7_75t_R) (BUFx12f_ASAP7_75t_R) (BUFx16f_ASAP7_75t_R) (BUFx24_ASAP7_75t_R) (BUFx4f_ASAP7_75t_R) (BUFx6f_ASAP7_75t_R) (CKINVDCx10_ASAP7_75t_R) (CKINVDCx11_ASAP7_75t_R) (CKINVDCx12_ASAP7_75t_R) (CKINVDCx14_ASAP7_75t_R) (CKINVDCx16_ASAP7_75t_R) (CKINVDCx20_ASAP7_75t_R) (CKINVDCx5p33_ASAP7_75t_R) (CKINVDCx6p67_ASAP7_75t_R) (CKINVDCx8_ASAP7_75t_R) (CKINVDCx9p33_ASAP7_75t_R) (INVx13_ASAP7_75t_R) (INVx3_ASAP7_75t_R) (INVx5_ASAP7_75t_R) (INVx6_ASAP7_75t_R) (INVx8_ASAP7_75t_R) (INVxp33_ASAP7_75t_R) (INVxp67_ASAP7_75t_R)

jeffng-or commented 3 days ago

> @jeffng-or if you zoom in on the leaves on the skewed side, you can select them to see which instances they are.

The far right branch connects to one macro: dcache/data/array_1_0_ext/R0_clk.

The other long branch nearer to the middle of the clock tree is clknet_leaf_402_clock_regs, which connects to:

maliberty commented 3 days ago

It branches off the tree very high up. If you look at where it branches do you see a gate?

jeffng-or commented 3 days ago

> It branches off the tree very high up. If you look at where it branches do you see a gate?

I don't think so. I checked the connections at the colored dots in the clock tree:

image

In case you want to look at the clock tree in more detail, I have the output from a clock tree connectivity dumper that I wrote (no guarantee that it's totally correct). The output is a text file which can be found in: https://drive.google.com/file/d/1bL4rkNu9erDrZYoNKofA7MbXbJbnxByT/view?usp=sharing

maliberty commented 3 days ago

@arthurjolo is this due to splitting macros from std cells?

arthurjolo commented 2 days ago

I am going to open the design and take a close look at it, but I don't think this is due to the splitting of macros from std cells. On the macro branch, Jeff mentioned that there are delay buffers, so the average arrival time on the macro branch was smaller than on the std cells, and the other macros seem to have a better arrival time. I believe that CTS did a poor job connecting to dcache/data/array_1_0_ext/R0_clk, and I am going to try to understand why.

arthurjolo commented 2 days ago

One thing that I noticed from Jeff's dump is that on the array_1_0_ext/R0_clk path there are 7 wire buffers between the leaf clk buffer and the CK pin, while on other leaves there are 3 or fewer buffers.
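
A comparison like this can be automated once the per-sink buffer chains are available. The sketch below is only an illustration: the `paths` dictionary, the sink names other than array_1_0_ext/R0_clk, the buffer labels, and the 2x-median threshold are all hypothetical and do not reflect the format of Jeff's dump. Only the counts of 7 and "3 or fewer" come from the comment above.

```python
from statistics import median

# Hypothetical input: for each sink pin, the wire buffers between the leaf
# clock buffer and the pin. Only the counts (7 vs. 3 or fewer) come from the
# thread; the other sink names and the buffer labels are invented.
paths = {
    "array_1_0_ext/R0_clk": [f"wire_buf_{i}" for i in range(7)],
    "example_sink_a/CLK": [f"wire_buf_{i}" for i in range(3)],
    "example_sink_b/CLK": [f"wire_buf_{i}" for i in range(2)],
    "example_sink_c/CLK": [f"wire_buf_{i}" for i in range(1)],
}

def buffer_counts(paths):
    """Number of wire buffers on each leaf-buffer-to-sink path."""
    return {sink: len(buffers) for sink, buffers in paths.items()}

def outliers(counts, factor=2.0):
    """Sinks whose buffer count exceeds `factor` times the median count."""
    typical = median(counts.values())
    return sorted(sink for sink, n in counts.items() if n > factor * typical)

counts = buffer_counts(paths)
print(outliers(counts))
```

With these made-up inputs the median count is 2.5, so only the 7-buffer path is flagged. The threshold is an arbitrary choice for the example, not something CTS uses.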