The-OpenROAD-Project / OpenROAD

OpenROAD's unified application implementing an RTL-to-GDS Flow. Documentation at https://openroad.readthedocs.io/en/latest/
https://theopenroadproject.org/
BSD 3-Clause "New" or "Revised" License
1.35k stars 481 forks source link

report_clock_skew does not find the worst skew of any constrained path #5002

Open oharboe opened 3 weeks ago

oharboe commented 3 weeks ago

Description

Run:

  1. untar skew-consistency.tar.gz. based on make DESIGN_CONFIG=designs/asap7/mock-array/config.mk cts
  2. run ./run-me-mock-array-asap7-base.sh

Look at the timing path that report_clock_skew reports, I would have expected the "Skew" column to match up with report_clock_skew

image

Suggested Solution

Skew column should match up with report_clock_skew

Additional Context

No response

maliberty commented 2 weeks ago

@tspyrou I don't know what's going on here. Please look at it or discuss it with Cherry.

Clock clock
 313.20 source latency ces_4_2/clock ^
  96.54 source clock tree delay
-283.39 target latency ces_3_2/clock ^
 -77.04 target clock tree delay
  -7.42 CRPR
--------------
  41.89 setup skew

to

>>> rc -through ces_4_2/clock -through ces_3_2/*
Startpoint: ces_4_2 (rising edge-triggered flip-flop clocked by clock)
Endpoint: ces_3_2 (rising edge-triggered flip-flop clocked by clock)
Path Group: clock
Path Type: max

Fanout     Cap    Slew   Delay    Time   Description
-----------------------------------------------------------------------------
                          0.00    0.00   clock clock (rise edge)
                          0.00    0.00   clock source latency
     1   13.08    0.00    0.00    0.00 ^ clock (in)
                                         clock (net)
                 33.86   10.68   10.68 ^ wire2/A (BUFx16f_ASAP7_75t_R)
     1   18.19    9.80   21.51   32.19 ^ wire2/Y (BUFx16f_ASAP7_75t_R)
                                         net4218 (net)
                 71.11   22.20   54.40 ^ wire1/A (BUFx16f_ASAP7_75t_R)
     2   29.31   11.48   27.19   81.58 ^ wire1/Y (BUFx16f_ASAP7_75t_R)
                                         net4217 (net)
                155.26   48.77  130.35 ^ clkbuf_0_clock/A (BUFx24_ASAP7_75t_R)
     2   23.39   17.48   42.66  173.01 ^ clkbuf_0_clock/Y (BUFx24_ASAP7_75t_R)
                                         clknet_0_clock (net)
                 50.57   15.09  188.10 ^ clkbuf_1_1_0_clock/A (BUFx24_ASAP7_75t_R)
     2   14.67   12.39   28.68  216.78 ^ clkbuf_1_1_0_clock/Y (BUFx24_ASAP7_75t_R)
                                         clknet_1_1_0_clock (net)
                 19.14    4.77  221.55 ^ clkbuf_2_2_0_clock/A (BUFx24_ASAP7_75t_R)
     2   16.80    9.07   20.94  242.49 ^ clkbuf_2_2_0_clock/Y (BUFx24_ASAP7_75t_R)
                                         clknet_2_2_0_clock (net)
                 38.06   11.35  253.84 ^ clkbuf_3_4_0_clock/A (BUFx24_ASAP7_75t_R)
     8   77.07   30.00   28.40  282.24 ^ clkbuf_3_4_0_clock/Y (BUFx24_ASAP7_75t_R)
                                         clknet_3_4_0_clock (net)
                104.12   30.96  313.20 ^ ces_4_2/clock (Element)
     1    1.72    6.22  177.88  491.08 ^ ces_4_2/io_outs_down[61] (Element)
                                         ces_3_2_io_ins_down[61] (net)
                  6.23    0.14  491.22 ^ ces_3_2/io_ins_down[61] (Element)
                                491.22   data arrival time

                        250.00  250.00   clock clock (rise edge)
                          0.00  250.00   clock source latency
     1   12.42    0.00    0.00  250.00 ^ clock (in)
                                         clock (net)
                 31.01    9.78  259.78 ^ wire2/A (BUFx16f_ASAP7_75t_R)
     1   17.53    9.66   20.93  280.71 ^ wire2/Y (BUFx16f_ASAP7_75t_R)
                                         net4218 (net)
                 66.67   20.80  301.51 ^ wire1/A (BUFx16f_ASAP7_75t_R)
     2   28.11   11.27   26.58  328.10 ^ wire1/Y (BUFx16f_ASAP7_75t_R)
                                         net4217 (net)
                146.12   45.89  373.99 ^ clkbuf_0_clock/A (BUFx24_ASAP7_75t_R)
     2   22.19   16.98   41.59  415.59 ^ clkbuf_0_clock/Y (BUFx24_ASAP7_75t_R)
                                         clknet_0_clock (net)
                 24.71    5.90  421.49 ^ clkbuf_1_0_0_clock/A (BUFx24_ASAP7_75t_R)
     2   13.47   11.02   23.00  444.48 ^ clkbuf_1_0_0_clock/Y (BUFx24_ASAP7_75t_R)
                                         clknet_1_0_0_clock (net)
                 16.98    4.22  448.71 ^ clkbuf_2_0_0_clock/A (BUFx24_ASAP7_75t_R)
     2   15.60    8.99   20.28  468.99 ^ clkbuf_2_0_0_clock/Y (BUFx24_ASAP7_75t_R)
                                         clknet_2_0_0_clock (net)
                 34.97   10.39  479.38 ^ clkbuf_3_1_0_clock/A (BUFx24_ASAP7_75t_R)
     8   76.60   31.86   28.59  507.97 ^ clkbuf_3_1_0_clock/Y (BUFx24_ASAP7_75t_R)
                                         clknet_3_1_0_clock (net)
                 88.41   25.42  533.39 ^ ces_3_2/clock (Element)
                        -10.00  523.39   clock uncertainty
                          7.42  530.81   clock reconvergence pessimism
                         61.33  592.14   library setup time
                                592.14   data required time
-----------------------------------------------------------------------------
                                592.14   data required time
                               -491.22   data arrival time
-----------------------------------------------------------------------------
                                100.93   slack (MET)

to what we get from TimingPath::getSkew() shown in the GUI above.

tspyrou commented 2 weeks ago

The worst skew reported by report_clock_skew is 41.89 but the skew in the constrained path which is reported is much higher at 220.19

313.20 ^ ces_4_2/clock (Element) --- source side 533.39 ^ ces_3_2/clock (Element) --- dest side

oharboe commented 2 weeks ago

The worst skew reported by report_clock_skew is 41.89 but the skew in the constrained path which is reported is much higher at 220.19

313.20 ^ ces_4_2/clock (Element) --- source side 533.39 ^ ces_3_2/clock (Element) --- dest side

I can't see the number 220.19 above, where does it come from?

tspyrou commented 2 weeks ago

@oharboe Calc is open software. For license details type: help copyright [Type "exit" to exit, or "help" for help.]

; 533.39 - 313.20 220.19

tspyrou commented 2 weeks ago

To reproduce without the GUI untar skew-consistency.tar.gz. based on make DESIGN_CONFIG=designs/asap7/mock-array/config.mk cts then make DESIGN_CONFIG=designs/asap7/mock-array/config.mk open_cts to get a shell to run report_checks. rc -through ces_4_2/clock -through ces_3_2/*

oharboe commented 2 weeks ago

@oharboe Calc is open software. For license details type: help copyright [Type "exit" to exit, or "help" for help.]

; 533.39 - 313.20 220.19

Dont you have to subtract 250ps, i e. one clock period?

oharboe commented 2 weeks ago

@tspyrou If you look at the data required path above, you have 250ps to start with, because what we are looking at here is the data from one flip flop in one cycle that is captured by a second flip flop, i.e. one clock period later:

                        250.00  250.00   clock clock (rise edge)
                          0.00  250.00   clock source latency
     1   12.42    0.00    0.00  250.00 ^ clock (in)
                                         clock (net)
[deleted]
     8   76.60   31.86   28.59  507.97 ^ clkbuf_3_1_0_clock/Y (BUFx24_ASAP7_75t_R)
                                         clknet_3_1_0_clock (net)
                 88.41   25.42  533.39 ^ ces_3_2/clock (Element)

As far as skew is concerned, isn't the 250ps subtracted from 533.39?

So skew is is a much more reasonable -29.81, which is congruent with the order of magnitude of skew that report_clock_skew reports.

533.39 - 250 - 313.20 = -29.81

Worst skew can be negative, or positive, right?

jjcherry56 commented 2 weeks ago

report_clock_skew includes macro internal clock tree delay defined in the macro's liberty (the source/target_clock_tree_delay timing groups), which is reported as source/target clock tree delay. report_checks does not report or use macro internal clock tree delay because the timing checks on the macro's liberty are with respect to the clock pins. You can use report_checks to confirm the latencies, but you have to include the macro clock delays to get the same results. The math is in the report; you are ignoring two of the lines.

tspyrou commented 2 weeks ago

@oharboe please see Cherry's comment. I had forgotten that the element would have that attribute.

tspyrou commented 2 weeks ago

@jjcherry56 thanks for taking a look.

oharboe commented 2 weeks ago

Here is my attempt at getting the same number from report_clock_skew and report_checks...

report_clock_skew:

Clock clock
 313.20 source latency ces_4_2/clock ^
  96.54 source clock tree delay
-283.39 target latency ces_3_2/clock ^
 -77.04 target clock tree delay
  -7.42 CRPR
--------------
  41.89 setup skew

I can find source/target latency, to the clock pin of the macro:

                104.12   30.96  313.20 ^ ces_4_2/clock (Element)

313.20 for source.

                 88.41   25.42  533.39 ^ ces_3_2/clock (Element)

533.39 - 250ps (one clock period) = 283.39

Next, I would have to dive into the Element macro to find the clock network latency from the clock pin to the flip flop, which is different in the source and target paths. However, I didn't include the Element macro in my example above. After doing a fresh build of mock-array, I couldn't figure out how to find which flip flop is driven by the clock that has 96.54 source clock tree delay. Could report_clock_skew be amended to output some more information? Which pin on which register is being the source/target clock tree delay is in relation to(in this case a macro, the Element, but it could also be a flip-flop at the current level).

However, that still leaves unexplained the difference in Skew column and report_clock_skew. Should they not be the same?

If not, what is the Skew column displaying?

The difference in report_clock_skew 41.89 and the Skew column -8.571 is significant.

tspyrou commented 1 week ago

@maliberty can you paste in a pointer to the code that calculates what the GUI is displaying? We are unclear on what it is trying to show.

maliberty commented 1 week ago

@tspyrou we use the API Cherry added PathEnd::clkSkew and just display the return value

oharboe commented 1 week ago

With https://github.com/The-OpenROAD-Project/OpenROAD/pull/5074 I get changes in the Skew column:

image

oharboe commented 1 week ago

There is still a difference between report_clock_skew and the skew column.

Is the skew in the Timing Report measured to the input pins of the element macro?

If so, the Skew column does not take into account the skew within the elements and the skew within the element, which report_clock_skewdoes. The skew within the element is then on the order of 41.89-22.393 = ca. 20ps.

That would make the skew off by ca. 100%?

@tspyrou Thoughts?

oharboe commented 1 week ago

Trying to dive into the Element, I see for these pins a skew of 10ps, which is congruent with report_clock_skew within the Element:

>>> report_checks -through {io_outs_down\[61\]$_DFF_P_/D}
Startpoint: REG_1[61]$_DFF_P_ (rising edge-triggered flip-flop clocked by clock)
Endpoint: io_outs_down[61]$_DFF_P_
          (rising edge-triggered flip-flop clocked by clock)
Path Group: clock
Path Type: max

  Delay    Time   Description
---------------------------------------------------------
   0.00    0.00   clock clock (rise edge)
  81.39   81.39   clock network delay (propagated)
   0.00   81.39 ^ REG_1[61]$_DFF_P_/CLK (DFFHQNx2_ASAP7_75t_R)
  50.25  131.65 v REG_1[61]$_DFF_P_/QN (DFFHQNx2_ASAP7_75t_R)
  15.70  147.35 ^ _0896_/Y (NAND2x1_ASAP7_75t_R)
   0.10  147.45 ^ io_outs_down[61]$_DFF_P_/D (DFFHQNx3_ASAP7_75t_R)
         147.45   data arrival time

 250.00  250.00   clock clock (rise edge)
  72.67  322.67   clock network delay (propagated)
   0.36  323.03   clock reconvergence pessimism
         323.03 ^ io_outs_down[61]$_DFF_P_/CLK (DFFHQNx3_ASAP7_75t_R)
  -8.55  314.48   library setup time
         314.48   data required time
---------------------------------------------------------
         314.48   data required time
        -147.45   data arrival time
---------------------------------------------------------
         167.02   slack (MET)

>>> report_checks -through {REG\[61\]$_DFF_P_/D}
Startpoint: io_ins_down[61] (input port)
Endpoint: REG[61]$_DFF_P_ (rising edge-triggered flip-flop clocked by clock)
Path Group: clock
Path Type: max

  Delay    Time   Description
---------------------------------------------------------
   0.00    0.00 ^ input external delay
   0.00    0.00 ^ io_ins_down[61] (in)
  10.79   10.79 ^ input58/Y (BUFx2_ASAP7_75t_R)
   0.16   10.95 ^ REG[61]$_DFF_P_/D (DFFHQNx2_ASAP7_75t_R)
          10.95   data arrival time

  80.00   80.00   max_delay
  75.10  155.10   clock network delay (propagated)
   0.00  155.10   clock reconvergence pessimism
  -5.53  149.58   library setup time
         149.58   data required time
---------------------------------------------------------
         149.58   data required time
         -10.95   data arrival time
---------------------------------------------------------
         138.63   slack (MET)

>>> report_clock_skew
Clock clock
  82.55 source latency REG_2[3]$_DFF_P_/CLK ^
 -72.18 target latency io_outs_up[3]$_DFF_P_/CLK ^
  -0.36 CRPR
--------------
  10.01 setup skew
oharboe commented 1 week ago

Tried latest:

image