SystemRDL / PeakRDL-regblock

Generate SystemVerilog RTL that implements a register block from compiled SystemRDL input.
http://peakrdl-regblock.readthedocs.io
GNU General Public License v3.0

Clock gating support #102

Open Blebowski opened 6 months ago

Blebowski commented 6 months ago

Hi,

Does the register map generated by PeakRDL-regblock support clock gating?

I see that each generated register is clock enabled, so inferred clock gating will kick in for individual registers / fields.

However, it would be good to add a configuration option (e.g. enable_clock_gating). With this option, there would be two clocks on each generated register map module: gated_clock and ungated_clock.

Then, within the generated block, there would be logic that determines when the gated clock needs to be active (combining all the HW / SW access signals) and decodes all such combinations into a gated_clock_clk_en output. The user would then instantiate their own clock gater outside of the generated block and connect gated_clock_clk_en, gated_clock and ungated_clock.
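
To make the proposed enable decode concrete, here is a minimal behavioral sketch in Python (a toy cycle-based model, not generated RTL and not how PeakRDL-regblock works today). The names `RegBlockModel`, `sw_req`, `hw_we` and `clk_en` are hypothetical, and giving software writes priority over hardware writes is an arbitrary assumption made only for this example.

```python
from dataclasses import dataclass, field


@dataclass
class RegBlockModel:
    """Toy cycle-based model of a register block fed by a gated clock."""
    regs: list = field(default_factory=lambda: [0, 0, 0, 0])
    gated_edges: int = 0  # clock edges that actually reached the block

    @staticmethod
    def clk_en(sw_req, hw_we):
        # The block needs its clock whenever software accesses it
        # or hardware wants to update any register.
        return bool(sw_req) or any(hw_we)

    def cycle(self, sw_req=False, sw_addr=0, sw_wdata=0, hw_we=None, hw_data=None):
        hw_we = hw_we or [False] * len(self.regs)
        hw_data = hw_data or [0] * len(self.regs)
        en = self.clk_en(sw_req, hw_we)
        if en:
            # Only in enabled cycles does the gated clock tick.
            self.gated_edges += 1
            for i, we in enumerate(hw_we):
                if we:
                    self.regs[i] = hw_data[i]
            if sw_req:
                # Assumption: a software write wins over a hardware write.
                self.regs[sw_addr] = sw_wdata
        return en


blk = RegBlockModel()
for _ in range(8):
    blk.cycle()  # idle cycles: enable stays low, no clock edge
blk.cycle(sw_req=True, sw_addr=1, sw_wdata=0xAB)
blk.cycle(hw_we=[False, False, True, False], hw_data=[0, 0, 7, 0])
```

In this model the external clock gater would pass only two of the ten clock edges, while the register contents end up the same as with a free-running clock.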

This feature would allow hierarchical clock gating that would help with power consumption in ASIC designs.

amykyta3 commented 6 months ago

Modern clock synthesis tools will already implicitly detect and implement this type of power optimization. Similar to the common subexpression elimination algorithm discussed in #103, tools should have no issue detecting that the generated regfile's internal cpuif_req strobe is effectively an access gate to the entire register block for any software-writable field elements. If you see evidence in your synthesis netlists that suggest otherwise then please share it, however this type of automatic power optimization is not new in the EDA industry.

Adding an explicit clock gating mechanism opens up numerous maintainability hazards and complexities that do not seem well justified given the state of modern power optimization that already exists.

Blebowski commented 6 months ago

Regarding https://github.com/SystemRDL/PeakRDL-regblock/issues/103, I sort of agree (despite my personal preference) that the "tool should handle it". Here, however, I think you are not correct. I have not seen DC insert hierarchical clock gating. Maybe FC or Genus do so...

Yes, the tool detects strobes for individual registers and infers a clock gate for a set of flip-flops based on their clock enable. However, this does not work hierarchically: no upper-stage clock gater is inferred that gates the clock for the whole block. This needs to be done by hand, and it is much simpler to do so when the "clock enable" signal for that condition is decoded in the generated register map block.
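
A rough way to see the difference is to count clock-pin events under both schemes. The Python sketch below is a deliberately simplified toy model, not a power estimate: it counts only clock edges arriving at gater and flop clock pins, ignores clock-tree buffers and cell power, and the block size and workload are assumed values chosen for illustration.

```python
def clock_pin_events(accesses, n_regs, flops_per_reg, hierarchical):
    """Count clock-pin events over a trace.

    accesses: one entry per cycle, each the set of register indices
    whose clock enable is active in that cycle (empty set = idle).
    """
    events = 0
    for enabled in accesses:
        if hierarchical:
            events += 1  # the block-level gater sees every clock edge
            if not enabled:
                continue  # idle cycle: nothing downstream toggles
        events += n_regs  # every per-register gater sees the edge
        events += len(enabled) * flops_per_reg  # flops of enabled registers
    return events


# Assumed workload: 1000 cycles, register 0 touched once every 100 cycles.
trace = [{0} if c % 100 == 0 else set() for c in range(1000)]
flat = clock_pin_events(trace, n_regs=32, flops_per_reg=32, hierarchical=False)
hier = clock_pin_events(trace, n_regs=32, flops_per_reg=32, hierarchical=True)
```

Under these assumptions `flat` is dominated by the per-register gaters that still receive the clock in every idle cycle, which is exactly the activity an upper-stage gater removes.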

For power in general, the savings come from the following (most significant first):

  1. Savings due to the system architecture and decomposition (functional power-gating, clock gating for un-needed blocks, sleep modes, etc...).
  2. Savings due to good RTL design and using explicit clock gating where needed (e.g. clock distribution to individual IPs, and really un-gating clocks for large datapaths with many flip-flops only when they functionally need it)
  3. Savings due to the tool doing good optimizations.

Or maybe put it like this: the earlier you tackle the problem, re-think it and design it well, the better the result you will get. Relying solely on the tool is not a good idea...

For us this might be a "go" / "no-go" feature for PeakRDL. Maybe in summer I will have time to look at implementing this.

Blebowski commented 6 months ago

Would you accept a contribution of this feature, assuming that it would not pollute the PeakRDL code base? It would affect the "reg-block" tool only.

Blebowski commented 6 months ago

Looking at the reference manual, there are some hints that DC could be doing this multi-stage; I need to check. However, I would still prefer doing this by hand.