Support falling edge clocks (for FFs and timing analysis)

mkurc-ant commented 2 years ago

Currently VPR has no notion of falling edge clocks. All clocks are implicitly assumed to be rising edge so is clock edge sensitivity of flip-flops / latches. Static timing analysis does not support falling edges as well.

Proposed Behaviour

There is a way to specify clock edge for clock ports in architecture models, VPR performs timing driven P&R aware of clock sink edge sensitivity, VPR can write post-synthesis (its actually post P&R) netlist + SDF files with correct clock edge sensitivities.

Current Behaviour

None of above is supported

Possible Solution

Add a new property for clock ports - edge,
Add a new internal architecture model for negedge .latch,
Support reading & parsing negedge sensitive .latch instances in BLIF files,
Enhance tatum library so that it can perform STA with presence of negedge clocks,
Add support for writing correct clock edges to Verilog, BLIF and SDF backends for post-synthesis netlists and timing annotations.

Context

Most modern FPGAs have FFs that can be configured to work on either edge of a clock, there are designs that utilize that as well.

mkurc-ant commented 2 years ago

@vaughnbetz FYI, @kmurray could you provide some hints regarding necessary enhancements to libtatum ?

vaughnbetz commented 1 year ago

Talking to Kevin:

Kevin believes Tatum doesn't really need to change.
vpr should make an inverted version of the clock behind the scenes (negative edge, with 180 degree phase shift)
Hook that clock up to the appropriate FF pins when you call Tatum to build the timing graph
All transfers (- to -, + to -, etc.) will be analyzed as the clock and its inverse will be phase related
E.g. make gclk --> gclk_posedge, gclk_negedge
vpr parses the sdc and interprets it and defines clocks for Tatum --> this code should be updated to handle negedge FFs

vaughnbetz commented 1 year ago

Adding @zhaisitong to see if he can help with this.

lpawelcz commented 1 year ago

Hi, I'm currently working on the support for falling edge clocking in VPR. I opened a draft PR to show you what we came up with: https://github.com/verilog-to-routing/vtr-verilog-to-routing/pull/2211. We would greatly appreciate your opinion on this approach.

Please do have in mind that this is still work in progress. The flow is not working entirely yet. At this point it fails after prepacking with errors related to fetching pb_graph_pin.

This PR follows the idea presented in the Possible Solution section in the opening of this issue (https://github.com/verilog-to-routing/vtr-verilog-to-routing/issues/2182#issue-1426857411) With the code from this PR (to be specific, on https://github.com/verilog-to-routing/vtr-verilog-to-routing/pull/2211/commits/505c4468e1e4e3164de77c29f8e691d9d7a999e7) I was able to generate TimingGraph visualizations with triggering edge data (TEdge(number)) on each node (data is invalid due to unfinished second .latch model support) timing_graph We want to store triggering clock edge data in CPIN nodes of each FF. Having that we should be able to check the relations between FFs in the TimingGraph and detect the case when one FF is clocked at rising edge and the second one that depends on the output of the first one is clocked at falling edge (or the other way around). When such case is detected, the timings should be modified somewhere around here.

vaughnbetz commented 1 year ago

I think a cleaner solution is to make an additional clock constraint for timing analysis (see discussion above) -- a phase related clock that is 180 degrees out of phase with the original, and connected to the -ve edge triggered FFs as the timing graph is built. Then tatum will analyze everything correctly with no changes to its code.

This localizes all code changes to parsing/setup routines in vpr, keeping the timing analyzer itself fast and clean.

lpawelcz commented 1 year ago

@vaughnbetz thank you for your response. In that case I will focus on adapting the PR to your solution.

lpawelcz commented 1 year ago

@vaughnbetz, I updated the draft PR with the initial version of falling edge support. Please have a look into what I came up with.

As a proof of concept I modified multiclock.blif design so that FFC latch would be clocked at falling edge. It can be tested with:

./vpr/vpr vtr_flow/arch/timing/EArch.xml vtr_flow/benchmarks/blif/multiclock/multiclock.blif --timing_report_detail debug --echo_file on --gen_post_synthesis_netlist on --pack_verbosity 10

TimingGraph now stores correct triggering edge data in nodes related to specific FFs. This can be easily verified on the timing graph visualization (Nodes in TimingGraph visualization have a new TEdge attribute for debugging purposes): timing_graph

I also added support for writing correct SDF files. FF cells triggered at falling edge are now annotated with negedge keyword instead of posedge.

I haven't added fake inverted clock yet.

From my understanding of the codebase, the internal architecture models are used for storing the configuration of given primitive. In case of FFs VPR assumes that only one configuration exists: FFs clocked at rising edge. After I extended the code in read_blif.cpp to support falling edge-clocked FFs it turned out that VPR uses only one model instance of given type per design. What is even more problematic is that the latch configuration of the last parsed FF is used for the whole flow. That is why a second latch model is needed - we need to store and use both configuration variants of .latch.

Those models are also extensively used in the whole VPR flow. After I added the second model for .latch I noticed that structs: t_pb_type and t_pb_graph_node should store references to both variants of those models when a .latch is processed. In order to do that, structures: t_pb_type and t_pb_graph_node were extended with secondary fields for keeping references for the falling edge-clocked variants of primitive models (t_model) and their ports (t_port).

For the time being I will focus on fixing CI failures.

lpawelcz commented 1 year ago

@vaughnbetz I updated the PR with code that adds support for inverted clocks in timing analysis - this fixed timing metrics violations in QoR tests in CI. We would greatly appreciate your feedback on this PR.

The inverted clocks are added as virtual clocks which are based on existing netlist clocks specified in SDC files with create_clock command. In order to set proper timing relations I had to add a piece of code to tatum that keeps data on which clock domain is inverted version of existing netlist clock. Another modification was to TimingTag propagation mechanism - we don't want to propagate tags from clock sources that are incompatible with given CPIN node (e.g. we don't want to propagate TimingTags related to clock domain based on non-inverted clock when we propagate tags to CPIN node which is marked as triggered at falling edge).

The remaining QoR issues are related to: device_width, device_height, device_grid_tiles, logic_block_area_total, logic_block_area_used, min_chan_width_routing_area_total, crit_path_routing_area_total, crit_path_routed_wirelength, num_clb. I'm working on fixing those.

vaughnbetz commented 1 year ago

Adding @kmurray since it would be great if he could take a look too.

verilog-to-routing / vtr-verilog-to-routing