Open mkurc-ant opened 2 years ago
@vaughnbetz FYI, @kmurray could you provide some hints regarding necessary enhancements to libtatum
?
Talking to Kevin:
Adding @zhaisitong to see if he can help with this.
Hi, I'm currently working on the support for falling edge clocking in VPR. I opened a draft PR to show you what we came up with: https://github.com/verilog-to-routing/vtr-verilog-to-routing/pull/2211. We would greatly appreciate your opinion on this approach.
Please do have in mind that this is still work in progress. The flow is not working entirely yet. At this point it fails after prepacking with errors related to fetching pb_graph_pin
.
This PR follows the idea presented in the Possible Solution
section in the opening of this issue (https://github.com/verilog-to-routing/vtr-verilog-to-routing/issues/2182#issue-1426857411)
With the code from this PR (to be specific, on https://github.com/verilog-to-routing/vtr-verilog-to-routing/pull/2211/commits/505c4468e1e4e3164de77c29f8e691d9d7a999e7) I was able to generate TimingGraph visualizations with triggering edge data (TEdge(number)
) on each node (data is invalid due to unfinished second .latch
model support)
We want to store triggering clock edge data in CPIN
nodes of each FF. Having that we should be able to check the relations between FFs in the TimingGraph
and detect the case when one FF is clocked at rising edge and the second one that depends on the output of the first one is clocked at falling edge (or the other way around). When such case is detected, the timings should be modified somewhere around here.
I think a cleaner solution is to make an additional clock constraint for timing analysis (see discussion above) -- a phase related clock that is 180 degrees out of phase with the original, and connected to the -ve edge triggered FFs as the timing graph is built. Then tatum will analyze everything correctly with no changes to its code.
This localizes all code changes to parsing/setup routines in vpr, keeping the timing analyzer itself fast and clean.
@vaughnbetz thank you for your response. In that case I will focus on adapting the PR to your solution.
@vaughnbetz, I updated the draft PR with the initial version of falling edge support. Please have a look into what I came up with.
As a proof of concept I modified multiclock.blif
design so that FFC
latch would be clocked at falling edge. It can be tested with:
./vpr/vpr vtr_flow/arch/timing/EArch.xml vtr_flow/benchmarks/blif/multiclock/multiclock.blif --timing_report_detail debug --echo_file on --gen_post_synthesis_netlist on --pack_verbosity 10
TimingGraph
now stores correct triggering edge data in nodes related to specific FFs. This can be easily verified on the timing graph visualization (Nodes in TimingGraph
visualization have a new TEdge
attribute for debugging purposes):
I also added support for writing correct SDF files. FF cells triggered at falling edge are now annotated with negedge
keyword instead of posedge
.
I haven't added fake inverted clock yet.
From my understanding of the codebase, the internal architecture models are used for storing the configuration of given primitive. In case of FFs VPR assumes that only one configuration exists: FFs clocked at rising edge. After I extended the code in read_blif.cpp
to support falling edge-clocked FFs it turned out that VPR uses only one model instance of given type per design. What is even more problematic is that the latch configuration of the last parsed FF is used for the whole flow. That is why a second latch model is needed - we need to store and use both configuration variants of .latch
.
Those models are also extensively used in the whole VPR flow. After I added the second model for .latch
I noticed that structs: t_pb_type
and t_pb_graph_node
should store references to both variants of those models when a .latch
is processed.
In order to do that, structures: t_pb_type
and t_pb_graph_node
were extended with secondary fields for keeping references for the falling edge-clocked variants of primitive models (t_model
) and their ports (t_port
).
For the time being I will focus on fixing CI failures.
@vaughnbetz I updated the PR with code that adds support for inverted clocks in timing analysis - this fixed timing metrics violations in QoR tests in CI. We would greatly appreciate your feedback on this PR.
The inverted clocks are added as virtual clocks which are based on existing netlist clocks specified in SDC files with create_clock
command. In order to set proper timing relations I had to add a piece of code to tatum
that keeps data on which clock domain is inverted version of existing netlist clock. Another modification was to TimingTag propagation mechanism - we don't want to propagate tags from clock sources that are incompatible with given CPIN
node (e.g. we don't want to propagate TimingTags related to clock domain based on non-inverted clock when we propagate tags to CPIN
node which is marked as triggered at falling edge).
The remaining QoR issues are related to: device_width, device_height, device_grid_tiles, logic_block_area_total, logic_block_area_used, min_chan_width_routing_area_total, crit_path_routing_area_total, crit_path_routed_wirelength, num_clb. I'm working on fixing those.
Adding @kmurray since it would be great if he could take a look too.
Currently VPR has no notion of falling edge clocks. All clocks are implicitly assumed to be rising edge so is clock edge sensitivity of flip-flops / latches. Static timing analysis does not support falling edges as well.
Proposed Behaviour
There is a way to specify clock edge for clock ports in architecture models, VPR performs timing driven P&R aware of clock sink edge sensitivity, VPR can write post-synthesis (its actually post P&R) netlist + SDF files with correct clock edge sensitivities.
Current Behaviour
None of above is supported
Possible Solution
edge
,.latch
,.latch
instances in BLIF files,tatum
library so that it can perform STA with presence of negedge clocks,Context
Most modern FPGAs have FFs that can be configured to work on either edge of a clock, there are designs that utilize that as well.