verilog-to-routing / vtr-verilog-to-routing

Verilog to Routing -- Open Source CAD Flow for FPGA Research
https://verilogtorouting.org
Other
1.02k stars 396 forks source link

Problem defining a FF with async reset with timings #617

Open mkurc-ant opened 5 years ago

mkurc-ant commented 5 years ago

I want to define a model of asynchronous reset flip-flop with timings. Those elements are very common to almost all FPGA architectures. There are neither examples of such logic elements in the documentation nor a description how to define them.

I tried to define the model according to the timing definition I have:

FF_timing

I used T_setup and T_hold tags for D to CLK, CE to CLK and CLR to CLK timings. For the CLK to Q delay the T_clock_to_Q is used. The delay between asynchronous reset CLR and the output Q is expressed using delay_constant

The architecture XML and input design BLIF file can be found in the ZIP file: fdce.zip

And the VPR complains that the timing definition is not correct (see below).

Expected Behaviour

The VPR successfully loads and parses the architecture definition.

Current Behaviour

The following error is thrown:

Error 1: ../arch.xml:-1 <pb_type> 'fdce' timing-annotation/<model> mismatch on port 'CLR' of model 'fdce', port is a sequential input with internal combinational connects but has neither min nor max T_clock_to_Q specified

When I add the missing timing then I got another one:

Error 1: ../arch.xml:39 <pb_type> timing-annotation/<model> mismatch on port 'CLR' of model 'fdce', timing annotation specifies combinational connection to port 'OUT' but the connection does not exist in the model

I also tried to directly copy the "Mixed Sequential/Combinational Block" from the "Primitive Block Timing Modeling Tutorial" (I expected that to work) and the following error was thrown:

Error 1: ../arch.xml:34 <pb_type> timing-annotation/<model> mismatch on port 'addr' of model 'single_port_ram_mixed', model specifies no clock but timing annotation specifies 'clk'

Possible Solution

Steps to Reproduce

  1. Unpack the fdce.zip file
  2. Run the run_vpr.sh script.

Context

The problem is that the CLR signal exhibit both synchronous and asynchronous features. It behaves as a combinational w.r.t. the output but there are "recovery" and "removal" timing constraints w.r.t. the clock.

The VPR suggests that I have to add the T_clock_to_Q for CLR? How does that make sense ?

Your Environment

kmurray commented 5 years ago

The problem is that the CLR signal exhibit both synchronous and asynchronous features. It behaves as a combinational w.r.t. the output but there are "recovery" and "removal" timing constraints w.r.t. the clock.

This is an area we haven't had the need to look at in much detail. Your approach of modelling the recover/removal as T_setup/T_hold (synchronous checks) seems reasonable.

At the moment VPR assume any sequential port (i.e. with a clock="") is defining a FF, and has no direct combination connections, but may connect to internal timing paths (as shown here).

This is why you get:

Error 1: ../arch.xml:-1 <pb_type> 'fdce' timing-annotation/<model> mismatch on port 'CLR' of model 'fdce', port is a sequential input with internal combinational connects but has neither min nor max T_clock_to_Q specified

since it is expecting you to define an internal timing path (which needs the T_clock_to_Q of the input flop).

I also tried to directly copy the "Mixed Sequential/Combinational Block" from the "Primitive Block Timing Modeling Tutorial" (I expected that to work)

The example was missing the clock on the sequential ports. Fixed in a2382a4b0.

mkurc-ant commented 5 years ago

@kmurray Ok, so as I understand there is no support for such a structure in the VPR right now. But async set/reset FFs are very common to FPGAs. They are present in Xilinx 7-series, Lattice iCE40 and ECP5. So I'd like to request support for them to be added to the VPR.

The generic structure of the block that I want to be supported should be like that: seq_comb_2

And the timing definition: FF_timing_no_clr_to_clk Td - <delay_constant> Tcq - <T_clock_to_Q> Ts - <T_setup> Th - <T_hold>

This is how I see it to be defined in the architecture XML:

<model name="fdce">                                                         
  <input_ports>                                                             
    <port name="CLR" combinational_sink_ports="Q"/>             
    <port is_clock="1" name="CLK"/>                                         
    <port clock="CLK" name="DIN"/>                                          
    <port clock="CLK" name="CE"/>                                           
  </input_ports>                                                            
  <output_ports>                                                            
    <port clock="CLK" name="Q"/>                                            
  </output_ports>                                                           
</model> 
<pb_type blif_model=".subckt fdce" name="fdce" num_pb="1">                
  <input name="CLR" num_pins="1"/>                                        
  <clock name="CLK" num_pins="1"/>                                        
  <input name="DIN" num_pins="1"/>                                        
  <input name="CE" num_pins="1"/>                                         
  <output name="Q" num_pins="1"/>                                         
  <T_setup port="fdce.CE"  clock="CLK" value="1e-10"/>                    
  <T_hold  port="fdce.CE"  clock="CLK" value="1e-10"/>                    
  <T_setup port="fdce.DIN" clock="CLK" value="1e-10"/>                    
  <T_hold  port="fdce.DIN" clock="CLK" value="1e-10"/>                    
  <T_clock_to_Q port="fdce.Q" clock="CLK" max="1e-10"/>                   
  <delay_constant in_port="fdce.CLR" out_port="Q" max="1e-10"/>           
</pb_type>
kmurray commented 5 years ago

I agree supporting this make sense, someone just needs to do the work.

I think your syntax proposal should actually work as defined with the current code base. The thing which is missing from it are are the recover/removal checks between CLR and CLK.

As I was trying to get at earlier, to specify setup/hold (really recover/removal) on CLR at the moment you must specify clock="CLK" on the CLR port model (so the timing graph builder knows there is a relation to be checked between CLR and CLK). However that is how we identify a port as being sequential which transforms the meaning of the <delay_constant> and combinational_sink_ports from specifying a combinational path between CLR and Q, to specifying an internal timing path between the assumed flop at CLR and Q.

I think what you are looking for is a way to specify a timing check between CLK and CLR (recovery/removal) and a combinational path between CLR and Q (no assumed flop at CLR). Currently we don't have the syntax to support that.

I'd welcome proposals for adding that syntax/support. One of the requirements (for consistency, sanity checking and building the timing graph) is that all the timing edges (dependencies/checks) must be derivable from the <model> specification, so we'd need some way to specify this there -- it would not work to specify this only via the <pb_type>.

mkurc-ant commented 5 years ago

I'll probably be able to help with implementation of that. Though I'll be back with many questions.

The first one is: Does the current implementation of the timing graph (libtatum I presume) supports both recovery/removal on IN -> CLK and combinational delay IN -> OUT ?

kmurray commented 5 years ago

This should only require changes to VPR. We're hitting a limitation of the (current) XML specification.

It is straightforward to add support for this once we can get the spec into VPR by modifying VPR's timing graph builder -- no changes to tatum required.

mkurc-ant commented 5 years ago

@kmurray The way I see it there is no point for specifying Ts/Th from CLR to CLK for async reset of a FF. Assertion of a CLR signal causes the output to change state after a constant delay regardless of the CLK. It's pure combinational.

I think that what has to be changed in the XML specification is to whether we want a register on output of the logic block or not. And I think that this can be inferred from which timings are specified and which are not.

mkurc-ant commented 5 years ago

@kmurray I created a document with my proposal of change to the architecture definition: https://docs.google.com/document/d/1vP5SZgFLhuySW2r7GL92RJCz-WxIZs4QK92Ih0qPBf0

mkurc-ant commented 5 years ago

After giving it some thought I think it might not be be necessary to change the XML syntax. But it will be required to change some meaning of the existing one.