amaranth-lang / amaranth

A modern hardware definition language and toolchain based on Python
https://amaranth-lang.org/docs/amaranth/
BSD 2-Clause "Simplified" License
1.57k stars 174 forks source link

Vivado CDC constraints #227

Closed nmigen-issue-migration closed 5 years ago

nmigen-issue-migration commented 5 years ago

Issue by dlharmon Sunday Sep 22, 2019 at 20:56 GMT Originally opened as https://github.com/m-labs/nmigen/pull/227


Related: #212

This tags the first register in each MultiReg or ResetSynchronizer with the attribute nmigen_async_ff and then applies a false path and max delay constraint to all registers tagged with that attribute in the .xdc file.

The max delay defaults to 5 ns and has an override, max_delay where it can be changed for the > whole project. It's possible to make this an argument to MultiReg instead, but is more complex. > git commit -m "add clock domain crossing constraints on Vivado This tags the first register in each MultiReg or ResetSynchronizer with the attribute nmigen_async_ff and then applies a false path and max delay constraint to all registers tagged with that attribute in the .xdc file.

The max delay defaults to 5 ns and has an override, max_delay where it can be changed for the whole project. It's possible to make this an optional argument to MultiReg instead, but is more complex. It would probably work to set nmigen_async_ff to the desired delay rather than just TRUE. I'm not sure how hard it would be to extract that in the TCL or if it would be easier to keep a dict of all used delay values and put a line for each into the .xdc file.


dlharmon included the following code: https://github.com/m-labs/nmigen/pull/227/commits

nmigen-issue-migration commented 5 years ago

Comment by codecov[bot] Sunday Sep 22, 2019 at 20:58 GMT


Codecov Report

Merging #227 into master will decrease coverage by 0.05%. The diff coverage is 50%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #227      +/-   ##
==========================================
- Coverage   82.38%   82.32%   -0.06%     
==========================================
  Files          33       33              
  Lines        5483     5489       +6     
  Branches     1172     1174       +2     
==========================================
+ Hits         4517     4519       +2     
- Misses        834      836       +2     
- Partials      132      134       +2
Impacted Files Coverage Δ
nmigen/lib/cdc.py 84% <50%> (-6.91%) :arrow_down:

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update da53048...b78ef18. Read the comment docs.

nmigen-issue-migration commented 5 years ago

Comment by dlharmon Sunday Sep 22, 2019 at 21:04 GMT


References: https://forums.xilinx.com/t5/Timing-Analysis/how-to-constrain-a-CDC/td-p/748174

I expect a few more constraint requirements once #40 is merged due to a timing path between the write clock and async outputs on the LUT RAMs. It's a challenge to deal with: https://forums.xilinx.com/t5/Vivado-TCL-Community/set-disable-timing-for-hierarchical-cell/td-p/495808 Alternately, set the clocks to be asynchronous, but getting correct clock names is a challenge.

nmigen-issue-migration commented 5 years ago

Comment by dlharmon Sunday Sep 22, 2019 at 22:15 GMT


Xilinx UG903 is self contradictory and the set_false_path constraint appears to override the set_max_delay constraint. I'm looking into how to solve this.

nmigen-issue-migration commented 5 years ago

Comment by dlharmon Monday Sep 23, 2019 at 02:14 GMT


Ideally, we would use set_max_delay -datapath_only, but that option requires providing -from with the source clock greatly complicating things.

I also noticed that there is a "critical warning" generated when the added lines are present in the .xdc file and there are no flip flops with the nmigen_async_ff attribute.

Direction would be appreciated.

nmigen-issue-migration commented 5 years ago

Comment by whitequark Monday Sep 23, 2019 at 06:55 GMT


This tags the first register in each MultiReg or ResetSynchronizer with the attribute nmigen_async_ff

This isn't directly related to the PR at hand (I'll look more into how Vivado works before commenting here), but I have a question you might answer regarding attributes. nMigen emits some internal attributes like nmigen.hierarchy, nmigen.decoding, src (which should really be nmigen.src), etc. Vivado complains about these. Is here a way to tell it "just ignore any attribute that starts with nmigen", other than the nmigen_async_ff and other semantically important ones?

nmigen-issue-migration commented 5 years ago

Comment by whitequark Monday Sep 23, 2019 at 07:07 GMT


Ideally, we would use set_max_delay -datapath_only, but that option requires providing -from with the source clock greatly complicating things.

Can you elaborate on the exact nature of complication here?

nmigen-issue-migration commented 5 years ago

Comment by whitequark Monday Sep 23, 2019 at 07:48 GMT


Vivado complains about these

Actually, I was wrong. Vivado doesn't mind. ISE complains:

WARNING:Xst:37 - Detected unknown constraint/property "src". This constraint/property is not supported by the current software release and will be ignored.
WARNING:Xst:37 - Detected unknown constraint/property "src". This constraint/property is not supported by the current software release and will be ignored.
WARNING:Xst:37 - Detected unknown constraint/property "nmigen.hierarchy". This constraint/property is not supported by the current software release and will be ignored.
nmigen-issue-migration commented 5 years ago

Comment by dlharmon Monday Sep 23, 2019 at 15:18 GMT


Ideally, we would use set_max_delay -datapath_only, but that option requires providing -from with the source clock greatly complicating things.

Can you elaborate on the exact nature of complication here?

We would either need an i_domain argument to MultiReg and to know the final Verilog name of the clock or some way to find the clock source of the i argument (may not exist if it came from a pin which is OK or could be combinatorial logic from multiple domains). I assumed the latter to be difficult and possibly would add much more complexity than required. If that's not the case, it may provide a bit more flexibility, especially for things like a LUTRAM based FIFO. It might be worth an optional i_domain argument for the LUTRAM FIFO case.

I did find that instead of the set_min_delay -1000 -to <REG>, a better way to do it is to add set_false_path -hold -to <REG>. I'll push that change to this branch after testing it.

I think I'll need to add a dict keyed with (source_clk, max_delay) to platform for keeping track of which constraints need to be added to the .xdc file. Of course, if we don't do per CDC max_delay or set_max_delay -datapath_only -from <X> -to <Y> it could simply be a bool to enable adding to the constraints file.

nmigen-issue-migration commented 5 years ago

Comment by dlharmon Monday Sep 23, 2019 at 15:25 GMT


nMigen emits some internal attributes like nmigen.hierarchy, nmigen.decoding, src (which should really be nmigen.src), etc. Vivado complains about these. Is here a way to tell it "just ignore any attribute that starts with nmigen", other than the nmigen_async_ff and other semantically important ones?

As you found, Vivado does seem to ignore any attribute it doesn't use. It seems safe enough to assume it won't use anything starting with nmigen. Perhaps, I should swap my nmigen_async_ff to nmigen.async_ff for consistency.

I do see Vivado complaining about the init attribute on Verilog wires.

nmigen-issue-migration commented 5 years ago

Comment by whitequark Monday Sep 23, 2019 at 15:30 GMT


We would either need an i_domain argument to MultiReg and to know the final Verilog name of the clock or some way to find the clock source of the i argument (may not exist if it came from a pin which is OK or could be combinatorial logic from multiple domains). I assumed the latter to be difficult and possibly would add much more complexity than required.

We'll have this information once #4 is implemented, but right now this isn't viable.

If that's not the case, it may provide a bit more flexibility, especially for things like a LUTRAM based FIFO. It might be worth an optional i_domain argument for the LUTRAM FIFO case.

I don't understand this statement. We already have r_domain and w_domain in AsyncFIFO. Or do you mean some other kind of FIFO?

I think I'll need to add a dict keyed with (source_clk, max_delay) to platform for keeping track of which constraints need to be added to the .xdc file. Of course, if we don't do per CDC max_delay or set_max_delay -datapath_only -from <X> -to <Y> it could simply be a bool to enable adding to the constraints file.

I would personally opt for a simpler solution: grab the attribute with Tcl and set it, again, with Tcl. I'll throw together a snippet soon so you don't have to do it yourself.

nmigen-issue-migration commented 5 years ago

Comment by whitequark Monday Sep 23, 2019 at 15:30 GMT


Perhaps, I should swap my nmigen_async_ff to nmigen.async_ff for consistency.

Sounds good.

I do see Vivado complaining about the init attribute on Verilog wires.

This is #220. Unfortunately an upstream issue.

nmigen-issue-migration commented 5 years ago

Comment by dlharmon Monday Sep 23, 2019 at 15:44 GMT


If that's not the case, it may provide a bit more flexibility, especially for things like a LUTRAM based FIFO. It might be worth an optional i_domain argument for the LUTRAM FIFO case.

I don't understand this statement. We already have r_domain and w_domain in AsyncFIFO. Or do you mean some other kind of FIFO?

I was referring to MultiReg/FFSynchronizer which is used within AsyncFIFO and doesn't currently get passed the input clock. It's simple enough to add an argument i_domain = None to FFSynchronizer if it's not None add the -from and -datapath_only arguments to set_max_delay, omit set_false_path.

The LUTRAM on 7series has a timing path from the write clock to the async read data. I want to make that a false path / max delay while keeping the path from read address to read data. Having the from clock makes that possible.

nmigen-issue-migration commented 5 years ago

Comment by dlharmon Monday Sep 23, 2019 at 15:46 GMT


I would personally opt for a simpler solution: grab the attribute with Tcl and set it, again, with Tcl. I'll throw together a snippet soon so you don't have to do it yourself.

My knowledge of Tcl is quite limited, so I'll leave that to you.

nmigen-issue-migration commented 5 years ago

Comment by whitequark Monday Sep 23, 2019 at 16:50 GMT


@dlharmon I think I understand how to do this in Tcl, but could you please explain what set_max_delay actually does here? I've been looking at various documentation for quite some time and I still do not understand why it is useful.

nmigen-issue-migration commented 5 years ago

Comment by jordens Monday Sep 23, 2019 at 17:05 GMT


This one is good and what I based the migen async reset synchronizer constraints on.

There are more good posts by that guy in the xilinx forums: http://www.colmryan.org/avrums-clock-domain-crossing-widsom.html

nmigen-issue-migration commented 5 years ago

Comment by dlharmon Monday Sep 23, 2019 at 17:22 GMT


@dlharmon I think I understand how to do this in Tcl, but could you please explain what set_max_delay actually does here? I've been looking at various documentation for quite some time and I still do not understand why it is useful.

The first link mentioned by @jordens does a decent job explaining it.

Simply adding a false path to the flip flop is the easiest, but allows the place and route unlimited delay in that net which is problematic in some cases. For instance, I have a bunch of 250 MHz same frequency, unknown phase shift crossings that need to come out of reset within a few cycles of each other, so a 5 ns max delay works well there.

nmigen-issue-migration commented 5 years ago

Comment by dlharmon Monday Sep 23, 2019 at 17:26 GMT


Screenshot from 2019-09-23 11-26-25

nmigen-issue-migration commented 5 years ago

Comment by whitequark Monday Sep 23, 2019 at 17:32 GMT


For instance, I have a bunch of 250 MHz same frequency, unknown phase shift crossings that need to come out of reset within a few cycles of each other, so a 5 ns max delay works well there.

Do I understand it correctly that set_max_delay reduces (such that it meets the constraint) the uncertainty in phase between different synchronizers?

nmigen-issue-migration commented 5 years ago

Comment by dlharmon Monday Sep 23, 2019 at 17:39 GMT


Do I understand it correctly that set_max_delay reduces (such that it meets the constraint) the uncertainty in phase between different synchronizers?

Yes, that's a great way to put it.

nmigen-issue-migration commented 5 years ago

Comment by whitequark Monday Sep 23, 2019 at 18:26 GMT


@dlharmon Here is my take on adding false path and max delay constraints for Vivado:

diff --git a/nmigen/vendor/xilinx_7series.py b/nmigen/vendor/xilinx_7series.py
index b25c4e9..30dfa64 100644
--- a/nmigen/vendor/xilinx_7series.py
+++ b/nmigen/vendor/xilinx_7series.py
@@ -78,6 +78,10 @@ class Xilinx7SeriesPlatform(TemplatedPlatform):
             {% endfor %}
             {{get_override("script_after_read")|default("# (script_after_read placeholder)")}}
             synth_design -top {{name}} -part {{platform.device}}{{platform.package}}-{{platform.speed}}
+            set_false_path -hold -to [get_cells -hier -filter {nmigen.vivado.false_path == TRUE}]
+            foreach cell [get_cells -hier -filter {nmigen.vivado.max_delay != ""}] {
+                set_max_delay [get_property nmigen.vivado.max_delay $cell] -to $cell
+            }
             {{get_override("script_after_synth")|default("# (script_after_synth placeholder)")}}
             report_timing_summary -file {{name}}_timing_synth.rpt
             report_utilization -hierarchical -file {{name}}_utilization_hierachical_synth.rpt
@@ -367,7 +371,26 @@ class Xilinx7SeriesPlatform(TemplatedPlatform):
                         reset=ff_sync._reset, reset_less=ff_sync._reset_less,
                         attrs={"ASYNC_REG": "TRUE"})
                  for index in range(ff_sync._stages)]
+        flops[0].attrs["nmigen.vivado.false_path"] = "TRUE"
+        flops[0].attrs["nmigen.vivado.max_delay"]  = "5.0" # FIXME
         for i, o in zip((ff_sync.i, *flops), flops):
             m.d[ff_sync._o_domain] += o.eq(i)
         m.d.comb += ff_sync.o.eq(flops[-1])
         return m
+
+    def get_reset_sync(self, reset_sync):
+        m = Module()
+        m.domains += ClockDomain("reset_sync", async_reset=True, local=True)
+        flops = [Signal(1, name="stage{}".format(index), reset=1,
+                        attrs={"ASYNC_REG": "TRUE"})
+                 for index in range(reset_sync._stages)]
+        flops[0].attrs["nmigen.vivado.false_path"] = "TRUE"
+        flops[0].attrs["nmigen.vivado.max_delay"]  = "5.0" # FIXME
+        for i, o in zip((0, *flops), flops):
+            m.d.reset_sync += o.eq(i)
+        m.d.comb += [
+            ClockSignal("reset_sync").eq(ClockSignal(reset_sync._domain)),
+            ResetSignal("reset_sync").eq(reset_sync.arst),
+            ResetSignal(reset_sync._domain).eq(flops[-1])
+        ]
+        return m

It looks like you can't use arbitrary Tcl commands in XDC files (AR#54842), so I moved it to the main script. Please let me know if this works properly.

nmigen-issue-migration commented 5 years ago

Comment by whitequark Monday Sep 23, 2019 at 18:29 GMT


For instance, I have a bunch of 250 MHz same frequency, unknown phase shift crossings that need to come out of reset within a few cycles of each other, so a 5 ns max delay works well there.

So in the diff above I hardcoded max_delay to 5 ns. This of course should not be hardcoded nor use a global override; it should be set per-synchronizer. However, one concern I have is how it should be set. You are using 5 ns, but you are saying that you want the delay to be no longer than several cycles. So, should this perhaps be derived from the clock? Could set_multicycle_path actually be more appropriate for this case? Should we support both?

nmigen-issue-migration commented 5 years ago

Comment by jordens Monday Sep 23, 2019 at 18:31 GMT


Do I understand it correctly that set_max_delay reduces (such that it meets the constraint) the uncertainty in phase between different synchronizers?

It limits the skew and allows for metastability resolution. Between two flip flops in a synchronizer chain in the target clock domain, there is t_clk-t_delay time to resolve metastability. Ergo the delay must be as short as possible and can not be unlimited.

nmigen-issue-migration commented 5 years ago

Comment by whitequark Monday Sep 23, 2019 at 18:34 GMT


It limits the skew and allows for metastability resolution. Between two flip flops in a synchronizer chain in the target clock domain, there is t_clk-t_delay time to resolve metastability. Ergo the delay must be as short as possible and can not be unlimited.

OK. Do I understand correctly that there are two independent purposes served by set_max_delay in this use case then? So even if you don't care about skew, you must limit the delay anyway, if you want the synchronizer to work.

nmigen-issue-migration commented 5 years ago

Comment by dlharmon Monday Sep 23, 2019 at 18:44 GMT


It limits the skew and allows for metastability resolution. Between two flip flops in a synchronizer chain in the target clock domain, there is t_clk-t_delay time to resolve metastability. Ergo the delay must be as short as possible and can not be unlimited.

OK. Do I understand correctly that there are two independent purposes served by set_max_delay in this use case then? So even if you don't care about skew, you must limit the delay anyway, if you want the synchronizer to work.

I have personally just used ASYNC_REG="TRUE" for constraining the latter flip flops in synchronizers. It's supposed to pack them all into a single CLB and minimize timing delays as much as possible. Some do suggest doing set_max_delay on these as well with a small time like 3 ns. Vivado seems to do the right thing without it. One thing to watch out for is that the time set must be sufficiently short for the fastest clock used.

Edit: The first flip flop certainly does need set_max_delay.

I'll try your patch shortly. It looks promising.

nmigen-issue-migration commented 5 years ago

Comment by dlharmon Monday Sep 23, 2019 at 19:00 GMT


It looks like you can't use arbitrary Tcl commands in XDC files (AR#54842), so I moved it to the main script. Please let me know if this works properly.

I merged this with my local copy and built a bitstream. The timing report and constraints look good. I like how this works. Thanks.

nmigen-issue-migration commented 5 years ago

Comment by whitequark Monday Sep 23, 2019 at 19:07 GMT


OK, so now the question is how to make the delay configurable. Here's my proposal:

Opinion?

nmigen-issue-migration commented 5 years ago

Comment by dlharmon Monday Sep 23, 2019 at 19:44 GMT


OK, so now the question is how to make the delay configurable. Here's my proposal:

* `max_input_delay` becomes an argument to the constructor of `FFSynchronizer`/`ResetSynchronizer`;

* It is `None` by default;

* If it is not `None` and it is not supported by the toolchain, an exception is raised.

Opinion?

That seems fine.

nmigen-issue-migration commented 5 years ago

Comment by dlharmon Monday Sep 23, 2019 at 19:49 GMT


Allow the option of a full false path, not just hold only:


diff --git a/nmigen/vendor/xilinx_7series.py b/nmigen/vendor/xilinx_7series.py
index 953229a..62ab94a 100644
--- a/nmigen/vendor/xilinx_7series.py
+++ b/nmigen/vendor/xilinx_7series.py
@@ -79,7 +79,8 @@ class Xilinx7SeriesPlatform(TemplatedPlatform):
             {% endfor %}
             {{get_override("script_after_read")|default("# (script_after_read placeholder)")}}
             synth_design -top {{name}} -part {{platform.device}}{{platform.package}}-{{platform.speed}}
-            set_false_path -hold -to [get_cells -hier -filter {nmigen.vivado.false_path == TRUE}]
+            set_false_path -hold -to [get_cells -hier -filter {nmigen.vivado.false_path == HOLD_ONLY}]
+            set_false_path -to [get_cells -hier -filter {nmigen.vivado.false_path == TRUE}]
             foreach cell [get_cells -hier -filter {nmigen.vivado.max_delay != ""}] {
                 set_max_delay [get_property nmigen.vivado.max_delay $cell] -to $cell
             }
@@ -372,7 +373,7 @@ class Xilinx7SeriesPlatform(TemplatedPlatform):
                         reset=ff_sync._reset, reset_less=ff_sync._reset_less,
                         attrs={"ASYNC_REG": "TRUE"})
                  for index in range(ff_sync._stages)]
-        flops[0].attrs["nmigen.vivado.false_path"] = "TRUE"
+        flops[0].attrs["nmigen.vivado.false_path"] = "HOLD_ONLY"
         flops[0].attrs["nmigen.vivado.max_delay"]  = "5.0" # FIXME
         for i, o in zip((ff_sync.i, *flops), flops):
             m.d[ff_sync._o_domain] += o.eq(i)
@@ -385,7 +386,7 @@ class Xilinx7SeriesPlatform(TemplatedPlatform):
         flops = [Signal(1, name="stage{}".format(index), reset=1,
                         attrs={"ASYNC_REG": "TRUE"})
                  for index in range(reset_sync._stages)]
-        flops[0].attrs["nmigen.vivado.false_path"] = "TRUE"
+        flops[0].attrs["nmigen.vivado.false_path"] = "HOLD_ONLY"
         flops[0].attrs["nmigen.vivado.max_delay"]  = "5.0" # FIXME
         for i, o in zip((0, *flops), flops):
             m.d.reset_sync += o.eq(i)```
nmigen-issue-migration commented 5 years ago

Comment by whitequark Monday Sep 23, 2019 at 19:51 GMT


Allow the option of a full false path, not just hold only

The other way around?

nmigen-issue-migration commented 5 years ago

Comment by dlharmon Monday Sep 23, 2019 at 19:59 GMT


Allow the option of a full false path, not just hold only

The other way around?

My intent was to make it clear that it's a hold only false path in the FFSynchronizer and provide the possibility of making a setup + hold false path.