amaranth-lang / amaranth

A modern hardware definition language and toolchain based on Python
https://amaranth-lang.org/docs/amaranth/
BSD 2-Clause "Simplified" License
1.57k stars 174 forks source link

Please consider an optional reset on MultiReg #44

Closed nmigen-issue-migration closed 5 years ago

nmigen-issue-migration commented 5 years ago

Issue by Wren6991 Wednesday Feb 27, 2019 at 17:34 GMT Originally opened as https://github.com/m-labs/nmigen/issues/37


I was browsing the code to learn the standard library, and bumped into this in cdc.py:

class MultiReg:
    def __init__(self, i, o, odomain="sync", n=2, reset=0):

     ...

        self._regs = [Signal(self.i.shape(), name="cdc{}".format(i),
                             reset=reset, reset_less=True, attrs={"no_retiming": True})
                      for i in range(n)]

I understand there are caveats with resets on CDC capture flops, but sometimes you need it. For example, a pulse-to-pulse synchroniser (or toggle synchroniser) contains a toggling NRZ signal crossing the domains, and the capture flops need a capture-domain reset to avoid spurious toggling observed by the logic when deasserting reset.

platform.get_multi_reg(self) may contain considerable magic for some synthesis flows, and should only need writing once; ideally, something like a pulse-to-pulse synchroniser should be implementable using the existing MultiReg.

A ligh-touch option could be to just add a reset_less=True to MultiReg.__init__, and pass this down.

Happy to be argued with, told to go implement it, or just told I'm wrong :)

nmigen-issue-migration commented 5 years ago

Comment by Wren6991 Wednesday Feb 27, 2019 at 17:34 GMT


On a slightly-related note, are there any plans to flesh out the CDC library with e.g. pulse-to-pulse, Gray code, or ready-ack + multibit? Batteries included!

nmigen-issue-migration commented 5 years ago

Comment by whitequark Wednesday Feb 27, 2019 at 17:41 GMT


I understand there are caveats with resets on CDC capture flops, but sometimes you need it. For example, a pulse-to-pulse synchroniser (or toggle synchroniser) contains a toggling NRZ signal crossing the domains, and the capture flops need a capture-domain reset to avoid spurious toggling observed by the logic when deasserting reset.

Weak yes. I will need to dig in and understand the context here much better. You can help this by providnig compelling real-world examples and extensive documentation. I.e. teach me the use cases.

On a slightly-related note, are there any plans to flesh out the CDC library with e.g. pulse-to-pulse, Gray code, or ready-ack + multibit? Batteries included!

Yes, all this was in Migen and it will get ported. Gray counters are already in.

nmigen-issue-migration commented 5 years ago

Comment by Wren6991 Wednesday Feb 27, 2019 at 19:50 GMT


Weak yes. I will need to dig in and understand the context here much better. You can help this by providnig compelling real-world examples and extensive documentation. I.e. teach me the use cases.

Okay I will try! I don't guarantee any of the below code works, but it's conceptually sound

Pulse to pulse is used to signal the occurrence of some event from one domain to another. The idea is that each 1-clock-wide pulse on the input results in a 1-clock wide pulse on the output, even when the clocks are different speeds. For slow->fast you can do 100% duty cycle at the input, otherwise you require some level of sparseness at the input to avoid dropping pulses.

One use for this is a credit-based DREQ on e.g. a UART with a DMA interface. If you signal a single pulse for each transfer you require, you can guarantee that a DMA, in the fast system clock domain, will never overflow your FIFO, in the slow peripheral clock domain. OTOH if you use the normal level-sensitive DREQ scheme, you need to deassert the DREQ before completely filling your FIFO, to account for the transfers which may be in-flight in the bus CDC between the DMA and the UART. This wastes FIFO space.

(You could alternatively use an async FIFO, but on most SoCs it's normal to run the entire peripheral bus at a much lower frequency than your main interconnect, to save power etc)

In Verilog it looks something like this:

module sync_pulse (
    input wire i_clk,
    input wire i_rst_n,
    input wire i_pulse,

    input wire o_clk,
    input wire o_rst_n,
    output wire o_pulse
);

// Generate toggle on pulse in launching domain

reg i_toggle;

always @ (posedge i_clk or negedge i_rst_n)
    if (!i_rst_n)
        i_toggle <= 1'b0;
    else if (i_pulse)
        i_toggle <= !i_toggle;

// Cross the toggle signal

wire o_toggle;

// equivalent to MultiReg
sync_1bit (
    .clk   (o_clk),
    .rst_n (o_rst_n),
    .i     (i_toggle),
    .o     (o_toggle)
);

// Generate pulse on toggle in capturing domain

reg o_toggle_prev;

always @ (posedge o_clk or negedge o_rst_n)
    if (!o_rst_n)
        o_toggle_prev <= 1'b0;
    else
        o_toggle_prev <= o_toggle;

assign o_pulse = o_toggle != o_toggle_prev;

endmodule

Without a reset on the MultiReg, you may randomly get a pulse on o_pulse when o_rst_n is removed.

If you need full-duty-cycle at the input for fast-slow, and you need backpressure, then one tool is a credit FIFO, which is an async FIFO with no data path. It's used kind of like a semaphore which you up() in one clock domain and down() in the other. This might be a handy special-case of the existing async fifo!

Another trick which requires reset on a MultiReg is a req-ack handshake. This allows a multi-bit signal to cross domains safely, but with some delay and infrequent updates. For example, suppose you have an RTC which is clocked off some fixed slow clock, and you want to sample a raw counter value from a processor whose clock varies.

module sync_sample #(
    parameter W_DATA = 8
) (
    input wire i_clk,
    input wire i_rst_n,
    input wire [W_DATA-1:0] i_samp,

    input wire o_clk,
    input wire o_rst_n,
    output reg [W_DATA-1:0] o_samp
);

reg [W_DATA-1:0] samp_cross;

reg i_req;  // i -> o
wire i_ack; // i <- o

reg o_ack;  // o -> i
wire o_req; // o <- i

// Launching domain

sync_1bit (
    .clk   (i_clk),
    .rst_n (i_rst_n),
    .i     (o_ack),
    .o     (i_ack)
);

always @ (posedge i_clk or negedge i_rst_n) begin
    if (!i_rst_n) begin
        samp_cross <= {W_DATA{1'b0}};
        i_req <= 1'b0;
    end else begin
        if (i_req && i_ack) begin
            i_req <= 1'b0;
        end else if (!(i_req || i_ack)) begin
            i_req <= 1'b1;
            samp_cross <= i_samp;
        end
    end
end

// Capturing domain

sync_1bit (
    .clk   (o_clk),
    .rst_n (o_rst_n),
    .i     (i_req),
    .o     (o_req)
);

always @ (posedge o_clk or negedge o_rst_n) begin
    if (!o_rst_n) begin
        o_samp <= {W_DATA{1'b0}};
        o_ack <= 1'b0;
    end else begin
        o_ack <= o_req;
        if (o_req && !o_ack)
            o_samp <= samp_cross;
    end
end

endmodule

The idea with this one is that you avoid metastability and bit skew by guaranteeing that samp_cross doesn't transition for two o_clks before it is sampled in the o_clk domain. You don't have multiple flops on the output of samp_cross, and in fact these would be harmful. Metastabilities still occur inside of the sync_1bit modules, but that's safe due to the way the handshake progresses (it's effectively a 2-bit gray counter with one bit in each clock domain).

An async FIFO is unsuitable here: you are interested in sampling a single value of the signal now-ish, not reading in some buffered sequence.

Note that this is robust wrt the ordering of reset removal:

However, if there is no reset on the MultiReg, there may be a period after reset removal where the synchronisers contain garbage, and the req and ack signals will toggle somewhat randomly. This may cause samp_cross to transition too close to being captured by o_samp, which is fatal. It's cured by giving the block a few clocks whilst it's in reset, but still a dangerous footgun.

Sorry for the verbosity :( this (PDF) is a fairly concise document that gives a bit of a who's who of clock crossing tricks, and has diagrams.

nmigen-issue-migration commented 5 years ago

Comment by Wren6991 Thursday Feb 28, 2019 at 09:19 GMT


Yes, all this was in Migen and it will get ported. Gray counters are already in.

Okay, noted, I will go check out the Migen standard library and COMPAT_SUMMARY.md before bothering you further :)

Would working on the libraries be helpful, or are people mostly focusing on the core language at the moment? I would be happy to put time into implementing missing features from Migen, or verifying/reviewing the parts that currently exist in nMigen but are unverified. I saw that nMigen has some first-class FV support, which is one reason for my interest!

nmigen-issue-migration commented 5 years ago

Comment by Wren6991 Sunday Mar 03, 2019 at 16:02 GMT


Reading back I think I misunderstood what you asked for, and that you already knew 99% of what I said. That's pretty bad, sorry.

However, even if e.g. the Migen PulseSynchroniser was FPGA-proven, there are ways for this kind of issue to hide:

None of these are universally true. Also note page 17 on the Cummings paper referenced in nmigen/fifo.py: he explicitly resets the pointer sync registers, whereas the AsyncFIFO class uses a non-resettable MultiReg.

nmigen-issue-migration commented 5 years ago

Comment by whitequark Sunday Mar 03, 2019 at 18:30 GMT


Would working on the libraries be helpful, or are people mostly focusing on the core language at the moment? I would be happy to put time into implementing missing features from Migen, or verifying/reviewing the parts that currently exist in nMigen but are unverified. I saw that nMigen has some first-class FV support, which is one reason for my interest!

Working on the libraries is definitely helpful! There's a lot of stuff that needs to be ported from Migen, documentation improved, etc. I will likely have opinions on how it should be implemented, but help is very welcome here!

Reading back I think I misunderstood what you asked for, and that you already knew 99% of what I said. That's pretty bad, sorry.

That's OK. I will now try to summarize what you said. "MultiReg is used for safely sampling signals across clock domains. When the sampled value is used within a sequence of events triggered by something else (which does not happen shortly after a reset), this is OK. But when the sampled value is used itself to trigger other events, this can result in misfires." Is this correct?

nmigen-issue-migration commented 5 years ago

Comment by Wren6991 Sunday Mar 03, 2019 at 21:13 GMT


Working on the libraries is definitely helpful! There's a lot of stuff that needs to be ported from Migen, documentation improved, etc. I will likely have opinions on how it should be implemented, but help is very welcome here!

Okay cool! I will start digging into Migen a little bit more. There are plenty of other things which are ubiquitous but kind of tricky, like clock control.

Opinionated design is good!

That's OK. I will now try to summarize what you said. "MultiReg is used for safely sampling signals across clock domains. When the sampled value is used within a sequence of events triggered by something else (which does not happen shortly after a reset), this is OK. But when the sampled value is used itself to trigger other events, this can result in misfires." Is this correct?

Yes you summed that up much better than I could. MultiReg is dangerous if the output value is observed immediately after reset.

nmigen-issue-migration commented 5 years ago

Comment by whitequark Sunday Mar 10, 2019 at 03:22 GMT


Thanks. Could you send a PR for reset on MultiReg, together with an explanation of the cases where it would be useful, perhaps excerpted from the above? I would appreciate it.

In general, explaining how to use nMigen primitives soundly is very important, and all documentation work on that front is greatly appreciated. You are also good at following existing style, so I feel comfortable knowing that the patches you submit would be quick to review, which I'm very grateful for, given the amount of workload and backlog I have.

nmigen-issue-migration commented 5 years ago

Comment by Wren6991 Tuesday Mar 19, 2019 at 00:31 GMT


Sorry, I spaced out for an entire week there. Brain has now rebooted. Will add a PR!