Closed RobertBaruch closed 1 year ago
What you're hitting is definitely a bug. I'll need a bit of time to dig into it and figure out exactly what's happening.
I'm grateful that you're looking into this. Thank you!!! <3
BTW, I suspect that what should be happening is that Past, Fell, Rose, Stable, and Sample should be forbidden in the combinatorial domain, based on Claire's documentation for SymbiYosys:
•
always @(posedge <clock>)
The second form of immediate assertion is one within a clocked always block. This form of assertion is required when attempting to use the$past, $stable, $changed, $rose
, or$fell
SystemVerilog functions discussed in the next section.
As for Past relative to local clocks, I think I should be using the gclk
attribute is called for, as in the SymbiYosys docs:
Accessing the formal timestep becomes important when verifying code in any asynchronous context... All of this requires the multiclock on line in the SBY options section. It also requires the
(* gclk *)
attribute. To use(* gclk *)
, define a register with that attribute...
I remember that there was a way to specify signal attributes in nMigen, but I can't for the life of me remember how...
@RobertBaruch You may find this helpful as to "how does the presence/lack of multiclk
interact with a single/multiple clocks"?
I don't know offhand if it helps with your specific problem, but I do remember having a lot of qs about multiclk
so I wrote down my notes.
@cr1901 It's pretty much what I concluded. At issue here is, I think, nMigen, not SymbiYosys, although I could be wrong.
BTW, I replaced m.d.comb
with m.d.sync
for the cover and asserts, and then I generated the verilog via generate -t v
, and manually edited the file to include (* gclk *)
just before input clk;
. The thing then passed verification under multiclock on
, as expected.
I replaced
m.d.comb
withm.d.sync
for the cover and asserts
Past
doesn't really work inside m.d.comb
, since it takes the domain for the inner register from m.d.<domain>
. That's either a part of this bug, or its entirety. I thought I emitted a diagnostic for that, but evidently not.
Yes, that was also my conclusion. So aside from outputting that diagnostic, the second issue here is that we need a way to specify the gclk
annotation for the default sync
domain. If I could do that without editing the output file ;) I would be unblocked.
Here is the updated python example. Note I replaced one of the Past
s with Past(..., 2)
which is correct when using the global verification clock.
from nmigen import Signal, Module, Elaboratable, ClockDomain, Mux, ClockSignal
from nmigen.build import Platform
from nmigen.asserts import Assert, Fell, Past, Cover
from nmigen.cli import main_parser, main_runner
# Simple example that clocks data in into data out when the
# independent signal -- assumed to be asynchrnous with any
# other clock in the system -- has a negative edge.
#
# Run with:
#
# python example.py generate -t il > toplevel.il
# sby -f example.sby
class Example(Elaboratable):
def __init__(self):
self.data_in = Signal(32)
self.data_out = Signal(32)
self.le = Signal()
def elaborate(self, _: Platform) -> Module:
m = Module()
internal_clk = ClockDomain("internal_clk", clk_edge="neg", local=True)
m.domains.internal_clk = internal_clk
internal_clk.clk = self.le
m.d.internal_clk += self.data_out.eq(self.data_in)
return m
def formal():
"""Formal verification for the example."""
parser = main_parser()
args = parser.parse_args()
m = Module()
m.submodules.example = example = Example()
m.d.sync += Cover((example.data_out == 0xAAAAAAAA) & (example.le == 0)
& (Past(example.data_out) == 0xBBBBBBBB)
& (Past(example.le, 2) == 0))
with m.If(Fell(example.le)):
m.d.sync += Assert(example.data_out == Past(example.data_in))
main_runner(parser, args, m, ports=[
example.data_in,
example.le,
])
if __name__ == "__main__":
formal()
Maybe put setattr -set gclk 1 w:clk
in your Yosys script for the time being?
Hmm, did I put it in the right place? BMC is still failing:
[tasks]
cover
bmc
[options]
bmc: mode bmc
cover: mode cover
depth 10
multiclock on
[engines]
smtbmc z3
[script]
read_ilang toplevel.il
prep -top top
setattr -set gclk 1 w:clk
[files]
toplevel.il
Sorry, I forgot you're in a design context. Try setattr -set gclk 1 top/w:clk
.
Nope, still fails.
Can you upload your IL so I can directly try with that?
Indeed!
I think the command I suggested works correctly; it's more that the annotation doesn't seem to help. Try adding attribute \gclk 32'1
before wire width 1 input 2 \clk
in the IL file; that still doesn't pass BMC for me.
Hmm. Here's the .v file if that helps, along with the sby for that. Adding (* gclk *)
before the clk signal causes BMC to pass.
[tasks]
cover
bmc
[options]
bmc: mode bmc
cover: mode cover
depth 10
multiclock on
[engines]
smtbmc z3
[script]
read -formal toplevel.v
prep -top top
[files]
toplevel.v
Yeah, I can reproduce that. It's really confusing! Are you sure it passes correctly when you go through Verilog?
I'm not sure how I could tell. I do know that the Cover trace looks absolutely correct.
Ok so this is something about Verilog. This is incredibly stupid, but works:
[tasks]
cover
bmc
[options]
bmc: mode bmc
cover: mode cover
depth 10
multiclock on
[engines]
smtbmc z3
[script]
read_ilang toplevel.il
setattr -set gclk 1 top/w:clk
write_verilog toplevelx.v
design -reset
read_verilog -sv toplevelx.v
prep -top top
[files]
toplevel.il
Yikes. There are a lot of intermediate wires in the cover trace now... but I guess I'll take it? Oof!
Also, special thanks to you. I hope this does get fixed without having to do a rewrite, but in the meantime I am definitely unblocked. Thank you!!!
Yikes. There are a lot of intermediate wires in the cover trace now... but I guess I'll take it? Oof!
To be clear this isn't actually a solution I suggest. (Though I think if you use write_verilog -norename
the traces will become more bearable.) There's definitely an insidious bug somewhere.
Okay, I figured it out. The gclk
attribute causes the Yosys AST frontend to change the netlist such that $dff
cells connected to that clock become $ff
cells.
Try this:
[script]
read_ilang toplevel.il
prep -top top
clk2fflogic top/w:clk %co
No, that fails BMC :(
Oh yeah sorry, let me figure out why.
@RobertBaruch Ok well that took longer than expected...
[tasks]
cover
bmc
[options]
bmc: mode bmc
cover: mode cover
depth 10
multiclock on
[engines]
smtbmc z3
[script]
read_verilog <<END
module \$dff (CLK, D, Q);
parameter WIDTH = 0;
parameter CLK_POLARITY = 1'b1;
input CLK;
input [WIDTH-1:0] D;
output reg [WIDTH-1:0] Q;
\$ff #(.WIDTH(WIDTH)) _TECHMAP_REPLACE_ (.D(D),.Q(Q));
endmodule
END
design -stash dff2ff
read_ilang toplevel.il
proc
techmap -map %dff2ff top/w:clk %co*
prep -top top
[files]
toplevel.il
Oof. Well, it does work. This is still a workaround, though, right?
Indeed, and a remarkably gross one. Basically what you want is to do s/posedge \clk/global/
in the input RTLIL file (try it, that would work...) The (*gclk*)
attribute, $global_clock
and all that stuff does nothing whatsoever on RTLIL level.
However it's not entirely clear to me yet what is to be done on nMigen language level. Maybe the sync
clock domain on the formal platform should translate in a way that it uses the global clock.
@RobertBaruch Per discussion with @cr1901, the tentative nMigen platform level solution would be to avoid exposing the global clock at all, but instead add an assumption that at least one clock signal of every unbound (i.e. left to the mercy of the platform) clock domain changes state per global clock cycle.
Something like this Verilog pseudocode:
wire [N:0] clocks_curr = {clk1,clk2,clk3,...,clkN};
reg [N:0] clocks_prev;
always @($global_clock)
clocks_prev <= clocks_curr;
assume (|(clocks_curr^clocks_prev));
Hmm. I'm not sure I follow? In my example, really the only clock I had was that local clock, so it isn't exposed to the top-level module. The other clock was the default clock (sync
), which was tied to the global clock. So in the above, you can't assume that the local clock changes state on every global clock cycle. What I really wanted to write was something like:
m.d.global_clock += Assert(something something Past(something) Fell(something) something)
In my example, really the only clock I had was that local clock, so it isn't exposed to the top-level module. The other clock was the default clock (
sync
), which was tied to the global clock.
The local clock in your example is bound (i.e. you created a ClockDomain
for it). The sync
clock is unbound, and would end up being driven in the way I mentioned earlier. Does this make it clearer?
Oh -- I didn't realize what the difference was between a bound and unbound clock. Still not sure I follow the solution. How does this work in the different cases of multiclock off
and multiclock on
?
@RobertBaruch I'm still unsure about the problem you are seeing. However, since you're asking about multiclock on/off
I will try to explain the net effect of multiclock on
/multiclock off
:
multiclock on
treats all clock signals in the synchronous domains in your design as explicitly under the control of $global_clock
. Pretend your design has n
clock inputs. From the point of view of the proof, at each tick of $global_clock
the SMT solver can choose to toggle, 0, 1... n
of the clocks.
multiclock off
treats all clock signals as having an implicit active edge each time $global_clock
ticks. With multiclock off
, the inactive edge of all n
input clocks and derived clocks1 (IIRC) are abstracted away by the SMT proof b/c $global_clock
doesn't really have a concept of active/inactive edge. All clock domains have their active edge on each $global_clock
tick. This is almost certainly not what you want, so multiclock off
is basically inappropriate for all designs except the case of "single clock domain with only a single active edge."
When multiclock
is on
, because the clock signals are explicit inputs that can be altered at will by the SMT solver at each $global_clock
tick, an SMT solver can always2 generate a counterexample during k
-induction by holding all clocks at their current value in a valid state until k - 1
$global_clock
ticks have passed, and then toggling a clock which causes a transition into a failing state at the k
th $global_clock
tick.
A workaround is to make sure all the clocks under control of $global_clock
to toggle "eventually" using assume
s so the SMT solver doesn't have a choice but to enable forward-progress in a design. But we want a pretty lenient version of "eventually" to test a design under a variety of clocks whose periods and phases (edges) differ wrt to each other. The Verilog psuedocode @whitequark provided is one possible solution that looks like is lenient without allowing the SMT solver to just completely halt progress to generate a counterexample.
By derived clock, think of a clock divider module
in Verilog using FFs, for instance- both edges of the output clock are derived from the active edge of an input clock when a counter reaches specific values.
Unless your design passes k
-induction with k
=2? I'm not sure. The net effect of holding the clocks in a single state is to reduce the induction length to 2 by... just not transitioning out of the current state until the last possible second :D. That sounds like an unrealistic induction length in many cases though.
Haven't read your reply yet, but now I'm thinking that Past
et al. should be allowed in m.d.comb. I think in the RTLIL you're implementing Past
as a signal that, on every tick of the global clock, saves the value of the current signal:
thing = Signal()
past_thing = Signal()
m.d.sync += past_thing.eq(thing)
So, there is no reason not to allow Past
in comb:
m.d.comb += Assert(past_thing == thing)
@cr1901 I think I see what you're saying. The upshot is that with multiclock on
we want to ensure that at least one clock has a transition. I honestly thought that's what the solver actually did, without needing to be told.
@RobertBaruch Past
can be kinda screwy, so I wouldn't be surprised if there's a semantic difference between adding assume
s* (typo?) in your code snippet and wq's code snippet to implement "forward progress must happen".
The upshot is that with multiclock on we want to ensure that at least one clock has a transition.
Correct. SystemVerilog has an s_eventually
, to generalize "something must happen", but only the AIGER backend of sby
supports it, and last time I tried to understand "how do you implement the concept of 'something eventually happens' in a SAT solver?", I failed :).
I honestly thought that's what the solver actually did, without needing to be told.
You don't always need to tell the solver to do this, because sometimes even without such constraints on the clocks, the solver can't halt progress. See multiclk.v, which has no such assume
s yet works fine if multiclk
is on
.
^@whitequark This is something I didn't consider. I'm not sure if it's possible to autodetect "is the solver going to be cute and eliminate forward progress?".
Haven't read your reply yet, but now I'm thinking that
Past
et al. should be allowed in m.d.comb.
That's not possible. Semantically, Past
generates a register chain clocked by the domain of the lexically containing expression. I.e. m.d.<domain> += (... Past(...) ...)
clocks the FFs implicit in the Past
expression with <domain>
's clock. Within m.d.comb
there's no such domain, so there's nothing to clock the FFs with.
I'm not sure if it's possible to autodetect "is the solver going to be cute and eliminate forward progress?"
Why do we need to detect that?
This model seems somewhat broken by statements like with m.If(Past(...)):
, where there is no lexically containing expression. I guess, in this case, you should really need to pass the domain explicitly, like with m.If(Past(..., domain=<domain>)):
Also, I don't see much harm in allowing m.d.comb += Assert(Past(..., domain=<domain>))
, or even m.d.<domain1> += Assert(Past(..., domain=<domain2>))
, given that the domains are explicit. It just means that the Past FFs on one domain are sampled combinatorially, or sampled in another domain.
Maybe the formal clock could be accessed like with m.If(Fell(..., gclk=True)):
and m.d.comb += Assert(Past(..., gclk=True))
?
Internally, they would generate $ff
cells in this case.
This model seems somewhat broken by statements like
with m.If(Past(...)):
, where there is no lexically containing expression.
Yes. I lifted Past
from SVA's $past
, where the conditional is lexically contained in the clocked always
block, but of course this doesn't work in nMigen. That has been a mistake and should be fixed, but any change would be breaking existing code to some extent. (I think we can detect the breakage reliably and issue a deprecation diagnostic, so it's not catastrophic.)
Also, I don't see much harm in allowing
m.d.comb += Assert(Past(..., domain=<domain>))
, or evenm.d.<domain1> += Assert(Past(..., domain=<domain2>))
,
Yes. After you mentioned the issue with with m.If(Past(...)):
I think the best course of action is to ditch the current behavior where the domain is magically lifted from the lexically containing statement, make it default to sync
, issue a deprecation warning during the 0.4 cycle when a call to Past
is detected where the behavior would otherwise silently change.
Objections?
Maybe the formal clock could be accessed like
with m.If(Fell(..., gclk=True)):
andm.d.comb += Assert(Past(..., gclk=True))
?
Can you explain when you need to access the formal clock directly like that?
I admit that I don't have a use case myself. But someone else could, for instance, want to describe that an asynchronous input is actually synchronous to some external clock:
with m.If(not Rose(clk1, gclk=True)):
m.d.comb += Assume(Stable(s, gclk=True))
Of course, this is only with multiclock on
, otherwise I can just use any clock domain to access the formal clock, it seems.
I am a fan of allowing to specify the domain in Past
et al. This makes a great deal of sense to me, because now you know for certain which domain Past
is relative to.
I'm also uneasy not allowing access to the global clock. Suppose we have async signals in a multiclock design that do things like this:
glbl | | |
____________________
x __________________|
_______________________________
y _______|
What domain would I put in each Rose
in this assertion:
with m.If(Rose(x)):
m.d.comb += Assert(Rose(y, 2))
The reason I'm asking is that I think that nMigen formal should neither expose $ff
or global clock directly, nor make single-clock/multiclock simulation a choice, for several reasons:
multiclock off
with an nMigen design that has multiple domains is, arguably, unsound, as you just described.clk2fflogic
does—I've tried using it with Robert's example above and BMC still fails. The less we expose directly the better.I think we could perhaps define Rose(sig, domain="comb")
to mean "rose asynchronously". It seems reasonably intuitive and would be mapped to the global clock without tying us to the way Yosys calls/implements it.
I like it! So this would be the right way to write such an assertion:
with m.If(Rose(x, domain="comb")):
m.d.comb += Assert(Rose(y, 2, domain="comb"))
I have another fun use-case that works correctly with the sby file workaround, but I'm eager to try it with the proposed syntax. I've also attached toplevel.il for this.
from nmigen import Array, Signal, Module, Elaboratable, ClockDomain
from nmigen.build import Platform
from nmigen.sim import Simulator, Delay
from nmigen.asserts import Assert, Assume, Cover, Past, Stable, Rose, Fell, AnyConst, Initial
from nmigen.back import rtlil
from nmigen.hdl import Fragment
class AsyncMemory(Elaboratable):
def __init__(self, width: int, addr_lines: int):
assert width > 0
assert addr_lines > 0
assert addr_lines <= 16
self.addr = Signal(addr_lines)
self.n_oe = Signal()
self.n_wr = Signal()
self.data_in = Signal(width)
self.data_out = Signal(width)
# The actual memory
self._mem = Array(
[Signal(width, reset_less=True) for _ in range(2**addr_lines)])
def elaborate(self, _: Platform) -> Module:
m = Module()
# Local clock domain so we can clock data into the memory
# on the positive edge of n_wr.
wr_clk = ClockDomain("wr_clk", local=True)
m.domains.wr_clk = wr_clk
wr_clk.clk = self.n_wr
m.d.comb += self.data_out.eq(0)
with m.If(~self.n_oe & self.n_wr):
m.d.comb += self.data_out.eq(self._mem[self.addr])
with m.If(self.n_oe):
m.d.wr_clk += self._mem[self.addr].eq(self.data_in)
return m
@classmethod
def formal(cls):
m = Module()
m.submodules.mem = mem = AsyncMemory(width=32, addr_lines=5)
# Assume "good practices":
# * n_oe and n_wr are never simultaneously 0, and any changes
# are separated by at least a cycle to allow buffers to turn off.
# * memory address remains stable throughout a write cycle, and
# is also stable just before a write cycle.
m.d.comb += Assume(mem.n_oe | mem.n_wr)
with m.If(Fell(mem.n_oe)):
m.d.comb += Assume((mem.n_wr == 1) & (Past(mem.n_wr) == 1))
with m.If(Fell(mem.n_wr)):
m.d.comb += Assume((mem.n_oe == 1) & (Past(mem.n_oe) == 1))
with m.If(Rose(mem.n_wr) | (mem.n_wr == 0)):
m.d.comb += Assume(Stable(mem.addr))
m.d.comb += Cover((mem.data_out == 0xAAAAAAAA)
& (Past(mem.data_out) == 0xBBBBBBBB))
with m.If(mem.n_oe == 1):
m.d.comb += Assert(mem.data_out == 0)
# Pick an address, any address. We assert that unless
# it is written, its data will not change. To save the written
# data, we need to create a clock domain for it.
saved_data_clk = ClockDomain("saved_data_clk")
m.domains.saved_data_clk = saved_data_clk
saved_data_clk.clk = mem.n_wr
check_addr = AnyConst(5)
data = Signal(32)
saved_data = Signal(32)
m.d.saved_data_clk += saved_data.eq(data)
with m.If(Rose(mem.n_wr)):
m.d.comb += Assert(mem._mem[mem.addr] == Past(mem.data_in))
with m.If(Initial()):
m.d.comb += data.eq(mem._mem[check_addr])
m.d.comb += Assume(mem.n_wr == 1)
with m.Else():
with m.If(Rose(mem.n_wr) & (mem.addr == check_addr)):
m.d.comb += data.eq(mem._mem[check_addr])
with m.Else():
m.d.comb += data.eq(saved_data)
m.d.comb += Assert(saved_data == mem._mem[check_addr])
return m, [mem.addr, mem.data_in, mem.n_wr, mem.n_oe]
if __name__ == "__main__":
design, ports = AsyncMemory.formal()
fragment = Fragment.get(design, None)
output = rtlil.convert(fragment, ports=ports)
with open("toplevel.il", "w") as f:
f.write(output)
To understand why setattr -set gclk 1 top/w:clk
didn't work, I generated the RTLIL of the following Verilog:
module test_gclk(input a);
(* gclk *) reg gclk;
always @(posedge gclk)
assert($past(a));
endmodule
Then, after removing (* gclk *)
in the source file, dumped the RTLIL to a new file.
Sure enough, the following appear on the diff:
+ attribute \gclk 1
As we saw, it made no difference. However, there was also this, further on:
- sync posedge \gclk
+ sync global
I think this was missed, previously. It appears to mark the process as clocked by the formal clock. All flip-flops derived from this will already be the clock-less $ff, and will not be touched by clk2fflogic. I guess this could be used to drive the synchronous logic within the so-called "comb" clock domain.
Correct. You can also use always @($global_clock)
to achieve a similar result directly.
I guess this could be used to drive the synchronous logic within the so-called "comb" clock domain.
I think the Yosys naming here is quite confusing, although I can understand why it ended up this way. What Yosys calls the "global clock" should really be named "solver timestep", and what Yosys calls the "$ff
cell" should really be named in a way that reflects its true nature: establishing causal relationships. You can see this more clearly if you consider how latches could be translated for formal verification: the logic loop in the latch would be split by the $ff
cell, and the forward progress assumption would be extended by "D and Q of any latch are not the same".
In https://github.com/nmigen/nmigen/issues/526#issuecomment-723460118, @RobertBaruch wrote:
I am a fan of allowing to specify the domain in
Past
et al. This makes a great deal of sense to me, because now you know for certain which domainPast
is relative to.
Just in case you are not aware, it is already the case that Past
and friends accept a domain
parameter.
Otherwise, apologies for the misunderstanding.
Just in case you are not aware, it is already the case that
Past
and friends accept adomain
parameter. Otherwise, apologies for the misunderstanding.
Ah, I didn't realize that. So what is really wanted is that domain
also accepts "comb"
.
It seems I can write something like
m.d.comb += Assert(Past(signal) == 0)
. Ifmulticlock
is off in the sby file, what exactly isPast
?Here's an example. First,
example.py
:And, the sby file:
This fails BMC when multiclock is on, which kind of makes sense to me, if Past is relative to the last cycle of the global clock. In this case I probably shouldn't be using Past, because really what I want to assert is that
data_out
is equal to whatdata_in
was, whenle
was last high.In that case, do I have to effectively make my own version of Past for the internal clock of the example module?
BMC succeeds when multiclock is off, which also kind of makes sense to me, if signals are not allowed to change except on the positive edge of the global clock. However, now the cover trace looks distinctly odd:
data_out
did change on the negative edge of 'le'... but it also changed whenle
was stable. Is that because I've violated the assumption that there is only one clock in the system and therefore yosys does unspecified behavior?