Closed ghost closed 8 years ago
Oh golly, looks like the synchronous things end up lose, too...
Testcases for the async latches. These three should produce /functionally/ equivalent output. inf_latc.v inf_latc2.v alatch.v
For the sync latches: These two should produce (mostly) identical netlists. target.v target_m.v (the second one has all the RD<=RD lines removed)
This patch fixes it all: latch_ff.patch (don't apply that other one. it's not sufficient.)
Looking at your patch, you appear to be synthesising a latch using a mux with a feedback loop. I'm not convinced this is the right thing to do - it could be susceptible to races and glitches. I suggest it would be better to add support for and synthesise to a LPM latch module (similar to the way flip-flops are handled).
Steve, what's your opinion on this?
If defined structurally, it's not easily detectable. (alatch.v uses other kinds of gates). How badly they glitch also depends on technology. I believe they can be synthesized later using a topo sort + cycle detect. May need a sat solver, too. There was a bit of code I had read a while ago, but I can't seem to find it again.... I think it was abc.cc in yosys. function handle_loops().
PS: If the async sets/resets don't work, you may need those two patches: https://github.com/klammerj/iverilog/commit/6fb77b17157e3180d97823254426cd4c0837fe3c https://github.com/klammerj/iverilog/commit/f29b0237d1e0a84e939c5dbbf1e3bb1a1624e9cc (I could not get them in earlier coz it broke a test. believed fixed now.)
If latches are to be supported, they should be supported directly and a new LPM device should be created. Seing how everybody is all about synchronous logic, it never occurred to me to actually create an LPM latch, so that will need to be added.
On 1/30/16 3:53 AM, martinwhitaker wrote:
Looking at your patch, you appear to be synthesising a latch using a mux with a feedback loop. I'm not convinced this is the right thing to do - it could be susceptible to races and glitches. I suggest it would be better to add support for and synthesise to a LPM latch module (similar to the way flip-flops are handled).
Steve, what's your opinion on this?
— Reply to this email directly or view it on GitHub https://github.com/steveicarus/iverilog/issues/89#issuecomment-177160242.
Steve Williams "The woods are lovely, dark and deep. steve at icarus.com But I have promises to keep, http://www.icarus.com and lines to code before I sleep, http://www.picturel.com And lines to code before I sleep."
Everybody is awfully inclusive! You can design and synthesize latch based register blocks that will save a significant amount of space. I had a design in the last decade on 180nm (for analog reasons) where a single define could switch the configuration registers between latches and flip-flops. It was implemented in latches because die area was a premium. Even farther out there some of us actually still design asynchronous state machines though they are usually embedded in analog blocks to prevent the normal digital designers from running away in terror. For me they are hand designed and verified in SPICE though I usually design the basic architecture, etc. in Verilog. Yes most people design using a synchronous methodology because there are tools to make that easier, but latches are supported by most full custom tools and are still part of a synchronous design. Though latches in the flow may not be as well tested.
Latches are often treated like second class citizens, but they do have a place in synchronous design and it's not just the example I described above.
nested ifs+syn ctx are still broken...
@martinwhitaker What I am trying to do for now is:
comb:
a-|\
Q-| |
b-| |----Q
Q-| |
|/
|
s
(just feedback)
sync:
a-|\ ___
Q-| | | |
b-| |----|D Q|
Q-| | >|CK |
|/ |___|
|
s
output Q fed back to input...
They're very similar and I believe that I can reuse the synth_async stuff.
steve may want sthg like this:
+async latch:
a-|\ ___
0-| | | |
b-| |----|D Q|
0-| | -|EN |
|/ |___|
|
s
A second mux to feed EN=0 for the not present inputs (0)
1 for the others...(of course this would glitch too...)
1-|\
0-| |
1-| |----EN
0-| |
|/
|
s
I just tried to fix the condit and casez synth, and I now get new regressions:
diff ./regression_report-devel.txt ./regression_report.txt
2019c2019
< ivlh_textio: Passed.
---
> ivlh_textio: ==> Failed - output does not match gold file.
2110c2110
< vhdl_textio_read: Passed.
---
> vhdl_textio_read: ==> Failed - output does not match gold file.
2143c2143
< case_wo_default: Passed.
---
> case_wo_default: ==> Failed - output does not match gold file.
2147c2147
< casesynth7: Passed - CE.
---
> casesynth7: ==> Failed - CE (no error reported).
2158c2158
< if_part_no_else: Passed.
---
> if_part_no_else: ==> Failed - output does not match gold file.
2172c2172
< synth_if_no_else: Passed.
---
> synth_if_no_else: ==> Failed - output does not match gold file.
2176c2176
< Total=2171, Passed=2165, Failed=6, Not Implemented=0, Expected Fail=0
---
> Total=2171, Passed=2159, Failed=12, Not Implemented=0, Expected Fail=0
The textio stuff happened before and the casesynth is expected, but the others are serious.
it's stuff like this:
localparam foo_default = 2'b00;
always @*
begin
foo = foo_default;
if (en0) foo[0] = in0;
if (en1) foo[1] = in1;
end
a previous assign... How do I figure out if there was one once I'm in the synth func of the next statement? And is there a way to do that without splitting everything in sync and async variants?
I think your second diagram is the proper solution - it can be applied either as a clock enable in the synchronous case or as a gate enable in the asynchronous case. I have an idea how to implement this - will try to find some time this weekend to try it out.
Your last code example is also a problem in the synchronous case, as noted in http://sourceforge.net/p/iverilog/bugs/993/
I would point out though that support for synthesis in iverilog is fairly limited and is not being actively developed. Have you tried yosys?
Not really. Guess I'll have to.... Any Idea if there will be a video available for yosys on the fosdem site? It seems that the encoding has stalled, or they've lost most stuff....
@caryr - I'm intrigued by you latches/flip-flips macro for configuration registers. Can you recommend any websites/books/whatever that discuss using latches for config registers? It's an idea I'd not run away in terror from cos I'd've never even entertained it!
Well, it turned out to be a very large can of worms, but synthesis should now give the correct (albeit not particularly optimised) results for synchronous processes for pretty much any nested if/case statement structure you care to throw at it. Adding support for latches should now be fairly straightforward.
Note that for asynchronous processes, synthesis will infer latches for casez/casex statements unless there's a default clause.
@caryr - the vlog95 target doesn't handle some of the new synthesis output. I'll try to take a look at this later in the week.
I'll try to write something up when I get some free time about the latch/flip-flop stuff, though that's not the scary stuff. It's the asynchronous control systems I have done.
Yes the vlog-95 output can be tricky because it has to translate the synthesized output into something that works with a normal Verilog simulator. The issue is often a case that was not supported so the special code needed to translate it is not there so it either asserts or skips things that it didn't know to look at. Once we get to latches we will also likely have to create a new UDP that models a synthesized latch like was done for the DFF. If you need any help or advice let me know. I plan to look at this now, but I don't have much time so we'll see how far I get.
Just that I get this right. when looking for a value to wire up:
if there's a value assigned use it, otherwise if accumulated_nex_in is driven, use it, otherwise create a loop. Is this roughly correct?
Also: I've noticed that the Nexus::drivers_present() ignores module inputs and inout ports. Is this a bug or don't they count as drivers?
Johann Klammer wrote:
Just that I get this right. when looking for a value to wire up:
if there's a value assigned use it, otherwise if accumulated_nex_in is driven, use it, otherwise create a loop. Is this roughly correct?
Sorry, I don't know what you are asking about. Please give some more context.
Also: I've noticed that the Nexus::drivers_present() ignores module inputs and inout ports. Is this a bug or don't they count as drivers?
I assume here you are talking about top level module ports? From a simulation point of view (and remember that iverilog is primarily a simulator), if the top level module has unconnected input ports, they should be treated as undriven. But this may be wrong for synthesis.
I'm not sure what the accumulated_nexus_out stuff is good for... nevermind... it seems gone now...
I've played with the latest fixes and did some debugging... I tried to get this working:
module top(
(* PAD="1,2,3,4"*)
input wire [3:0] D,
(* PAD="5"*)
input wire EN,
(* PAD="0,0,0,0"*)
output reg [3:0] Q
/* */);
always @(EN,D)
case(EN)
1'b1:
Q=D;
endcase
endmodule
here's my notes:
now to merge in the verilog upstream changes...
something's still out of sync. guess, I'll have to impl the LATCHEs myself...
ideally keep the tgt-atf/slick out of the source tree index...
(u_verilog for now)
but I need the files in there for building and make check..
(but do ivtest when check works ok)
debug: Enable emit debug
inf_latc2.v:10: warning: A latch has been inferred for 'Q'.
inf_latc2.v:10: sorry: Latches are not currently supported in synthesis.
inf_latc2.v:10: error: Unable to synthesize asynchronous process.
2 error(s) in post-elaboration processing.
now implemented NetLatch class and a bit in synth2.cc
the bitmask is still giving me problems... seems it has to be all-ones
to work... somehow it doesn't...
or is because one link floats? (has_floating_input())
(gdb) p nex_in.pin(0).dump_link(cerr,8)
Pin 0 of 6NetBus[top.], dir=PASSIVE
Pin 2 of 6NetMux[top._s5], dir=INPUT <- returns true for this..
wot's _s5?
NetMux (value)
pin3=D
pin2=unconn...(maybe alright...)
but why do I get for input?
I think this is ok, tho..
triggers the tie-off...
it is incomplete, tho'
just adds a pulldown... guess it should not float...
but the all_bits_driven...
merge_parallel masks.. does intersection...
may have2be union.. naah is alright...
he may have coded himself into a corner tho'
because he wants to know if all bits are driven, but that may differ per
statement, and is a bit impossible question to ask...
perhaps the real question to ask is:
are no bits partially driven?
that is, it's ok if some stmt drives none of them
or all of them but never a part.
that question doesn't need a bitfield tho'.. but other stuff does..
now I'm getting (mask.size() == 0)
gotta check the init...
bitmasks[0].size()
yep is zero...
synth2.cc:748
<------> mux_width[idx] = nex_map[idx].wid;
(is the expected 4 here)
921
949
(inits.. prolly missing them both...)
921 hit (all ones)
949 not hit
the merge_parallel_masks!!
(gdb) p top_mask.size()
$6 = 4
from
merge_parallel_masks(bitmasks[mdx], default_masks[mdx]);
init default_masks[mdx]
added...
+asserts..
the mux has still a floating input...
Top
case
assign
patches are appended 0001-NetLatch-class.patch.txt 0002-fixes-for-inf_latc2.v.patch.txt
PS: the mux does /not/ have a floating input. I was unwittingly looking at a stale .net file, because it failed another of my tests due to the added asserts for the bitmask sizes. (Still have to check Condit)
more of the same: 0001-fixes-dff2.v.patch.txt 0002-fix-target_m.v.patch.txt 0003-fixes-targetz_m.v.patch.txt
Also added LPM_LATCH to tgt interface: 0001-Add-LATCH-to-target-interface.patch.txt
I've fixed up the backend to use the new LPM, and my test complete now successfully... prev failed testcases follow.
dff2.v:
/*
* A D-type flip-flop to check synchronous logic works
* correctly.
*/
module dff(
(* PAD="14"*)
output wire q1,
(* PAD="15"*)
output wire q2,
(* PAD="16"*)
output wire q3,
(* PAD="17"*)
output wire q4,
(* PAD="18"*)
output wire f1,
(* PAD="19"*)
output wire f2,
(* PAD="20"*)
output wire f3,
(* PAD="21"*)
output wire f4,
(* PAD="3"*)
input wire d,
(* PAD="1"*)
input wire clk,
(* PAD="2"*)
input wire rstin,
(* PAD="4"*)
input wire ce,
(* PAD="5"*)
input wire oe);
(* PAD="AR"*)
wire rst;
assign rst=rstin;
reg q1a;
reg q2a;
reg q3a;
reg q4a;
buf(f1,q1a);//if there's a buf, fitter makes it combinatorial
buf(f2,q2a);//which means I get the feedback line here
buf(f3,q3a);
buf(f4,q4a);
//those two should reset to zero
notif1(q1,q1a,oe);
bufif1(q2,q2a,oe);
//those should reset to 1
notif1(q3,q3a,oe);
bufif1(q4,q4a,oe);
always @(posedge clk or posedge rst)
if (rst)
q1a <= 1'b1;
else
if(ce)
q1a <= d;
always @(posedge clk or posedge rst)
if (rst)
q2a <= 1'b0;
else
if(ce)
q2a <= d;
always @(posedge clk or posedge rst)
if (rst)
q3a <= 1'b0;
else
if(ce)
q3a <= d;
always @(posedge clk or posedge rst)
if (rst)
q4a <= 1'b1;
else
if(ce)
q4a <= d;
endmodule // dff
target_m.v
/*
somewhat inspired by ELMs PCI board.
I don't think this one works, though...
*/
module pci_tgt(
(*PAD="1"*)
input wire clk,
(*PAD="2,3"*)
input wire [0:1] MATCH,
(*PAD="4"*)
input wire RST,
(*PAD="5,6,7,8"*)
input wire [3:0] CBE,
(*PAD="9"*)
input wire IRDY,
(*PAD="10"*)
input wire FRAME,
(*PAD="?"*)
output wire DEVSEL,
(*PAD="23"*)
output wire TRDY,
(*PAD="?"*)
output reg RD,
(*PAD="?"*)
output wire CE,
(*PAD="AR"*)
output wire AR);
//reg S0;
//reg S1;
//reg S2;
//the state bits
reg [2:0]S;
//`define S {S0,S1,S2}
//reg [2:0]S;
//state machine states
//000 initial(to data,last,atar,busy from: busy,etar)
`define IDLE 3'b000
//101 data phase (this device) to etar,last from init,atar
`define DATA 3'b111
//010 busy(another device) from init, to init
`define BUSY 3'b100
//111 post-addr read-turnaround from init, to data,last
`define ATAR 3'b011
//110 turnaround after last cycle from busy to data
`define ETAR 3'b001
//commands:
//cbe[3::0]= 1100 read mult
`define READM 4'b1100
//cbe[3::0]= 0110 read
`define READ 4'b0110
//cbe[3::0]= 1110 read line
`define READL 4'b1110
//cbe[3::0]= 1111 write&inv
`define WRINV 4'b1111
//cbe[3::0]= 0111 write
`define WRITE 4'b0111
assign AR=~RST;
assign CE=~( (~IRDY) && (S==`DATA));
/*
those are a problem..
one ends up
on the ff, the other one with looped back comb signal
which one it is, is only controllable by putting a buffer between ff and
notif...
with the 150x it may end up wrongly feeding back the ext pin
22V10 can /only/ feed back int sig for FFs. may end up wrong
if I /want/ to fb the ext pin...
args: out,data,control
*/
notif1 (DEVSEL,S[2],((S==`DATA)||(S==`ATAR)||(S==`ETAR)));
notif1 (TRDY, S[2],((S==`DATA)||(S==`ATAR)||(S==`ETAR)));
/*
write
write:
inputs |outputs/state (15 nS later)
|
CLK FRM MATCH |TRDY DEVSEL OE REN S
1 1 | z z 1 1 IDLE
up 0 0 | 0 0 0 0 DATA
up ...DATA (0-n)
up 1 x | 1 1 1 1 ETAR
up 1 x | z z 1 1 IDLE
read
inputs |outputs/state (15 nS later)
|
CLK FRM MATCH |TRDY DEVSEL OE REN S
1 1 | z z 1 1 IDLE
up 0 0 | z z 1 1 ATAR
up x x | 0 0 0 1 DATA
up ...DATA (0-n)
up 1 x | 1 1 1 1 ETAR
up 1 x | z z 1 1 IDLE
what about fast b2b ?
*/
always @(posedge clk or posedge AR)
begin
if(AR)
begin
S <= 0;
RD <= 0;
end
else
begin
case (S)
`IDLE:
begin
if(~FRAME)
begin
if(2'b00==MATCH[0:1])
begin
case(CBE[3:0])
`READ:
begin
S<=`ATAR;
RD<=1;
end
`READM:
begin
// ;wait for turnaround;
S<=`ATAR;
RD<=1;
end
`READL:
begin
// ;wait for turnaround;
S<=`ATAR;
RD<=1;
end
`WRITE:
begin
// ;start data phase;
S<=`DATA;
RD<=0;
end
`WRINV:
begin
// ;start data phase;
S<=`DATA;
RD<=0;
end
default:
begin
S<=`BUSY;
RD<=0;
end
endcase
end
else
begin
// ;no match, ignore;
S<=`BUSY;
RD<=0;
end
end
else
begin
//nothing at all.. stay idle
S<=`IDLE;
RD<=0;
end
end
`BUSY:
begin
// ;busy->idle;
if (FRAME)
begin
S<=`IDLE;
end
else
begin
S<=`BUSY;
end
end
`DATA:
begin
// ;data phase;
// ;last cycle;
if (FRAME)
begin
if(IRDY)
begin
S<=`DATA;
end
else
begin
S<=`ETAR;
end
end
else
begin
S<=`DATA;
end
end
`ATAR:
begin
// ;turnaround->data(read) phase;
S<=`DATA;
end
`ETAR:
begin
// ;final turnaround -> idle;
S<=`IDLE;
end
default:
begin
S<=`IDLE;
end
endcase
end
end
endmodule
targetz_m.v
/*
somewhat inspired by ELMs PCI board.
I don't think this one works, though...
*/
module pci_tgt(
(*PAD="1"*)
input wire clk,
(*PAD="2,3"*)
input wire [0:1] MATCH,
(*PAD="4"*)
input wire RST,
(*PAD="5,6,7,8"*)
input wire [3:0] CBE,
(*PAD="9"*)
input wire IRDY,
(*PAD="10"*)
input wire FRAME,
(*PAD="?"*)
output wire DEVSEL,
(*PAD="23"*)
output wire TRDY,
(*PAD="?"*)
output reg RD,
(*PAD="?"*)
output wire CE,
(*PAD="AR"*)
output wire AR);
//reg S0;
//reg S1;
//reg S2;
//the state bits
reg [2:0]S;
//`define S {S0,S1,S2}
//reg [2:0]S;
//state machine states
//000 initial(to data,last,atar,busy from: busy,etar)
`define IDLE 3'b000
//101 data phase (this device) to etar,last from init,atar
`define DATA 3'b111
//010 busy(another device) from init, to init
`define BUSY 3'b100
//111 post-addr read-turnaround from init, to data,last
`define ATAR 3'b011
//110 turnaround after last cycle from busy to data
`define ETAR 3'b001
//commands:
//cbe[3::0]= 1100 read mult
`define READM 4'b1100
//cbe[3::0]= 0110 read
`define READ 4'b0110
//cbe[3::0]= 1110 read line
`define READL 4'b1110
//cbe[3::0]= 1111 write&inv
`define WRINV 4'b1111
//cbe[3::0]= 0111 write
`define WRITE 4'b0111
assign AR=~RST;
assign CE=~( (~IRDY) && (S==`DATA));
/*
those are a problem..
one ends up
on the ff, the other one with looped back comb signal
which one it is, is only controllable by putting a buffer between ff and
notif...
with the 150x it may end up wrongly feeding back the ext pin
22V10 can /only/ feed back int sig for FFs. may end up wrong
if I /want/ to fb the ext pin...
args: out,data,control
*/
notif1 (DEVSEL,S[2],((S==`DATA)||(S==`ATAR)||(S==`ETAR)));
notif1 (TRDY, S[2],((S==`DATA)||(S==`ATAR)||(S==`ETAR)));
/*
write
write:
inputs |outputs/state (15 nS later)
|
CLK FRM MATCH |TRDY DEVSEL OE REN S
1 1 | z z 1 1 IDLE
up 0 0 | 0 0 0 0 DATA
up ...DATA (0-n)
up 1 x | 1 1 1 1 ETAR
up 1 x | z z 1 1 IDLE
read
inputs |outputs/state (15 nS later)
|
CLK FRM MATCH |TRDY DEVSEL OE REN S
1 1 | z z 1 1 IDLE
up 0 0 | z z 1 1 ATAR
up x x | 0 0 0 1 DATA
up ...DATA (0-n)
up 1 x | 1 1 1 1 ETAR
up 1 x | z z 1 1 IDLE
what about fast b2b ?
*/
always @(posedge clk or posedge AR)
begin
if(AR)
begin
S <= 0;
RD <= 0;
end
else
begin
casez (S)
`IDLE:
begin
if(~FRAME)
begin
if(2'b00==MATCH[0:1])
begin
casez(CBE[3:0])
`READ:
begin
S<=`ATAR;
RD<=1;
end
`READM:
begin
// ;wait for turnaround;
S<=`ATAR;
RD<=1;
end
`READL:
begin
// ;wait for turnaround;
S<=`ATAR;
RD<=1;
end
`WRITE:
begin
// ;start data phase;
S<=`DATA;
RD<=0;
end
`WRINV:
begin
// ;start data phase;
S<=`DATA;
RD<=0;
end
default:
begin
S<=`BUSY;
RD<=0;
end
endcase
end
else
begin
// ;no match, ignore;
S<=`BUSY;
RD<=0;
end
end
else
begin
//nothing at all.. stay idle
S<=`IDLE;
RD<=0;
end
end
`BUSY:
begin
// ;busy->idle;
if (FRAME)
begin
S<=`IDLE;
end
else
begin
S<=`BUSY;
end
end
`DATA:
begin
// ;data phase;
// ;last cycle;
if (FRAME)
begin
if(IRDY)
begin
S<=`DATA;
end
else
begin
S<=`ETAR;
end
end
else
begin
S<=`DATA;
end
end
`ATAR:
begin
// ;turnaround->data(read) phase;
S<=`DATA;
end
`ETAR:
begin
// ;final turnaround -> idle;
S<=`IDLE;
end
default:
begin
S<=`IDLE;
end
endcase
end
end
endmodule
The downside of doing it like this, is that I get the mux leaking the enable into the latch input.
QX_1.L = (EN & DX_1);
QX_0.L = (EN & DX_0);
QX_2.L = (EN & DX_2);
QX_3.L = (EN & DX_3);
QX_1.LH = EN;
QX_0.LH = EN;
QX_2.LH = EN;
QX_3.LH = EN;
There was a bug in merge_parallel_masks, which I've now fixed. I've also added some notes at the top of synth2.cc to explain the strategy.
A lot of the fixes you needed were a result of adding the asserts in merge_parallel_masks. In fact a mask size of zero is valid, and is used to identify a vector that is completely undriven by the current substatement.
Although I could extract and apply the sections of your patches that are valid, I suspect that would cause you problems when you next merged. It might be easier if you regenerated your patches with just the changes needed to add support for LPM latches.
I've noticed later on and removed that assert again, replaced with an initialization for the other mask. There was also a missing bitmask init that broke the casez paths, so I've moved the bitmask init and some asserts earlier in the NetCase code. I believe the top mask has to be initialized?
ran the tests now, had to patch tgt-vvp...(also vlog95)
0001-add-latch-primitive-for-vlog95.patch.txt 0002-latch-for-vvp-output.patch.txt
the dffsynth9 and 10 still fail (maybe the runtime)
I've applied your patches that add support for LPM latches. I've also added support for latch devices in vvp.
The dffsynth9 and dffsynth10 failures are caused by your other patches. I've not applied these. As far as I can see, all your test cases now synthesise correctly, so these other changes are not needed.
As you have observed, synthesis generates redundant muxes when you use a case statement to model a latch. The simple workaround is to use an if statement instead.
Thank you. closing.
for incomplete if and case statements currently does not work. Testcase:
Here's a simple patch. http://members.aon.at/aklamme4/scratch/0001-Infer-Latches.patch