Closed udif closed 1 month ago
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 94.21%. Comparing base (ddc950d) to head (3bed0dc). Report is 77 commits behind head on master.
At the moment I'm also debugging another issue. I use ultraembedded's JPEG core for tests. I can run it against the whole core, or subsets of it. One subset with <700 nodes passes almost immediately. When I moved to a different subtree with ~2000 nodes, the program dies after a few minutes with a segmentation fault. I don't know if this is a complexity issue, or a deadlock triggered by the specific netlist, because it jumps from a sub-second run, to dying after a few minutes.
I noticed while looking at #985 that it takes a long time to process the netlist, so this may be related.
$ time ./build/bin/slang-netlist ../core_jpeg/src_v/jpeg_dht*
Top level design units:
jpeg_dht
Build succeeded: 0 errors, 0 warnings
real 0m6.035s
user 0m6.025s
sys 0m0.020s
If you run with --debug you can see it spends a lot of time looking up variables that appear in the LUTs, e.g.:
NetlistVisitor.h:70: Edge decl lookup_input_i to ref lookup_input_i
NetlistVisitor.h:83: Edge ref lookup_input_i to ref y_ac_width_r
NetlistVisitor.h:70: Edge decl lookup_input_i to ref lookup_input_i
NetlistVisitor.h:83: Edge ref lookup_input_i to ref y_ac_width_r
...
netlist.cpp:227: Netlist has 1131 nodes and 54233 edges
It needs some more investigation as to why there are so many edges being created.
I just added a test (a single one, at least for the moment) to cover CombLoops. From my side I think the code is ready.
As a proof of concept this is good, but I think the analysis needs to be more conservative. It seems to me that checking the graph for the presence of cycles (without the outermost Verilog context) is not quite enough for a good combinational (combinatorial) logic loop analysis, because not every assignment belongs to combinational logic. There are also sequential and latch kinds of procedural logic.
For example, in this simple code sample there is no combinational logic at all. It follows that there are also no combinational cycles (but the PR solution reports at least one combinational loop):
module top (input clk);
reg a;
reg b;
test t1(.x(a), .y(b), .clk(clk));
test t2(.x(b), .y(a), .clk(clk));
endmodule
module test(input reg x, output reg y, input clk);
always_latch begin
if (clk)
y <= x;
end
endmodule
Build succeeded: 0 errors, 0 warnings
Nodes: 30
Actual active Nodes: 30
Detected 1 combinatorial loop:
Path length: 8
1.sv:5:13: note: variable a assigned to
test t1(.x(a), .y(b), .clk(clk));
^
1.sv:13:9: note: variable x read from
y <= x;
^
1.sv:13:4: note: variable y assigned to
y <= x;
^
1.sv:5:20: note: variable b read from
test t1(.x(a), .y(b), .clk(clk));
^
1.sv:6:13: note: variable b assigned to
test t2(.x(b), .y(a), .clk(clk));
^
1.sv:13:9: note: variable x read from
y <= x;
^
1.sv:13:4: note: variable y assigned to
y <= x;
^
1.sv:6:20: note: variable a read from
test t2(.x(b), .y(a), .clk(clk));
I think it first needs to be determined which assignments belong to combinational logic and which do not, to make the analysis more context-sensitive.
Let me know when you think this is ready to land.
- From my experience, 99% of designs do not use latches, except for clock gating cells, therefore if you encounter a latch it is more often than not a design mistake rather than intention (although with always_comb and always @* we see less and less of the plain old always @(a or b or c..) style that leads to unintentional latches).
- Even when you do use latches, I don't think you can easily analyze the latch control (by a simple algorithm) to determine whether it is a combinatorial loop or not.
From a practical point of view, since I cannot safely determine when a latch is safe, I would rather have a false positive I can waive (we'll have to add a mechanism for that) rather than skip a warning.
It's true that modern designs don't use latches, but what about D flip-flops (which are widely used in modern designs to separate combinational logic) and other sequential logic? The provided solution also warns about them.
For example, here is a simple flip-flop design without comb loops:
module top (input rst, input clk);
wire D;
wire Q;
wire Qn;
dff d1(D, clk, rst, Q, Qn);
dff d2(Q, clk, rst, D, Qn);
endmodule
module dff (
input logic D, clk, rst,
output logic Q, Qn
);
always_ff @(posedge clk, posedge rst) begin
if (rst) begin
Q <= 0;
Qn <= 1;
end else begin
Q <= D;
Qn <= ~D;
end
end
endmodule
This is simple - your example simply triggered a bug. This was not supposed to happen...
The idea was to disconnect any node below always or always_ff statements with posedge or negedge in their sensitivity list.
Apparently, as your example shows, there are bugs. I will check this.
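To make the intended behaviour concrete, here is a minimal C++ sketch of a cycle check that cuts the graph at sequential nodes. It is not the PR's code: `NodeKind`, `Graph` and `hasCombinationalCycle` are hypothetical names, and the sketch assumes each node is already tagged as combinational or sequential (latches would stay combinational here, in line with the preference above for a false positive that can be waived).

```cpp
#include <cstddef>
#include <vector>

// Hypothetical node kinds for illustration; slang-netlist's real node types differ.
enum class NodeKind { Combinational, Sequential };

struct Graph {
    std::vector<NodeKind> kind;                  // kind[n] is the kind of node n
    std::vector<std::vector<std::size_t>> succ;  // succ[n] is the fan-out of node n

    std::size_t addNode(NodeKind k) {
        kind.push_back(k);
        succ.emplace_back();
        return kind.size() - 1;
    }
    void addEdge(std::size_t from, std::size_t to) { succ[from].push_back(to); }
};

// Depth-first search with three colours (0 = unseen, 1 = on stack, 2 = done).
// Edges whose target is a sequential node are skipped, so every path is cut
// at logic that is sensitive to posedge/negedge.
inline bool dfsFindsCycle(const Graph& g, std::size_t n, std::vector<int>& colour) {
    colour[n] = 1;
    for (std::size_t m : g.succ[n]) {
        if (g.kind[m] == NodeKind::Sequential) continue;  // cut at clocked logic
        if (colour[m] == 1) return true;                  // back edge closes a cycle
        if (colour[m] == 0 && dfsFindsCycle(g, m, colour)) return true;
    }
    colour[n] = 2;
    return false;
}

inline bool hasCombinationalCycle(const Graph& g) {
    std::vector<int> colour(g.kind.size(), 0);
    for (std::size_t n = 0; n < g.kind.size(); ++n)
        if (colour[n] == 0 && dfsFindsCycle(g, n, colour)) return true;
    return false;
}
```

With the two-flop ring from the example above modelled as D -> ff1 -> Q -> ff2 -> D, this reports no cycle as long as ff1 and ff2 are tagged `Sequential`, and reports one if they are tagged `Combinational`.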
However, I have found more serious false positive issues:
module t;
wire x, y, z;
assign {z, y} = {~y, ~x};
endmodule
This will also trigger a comb loop warning. While this specific one might be solved (at the cost of complicating the netlist logic), it looks like the general problem will not be solved without a full synthesis engine. For example:
module t;
wire x, y;
assign z = ~y & y;
assign y = z;
endmodule
It remains to be seen if the scope of this feature would be useful given the current limitations (known possible false positives).
I would expect the netlist graph for assign {z, y} = {~y, ~x}; to be acyclic, so there may be a bug if that is not true.
I see your point though and agree that without full synthesis it's not possible to determine a true combinatorial loop.
The class of loops that slang-netlist can report is comparable to those reported by Verilator as UNOPTFLAT, which are still useful to know about.
A combinatorial loop is reported because each of the x and y nodes has edges to both z and y, as {z, y} is lumped into one assignment depending on y and x and is then split. slang-netlist is not smart enough to detect the bitwise assignment and separate the z = ~y and y = ~x assignments.
Maybe we should rename the option to --possible-comb-loop so that people set the correct expectations, and accept a small number of false positives.
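The difference between the lumped and the split view of assign {z, y} = {~y, ~x}; can be shown with a small C++ sketch. This is only an illustration under assumptions: `Part`, `lumpedEdges`, `splitEdges` and `hasCycle` are made-up names, signals are modelled as plain strings, and nothing here reflects how slang-netlist actually stores the assignment.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Dependency edges: driver signal -> set of signals it drives.
using Edges = std::map<std::string, std::set<std::string>>;

// One slice of a concatenation assignment: a single LHS signal and the
// signals read by the matching slice of the RHS (hypothetical model).
struct Part {
    std::string lhs;
    std::vector<std::string> rhsReads;
};

// Lumped view: every signal read anywhere on the RHS drives every LHS signal.
Edges lumpedEdges(const std::vector<Part>& parts) {
    Edges e;
    for (const Part& src : parts)
        for (const std::string& r : src.rhsReads)
            for (const Part& dst : parts) e[r].insert(dst.lhs);
    return e;
}

// Split view: each RHS slice drives only the LHS signal it lines up with.
Edges splitEdges(const std::vector<Part>& parts) {
    Edges e;
    for (const Part& p : parts)
        for (const std::string& r : p.rhsReads) e[r].insert(p.lhs);
    return e;
}

// True if `target` can be reached from `from` by following at least one edge.
bool reaches(const Edges& e, const std::string& from, const std::string& target,
             std::set<std::string>& seen) {
    auto it = e.find(from);
    if (it == e.end()) return false;
    for (const std::string& next : it->second) {
        if (next == target) return true;
        if (seen.insert(next).second && reaches(e, next, target, seen)) return true;
    }
    return false;
}

bool hasCycle(const Edges& e) {
    for (const auto& kv : e) {
        std::set<std::string> seen;
        if (reaches(e, kv.first, kv.first, seen)) return true;
    }
    return false;
}
```

For the parts {z <- y} and {y <- x}, the lumped view contains the edge y -> y and so reports a cycle, while the split view only has y -> z and x -> y and is acyclic.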
slang-netlist is not smart enough to detect the bitwise assignment and separate the z = ~y and y = ~x assignments.
Apologies - I thought I had implemented that. My aim for slang-netlist is to provide a data structure that resolves all source-level connectivity to the bit level. Handling packed vectors is important for that, so I'll try and look into it.
With regards to combinatorial loops, how about just calling the option --report-cycles and adding some description that these can potentially be combinatorial loops?
But I'm not merely reporting cycles - I'm actively ignoring netlist edges that terminate on posedge/negedge nodes.
BTW, another feature I would like to implement at a later stage is an improvement to the path reporting mode, that will also report the number of clock edges between the source and destination paths. It would require defining the clock signals, and keeping a table of clock aliases, either those directly connected, or those passing through a clock gating module (you would define the clock gating module name, and the input/output ports). Today I'm running into situations where I have a memory access bus running through several sample stages across my SoC, and having something that counts the total latency would be useful for me.
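As a rough idea of what such a latency count could look like, here is a C++ sketch that returns the minimum number of register stages on any path between two nodes. It makes strong simplifying assumptions that are not part of the proposal above: a single clock domain, no clock aliases or gating cells, and nodes already tagged as registers. `TimedGraph` and `minRegisterStages` are hypothetical names.

```cpp
#include <cstddef>
#include <deque>
#include <limits>
#include <vector>

// Hypothetical graph where each node is flagged as a clocked register or not.
struct TimedGraph {
    std::vector<bool> isRegister;
    std::vector<std::vector<std::size_t>> succ;

    std::size_t addNode(bool reg) {
        isRegister.push_back(reg);
        succ.emplace_back();
        return succ.size() - 1;
    }
    void addEdge(std::size_t from, std::size_t to) { succ[from].push_back(to); }
};

// 0-1 breadth-first search: entering a register node costs one clock edge,
// entering any other node costs nothing. Returns -1 if dst is unreachable.
long minRegisterStages(const TimedGraph& g, std::size_t src, std::size_t dst) {
    const long inf = std::numeric_limits<long>::max();
    std::vector<long> dist(g.succ.size(), inf);
    std::deque<std::size_t> work;
    dist[src] = 0;
    work.push_back(src);
    while (!work.empty()) {
        std::size_t n = work.front();
        work.pop_front();
        for (std::size_t m : g.succ[n]) {
            long cost = g.isRegister[m] ? 1 : 0;
            if (dist[n] + cost < dist[m]) {
                dist[m] = dist[n] + cost;
                // Zero-cost moves go to the front so nodes are settled in cost order.
                if (cost == 0) work.push_front(m);
                else work.push_back(m);
            }
        }
    }
    return dist[dst] == inf ? -1 : dist[dst];
}
```

For a chain src -> r1 -> c -> r2 -> dst with r1 and r2 as registers, the result is 2. Reporting the maximum or per-path latency, or handling several clocks, would need a different formulation.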
But I'm not merely reporting cycles - I'm actively ignoring netlist edges that terminate on posedge/negedge nodes.
Marking sequential edges is a generally useful feature of the tool, since it allows the netlist to be constrained to find combinatorial paths. This is how I implemented a previous similar tool.
BTW, another feature I would like to implement at a later stage is an improvement to the path reporting mode, that will also report the number of clock edges between the source and destination paths.
Nice, that would be useful.
(Btw. I've got no objections to this landing! We can always iron details out later.)
This is simple - your example simply triggered a bug.
I've fixed the bug, but I still got 2 issues:
It will take me a few more days to resolve these.
It seems the bugfix above (2c57b7d) solved all stability and corruption issues, but when running against the full core_jpeg example I've used before, I ran into #993.
At the moment, my code works great under clang-sanitize, where I've debugged it, but for some reason it fails under gcc-11, where there seems to be a memory corruption causing a segmentation violation as well as locking up the watch windows in vscode. I'm looking into this.
While I've fixed the crashes, it seems there are still other issues (lockups on some examples).
False alarm, I was just being impatient and stopped the test after a few seconds. It ended up taking a few more seconds than what I estimated.
Let me know when you think this is ready to land.
@MikePopoloski @jameshanlon I think it is ready from my side.
The following PR adds combinatorial loop detection to slang-netlist. The original algorithm can be found here: https://epubs.siam.org/doi/10.1137/0204007 (Donald B. Johnson, "Finding All the Elementary Circuits of a Directed Graph", SIAM Journal on Computing, Vol. 4, Issue 1, March 1975).
I have taken a Java implementation from https://github.com/josch/cycles_johnson_meyer , forked it to modernize the Java, then ported the code to C++ and finally modified my generic C++ code to work with @jameshanlon 's code.
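For readers unfamiliar with the algorithm, here is a compact C++ sketch of Johnson's blocked-set search. It is not the code in this PR: `ElementaryCycles` is a hypothetical class, vertices are plain indices, and the sketch deliberately leaves out the strongly-connected-component pruning of the original paper, so it finds the same cycles but without the paper's running-time bound. It also assumes the adjacency lists contain no duplicate edges.

```cpp
#include <cstddef>
#include <set>
#include <vector>

using Cycle = std::vector<std::size_t>;

class ElementaryCycles {
public:
    explicit ElementaryCycles(std::vector<std::vector<std::size_t>> succ)
        : succ_(std::move(succ)), blocked_(succ_.size(), false), blockMap_(succ_.size()) {}

    // Enumerates every elementary cycle once, rooted at its smallest vertex.
    std::vector<Cycle> find() {
        for (start_ = 0; start_ < succ_.size(); ++start_) {
            for (std::size_t v = start_; v < succ_.size(); ++v) {
                blocked_[v] = false;
                blockMap_[v].clear();
            }
            circuit(start_);
        }
        return cycles_;
    }

private:
    // Unblock u and, transitively, every vertex that was waiting on it.
    void unblock(std::size_t u) {
        blocked_[u] = false;
        std::set<std::size_t> waiting;
        waiting.swap(blockMap_[u]);
        for (std::size_t w : waiting)
            if (blocked_[w]) unblock(w);
    }

    bool circuit(std::size_t v) {
        bool found = false;
        stack_.push_back(v);
        blocked_[v] = true;
        for (std::size_t w : succ_[v]) {
            if (w < start_) continue;      // stay in the subgraph of vertices >= start_
            if (w == start_) {             // closed a cycle back to the root
                cycles_.push_back(stack_);
                found = true;
            } else if (!blocked_[w] && circuit(w)) {
                found = true;
            }
        }
        if (found) {
            unblock(v);
        } else {
            // v stays blocked until one of its successors is unblocked.
            for (std::size_t w : succ_[v])
                if (w >= start_) blockMap_[w].insert(v);
        }
        stack_.pop_back();
        return found;
    }

    std::vector<std::vector<std::size_t>> succ_;
    std::size_t start_ = 0;
    std::vector<bool> blocked_;
    std::vector<std::set<std::size_t>> blockMap_;
    Cycle stack_;
    std::vector<Cycle> cycles_;
};
```

On the graph with edges 0->1, 1->0, 1->2, 2->0 and 2->2, `find()` returns the three cycles {0, 1}, {0, 1, 2} and {2}.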
While the basic code works, there are lots of things that can be improved:
SourceLocation and SourceManager classes.
Here is a sample input and output: