Closed tilk closed 4 years ago
Hello,
Small ROM memories are generally implemented in FPGAs with combinational elements such as look-up tables, because of quality of results (QoR) goals. Synchronous ROMs, or clocked ROMs, are implemented in the same way, but with flip-flops connected to either the address or the data out ports. These guidelines follow from FPGA architectures and design goals, but users can guide the tools with certain attributes when another option is really needed.
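As a rough illustration of the clocked ROM described above, here is a minimal sketch. The module name, port names and ROM contents are hypothetical and chosen only for this example; how a given tool maps it is an assumption, not something verified here.

```verilog
// Hypothetical module and contents, for illustration only.
module small_sync_rom (
    input  wire       clk,
    input  wire [1:0] addr,
    output reg  [7:0] data
);
    // Combinational lookup: a synthesis tool would typically map
    // a table this small to look-up tables.
    reg [7:0] rom_value;
    always @* begin
        case (addr)
            2'd0:    rom_value = 8'h11;
            2'd1:    rom_value = 8'h22;
            2'd2:    rom_value = 8'h44;
            default: rom_value = 8'h88;
        endcase
    end

    // Registering the data output is what makes the ROM synchronous.
    always @(posedge clk)
        data <= rom_value;
endmodule
```

Dropping the final `always @(posedge clk)` block and driving `data` directly from the lookup would give the purely combinational variant.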
Why is your ROM implemented as a chain of case statements by the Verilog frontend, and not by $memrd cells? It may be because $memrd cells are intended to be transformed to dff cells, or to a read port of a technology-specific BRAM (if the flop array meets the synthesis requirements). These cells are clocked (i.e., sequential constructs). Your ROM is a pure combinational model.

> I can prevent this from happening by using -nomem2reg, but it is said that this is dangerous.
Consider an initialisation image of zeros. Since there is no write port in the model, and the contents of the ROM are zero for every read address, what makes more sense is either to optimise that hardware out of the design (constant propagation will end up removing the circuit if it is a top-level module), or to connect the consumers of that circuit to a GND cell. In both scenarios the result is equivalent to the design intention, and it is optimised (forcing a number of LUTs/BRAMs to implement that design would consume unnecessary power and area).
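The all-zero case can be sketched as a pair of modules. Both module names are hypothetical, and the second module only shows what the first is functionally equivalent to; it is not a claim about the exact netlist any tool produces.

```verilog
// Hypothetical example: a read-only memory whose image is all zeros.
module zero_rom (
    input  wire [3:0] addr,
    output wire [7:0] data
);
    reg [7:0] mem [0:15];
    integer i;

    // Every entry is initialised to zero and there is no write port.
    initial for (i = 0; i < 16; i = i + 1) mem[i] = 8'h00;

    assign data = mem[addr];
endmodule

// Functionally equivalent circuit: the output no longer depends on addr,
// so no storage or lookup logic is needed.
module zero_rom_folded (
    input  wire [3:0] addr,
    output wire [7:0] data
);
    assign data = 8'h00;
endmodule
```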
The same optimisation applies if the initialisation data does not use certain bits of the whole address dimension. Those unused ports are optimised away, and that might be reflected in a reduced number of cells implementing the design. The nomem2reg option adds an attribute to the array that prevents these very useful optimisations.
> For some reason, if I initialize the memory with a for loop, memory cells are generated:

Unfortunately, this circuit does not seem to infer what you think. Have you tried synthesizing it instead of just reading it?

> I would very much like the second code to be interpreted as memory in DigitalJS. When it's not, I get poor performance in DigitalJS and can't use the memory inspection GUI.

I personally don't know DigitalJS, but this sounds like a problem with how that tool works, not one related to Yosys.

I am sorry if the answer is very long, and if I explained things you already know. This is just for the sake of clarity.
Thank you for your answer. I have two questions now:

1. [...] (* nomem2reg *) attribute to get memory cells for the rom module, saying it's just for the needs of simulation?
2. [...] initial block with a for loop for memory initialization. Yes, I did try to synthesize it, and yes, Yosys synthesizes it to memory cells. I do not understand why this example is treated differently by Yosys than the one with $readmemh.

Gotcha, I understand now. DigitalJS is a circuit simulator, therefore you get a memory using initial for(i = 0; i < 16; i = i + 1) mem[i] = i;. This construct is not synthesizable. I was speaking of synthesis tasks all the time, since that is the primary use of Yosys (and that is Yosys's goal: to process and generate synthesizable models).
What you see in this issue is one of those famous synthesis/simulation deltas. You expect the hardware that behaves correctly in simulation to match what you implement in synthesis, but the results do not match. Here, the design is transformed due to a synthesis optimisation opportunity. A simulator will not show that behavior. I think this is a good lesson for future digital design engineers. The DigitalJS tool does not show you this (empty module):
    === test ===
       Number of wires:                 19
       Number of wire bits:            172
       Number of public wires:          19
       Number of public wire bits:     172
       Number of memories:               0
       Number of memory bits:            0
       Number of processes:              0
       Number of cells:                  0   <--
In other words, the design with $readmemh gets optimised at a very early stage of the conversion from Verilog to the internal representation, whereas the design using the for loop does not. That's why you see different results. I would stick with the design with the for loop. If you are more comfortable with $readmem, then it is fine to use the nomem2reg parameter, but keep in mind why these things are happening.
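For reference, a minimal sketch of a $readmemh-initialised ROM with the nomem2reg attribute placed on the array. The module name and the file name rom_init.hex are placeholders invented for this example, not taken from the code in this issue.

```verilog
// Hypothetical ROM; "rom_init.hex" is a placeholder file name.
module rom_nomem2reg (
    input  wire [3:0] addr,
    output wire [7:0] data
);
    // The attribute asks Yosys not to convert this array into
    // separate registers early in the frontend.
    (* nomem2reg *)
    reg [7:0] mem [0:15];

    // Load the ROM image at elaboration time.
    initial $readmemh("rom_init.hex", mem);

    // Asynchronous (combinational) read port.
    assign data = mem[addr];
endmodule
```

Putting the attribute on the array limits its effect to this one memory, whereas the -nomem2reg option of read_verilog applies to everything read by that command.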
For instance, this is not synthesizable, yet DigitalJS shows a model of it:
    module foo (output logic [2:0] t);
        integer idx;
        // Accumulates a sum inside an initial block (simulation-only behaviour).
        initial begin
            t = 0;
            for (idx = 0; idx < 3'h7; idx++)
                t = t + idx;
        end
    endmodule
You will probably find more cases where Yosys performs synthesis transformations that are not very simulation-friendly. Limitations will remain wherever these optimisations cannot be disabled.
@dh73 I understand where you're coming from with this answer, but I respectfully disagree with your position here. Let me explain why.
If I understand correctly, DigitalJS is a simulator of, specifically, synthesizable logic. It even says this on the main button:
Therefore, although it is consuming Verilog code, and it is simulating that code, conceptually it is not like a Verilog event-driven simulator (using the simulation semantics), but a Verilog synthesizer that happens to feed the result into a netlist simulator (using the synthesis semantics) instead of a place-and-route tool.
> DigitalJS is a circuit simulator, therefore you get a memory using initial for(i = 0; i < 16; i = i + 1) mem[i] = i;. This construct is not synthesizable. I was speaking of synthesis tasks all the time, since that is the primary use of Yosys (and that is Yosys goal, process/generate synthesizable models).

This is not correct. Although IEEE 1364.1 does not mention initial for loops as an example, it does permit for loops with statically computable bounds in §7.7.6:
Both vendor tools and Yosys explicitly accept this construct for memory initialization. In fact I have recently improved the handling of this construct in #1607.
Moreover, you can see that @tilk encounters the same undesirable behavior when using $readmemh, which is definitely permitted for synthesis. If you use read_verilog -debug you can see that the problematic behavior is exactly the same here.

> Here, the design is transformed due a synthesis optimisation opportunity.

Based on a preliminary investigation, I suspect this is an artifact of how ast/simplify detects memory ports and not an optimization. I can actually imagine a few cases where this makes synthesis results worse by preventing conversion to LUTROM.
Because of all of the above, I believe this is a legitimate issue with Yosys that we should look into.
@tilk @dh73 This is actually an issue with how Yosys handles the SystemVerilog logic type. If you replace logic with reg in the last example:

    module test(input [3:0] addr, output [7:0] data);
        reg [7:0] mem[0:15];
        assign data = mem[addr];
        integer i;
        initial for(i = 0; i < 16; i = i + 1) mem[i] = i;
    endmodule

then the expected $memrd cells are produced.
Fixed in #2029.
Thank you for the detailed explanation!
I have a question about memory handling. The following SystemVerilog is interpreted by the frontend as a memory - the $memrd, $memwr and $meminit cells are generated:

But if I remove the write port, this gets converted to separate values:

I can prevent this from happening by using -nomem2reg, but it is said that this is dangerous. Why is it happening? Is there something wrong with the second code, which prevents the use of memories?

For some reason, if I initialize the memory with a for loop, memory cells are generated:

I would very much like the second code to be interpreted as memory in DigitalJS. When it's not, I get poor performance in DigitalJS and can't use the memory inspection GUI.