YosysHQ / yosys

Yosys Open SYnthesis Suite
https://yosyshq.net/yosys/
ISC License
3.3k stars 860 forks source link

cxxrtl: Verilog frontend's handling of `$meminit` breaks a lot of assumptions #2809

Open mwkmwkmwk opened 3 years ago

mwkmwkmwk commented 3 years ago

Consider the following file:

module top(...);
input [3:0] ra;
output [15:0] rd;
reg [15:0] mem[0:15];
assign rd = mem[ra];

integer i, j;
initial begin
        for (i = 0; i < 16; i++) begin
                j = i;
                mem[j] = j * j;
        end
end
endmodule

Because of how the Verilog frontend works (cannot fold the j = i assignment into the use in the memory initialization despite it being elaboration-time constant), this sample ends up with $meminit ADDR and DATA inputs connected to non-constants:

To get this cursed netlist anywhere near usable state for cxxrtl proper (ie. get the $meminit inputs actually connected to constants), the following passes need to be run:

[NOTE: it's also not hard to make a testcase that also emits an actual switch and thus requires proc_mux, but it also triggers a completely unrelated unsoundness in verilog frontend, so uhhh...]

Further, because opt_clean refuses to operate on modules that contain processes, it is necessary to actually run the whole of proc on any module that just so happens to have a non-const $meminit input.

(this is, by the way, why synth scripts have a proc; opt_expr; opt_clean sequence before actually doing anything interesting)

In practice, to make cxxrtl work with direct Verilog frontend output, this means that cxxrtl needs to check the incoming design for $meminit instances that have non-const inputs and, if found, run proc; opt_expr; opt_clean to fix them up. This will also, by necessity, blast away all unrelated processes and render -noproc ineffective, but I see no good way around this. At most, we can limit the effects to the modules that actually contain funny $meminit cells via selection.

By the way, prepare_design() also needs to be changed into a two-phase thing because there's a possibility that funny $meminit cells (or for that matter, reg init values via processes) will only appear in the design after the hierarchy call.

whitequark commented 3 years ago

Further, because opt_clean refuses to operate on modules that contain processes

Didn't I fix that recently?

mwkmwkmwk commented 3 years ago

Further, because opt_clean refuses to operate on modules that contain processes

Didn't I fix that recently?

Nope, opt_clean in current master still just skips over modules that happen to contain processes.

It's definitely possible to fix that one (doesn't even sound hard), but you still have the rest of this bug report to deal with...

whitequark commented 3 years ago

Maybe we should just emit all init rules as code in the class constructor?

mwkmwkmwk commented 3 years ago

Maybe we should just emit all init rules as code in the class constructor?

and deal with another schedule pass so that we evaluate all cells and processes involved in this mess in the right order? sounds quite horrible

whitequark commented 3 years ago

Cursed.

mwkmwkmwk commented 3 years ago

By the way, there's a second part of the horrible here:

module top(...);
input [3:0] ra;
output [15:0] rd;
reg [15:0] mem[0:15];
assign rd = mem[ra];

integer i, j;
initial begin
        for (i = 0; i < 16; i++) begin
                j = i;
                if (j & 1)
                        mem[j] = j * j;
        end
end
endmodule

In this case, the Verilog frontend cannot even determine if the $meminit is actually "executed", so it emits a process with a switch that, in the case the $meminit is not to be executed, effectively wires all-X address and data to it. Leaving aside the fact that all other passes are also currently broken by this (because I don't handle this case in kernel/mem.cc), this would require X handling in cxxrtl if evaluated in runtime, or else a "disabled' memory write would be interpreted as initializing address 0 with value 0, perhaps overwriting some actual valid data.

whitequark commented 3 years ago

This should really all be done in the frontend... but yeah, I understand how it's not.

mwkmwkmwk commented 3 years ago

Yeah, actually dealing with this in the frontend would be my preferred resolution to this issue (and #2447 for that matter; so far my attempts to solve this by introducing a $meminit_v2 just resulted in even more pain).

mwkmwkmwk commented 3 years ago

OK, so there are 3 ways I can see of solving this:

  1. the right way: fix the Verilog frontend to not emit $meminit with non-constant inputs
  2. the "fix it with a hammer" way: detect non-const $meminit, run proc; opt_expr; opt_clean if found
  3. the compromise way: hammer opt_clean and opt_expr to work properly in the presence of processes, make some kind of const-folding pass for processes that doesn't otherwise disturb things (perhaps proc_prune with extra kick?), wire them together in a loop (because of course we need a loop) as a new opt_constfold pass, run this when iffy $meminit is detected

2 is obviously feasible, and 3 probably so; not so sure about 1

zachjs commented 3 years ago

It seems like some kind of constant propagation somewhere in simplify() or some hack extension to ProcessGenerator could work, but I worry about the potential complexity. What if the constant is assembled in pieces, i.e., addr[9:0] = j; mem[addr] = ...;? What if it isn't trivially constant, i.e., it uses some variable updated conditionally within the loop?

My knowledge here is limited, so I can't really assess the tradeoffs. Am I missing something? What do you think?

mwkmwkmwk commented 3 years ago

I think the Verilog frontend already needs to have this power, or else it's a bug anyway: whatever address or data you provide to a memory initialization could, in theory, just as well be used as a loop bound for an inner loop (consider for (j = 0; j < addr; j = j +1) — if addr is suitable as an initialization address, it's also suitable here). So we just need to ask it to try harder to obtain this value.

mwkmwkmwk commented 3 years ago

I just don't know enough about how the frontend works (yet) to understand why it performs this work for loop bounds, but doesn't perform it for $meminit.

zachjs commented 3 years ago

it performs this work for loop bounds, but doesn't perform it for $meminit

Does it work for loop bounds? Something as simple as the following yields an error: 2nd expression of procedural for-loop is not constant.

module top;
    integer x, y, z;
    initial begin
        z = 1;
        y = 10;
        for (x = 0; x < y; x = x + 1)
            z = z * 2;
    end
endmodule

This is not to say I do not wish this worked, but rather that I do not think the pieces to make this work are already in place.

mwkmwkmwk commented 3 years ago

... argh, you have a point, I was testing a slightly different loop

I guess we have a problem, then.

whitequark commented 2 years ago

(this is, by the way, why synth scripts have a proc; opt_expr; opt_clean sequence before actually doing anything interesting)

An additional problem is that adding an opt_expr; opt_clean sequence will pessimize most designs when used with the CXXRTL backend by adding feedback arcs. So I'd say this alone is not a viable solution unless you like a 50% performance hit.

(This doesn't happen with opt_clean -purge, but that blasts away many nets that would be otherwise dumped into VCD, so it's not really a solution either.)