Related to #4445 and #4448.
The padding created in shiftadd increases the input to the shift operations. During techmap the shift operation will be implemented as a logarithmic shifter with all $_MUX_ cells, even if it muxes between two identical inputs (constants). This is then later optimized away. For large constant shifts a lot of padding is added, significantly inflating the number of temporary $_MUX_ cells, thus increasing the runtime and memory usage.
I think this would also solve #4445 but you may still want to add #4448 since it deals with a internal representation limit and theoretically you could still run into it with only this PR (though you would now need a signal that is millions of bits wide).
Here some numbers (script and module below):
shift by 5k bits -> 840MB 50s
shift by 20k bits -> 3.5GB 340s
module top(
input logic signed [25:0] shift,
input logic [31:0] data,
output logic [31:0] out
);
always_comb begin
out = data[shift-20000+:16];
end
endmodule
read_verilog -sv test.sv
proc
opt_expr
opt_clean
wreduce
peepopt
opt_clean
techmap
stat
opt_clean
opt_expr
stat
Related to #4445 and #4448.
The padding created in shiftadd increases the input to the shift operations. During
techmap
the shift operation will be implemented as a logarithmic shifter with all$_MUX_
cells, even if it muxes between two identical inputs (constants). This is then later optimized away. For large constant shifts a lot of padding is added, significantly inflating the number of temporary$_MUX_
cells, thus increasing the runtime and memory usage.I think this would also solve #4445 but you may still want to add #4448 since it deals with a internal representation limit and theoretically you could still run into it with only this PR (though you would now need a signal that is millions of bits wide).
Here some numbers (script and module below):