YosysHQ / yosys

Yosys Open SYnthesis Suite
https://yosyshq.net/yosys/
ISC License
3.3k stars 860 forks source link

peepopt shiftadd: Only match for sufficiently small constant widths #4448

Open georgerennie opened 2 weeks ago

georgerennie commented 2 weeks ago

This addresses issue #4445. There are two issues from that, one being that casting a constant of arbitrary size to an int can overflow, and the other being that the shiftadd peephole optimization creates logic that can be very large for large shift constants.

This is a simple fix to both that just disables the optimization for constants over a certain bitwidth. I have arbitrarily chosen 24 bits here but am happy to change that or discuss a more fine-grained solution than just disabling the optimization. I suspect most user code will not have offsets outside of this range.

phsauter commented 2 weeks ago

As mentioned in the issue: Verilog-2005 specifies in 4.3.1 the following: The smallest allowed upper limit on the size of a vector (number of wires/bits) is 65536 ($2^{16}$).

So a $2^{24}$ limit is certainly a good general choice.
Since this is only an optimization and it works without it, it would also be legal to select a smaller upper limit, this would be a good idea if we think it could degrade performance (of the circuit or the synthesis process).

In this particular case I could see it degrading the synthesis process in particular during techmap.
Shift operations of the form data >> ... -c are transformed into a padded data instead. This is fine in isolation but then during techmap it seems to first build the full barrel shifter and only then optimize it using the constant input signals (padding of data). With a shift of $2^{24}/24-1$* this would create tens of millions of $_MUX_ cells and then optimize them away.
I don't know how big of an impact this has on the runtime and peak memory.

The 'nice' way to solve this would probably involve mapping shift operations first to $mux instead and then during the simple-mapping of $mux it is possible to check if A==B and if this is the case, skip creating this $_MUX_ cell and instead just connect the wires.
The simpler way is to measure the performance impact of a max-size shift, its currently running.

* something in techmap checks the expression width and limits it to 24-bit apparently, hence a smaller limit applies or we run into an error:

ERROR: Expression width 50331679 exceeds implementation limit of 16777216!