Open dkillebrew-g opened 3 years ago
From what I gather, one algorithm is, in a nutshell, to multiply by the reciprocal. That is, instead of `x / d`, do `x * (1/d)`. Then replace `1/d` by a fixed-point value that approximates `1/d` (to a sufficient precision based on the bitwidth of `x` and of the result); let's describe this rational number as `p/q`. Because shifting is cheap, we require a power-of-two denominator (i.e., `q` is a power of two). Thus, after multiplying `x * p`, a right shift (by `log2(q)`) is performed.
There are a few more details; see p. 137 of https://www.agner.org/optimize/optimizing_assembly.pdf
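To make the idea concrete, here is a minimal Python sketch of the multiply-then-shift scheme for unsigned operands. It is not taken from the linked manual or from any compiler. The function names `reciprocal_constants` and `div_by_constant` are hypothetical, and the choice of `shift = bits + ceil(log2(d))` with `p = ceil(2**shift / d)` is one commonly used way to get enough precision, assumed here for illustration.

```python
def reciprocal_constants(d, bits):
    """Pick (p, shift) so that (x * p) >> shift == x // d for 0 <= x < 2**bits.

    Hypothetical helper: q = 2**shift is the power-of-two denominator and
    p/q approximates 1/d, rounded up so the truncating shift never undershoots.
    """
    if d <= 0 or bits <= 0:
        raise ValueError("d and bits must be positive")
    log2_d_ceil = (d - 1).bit_length()   # ceil(log2(d)) for d >= 1
    shift = bits + log2_d_ceil           # log2(q)
    p = -(-(1 << shift) // d)            # ceil(2**shift / d)
    return p, shift


def div_by_constant(x, p, shift):
    """Unsigned x / d computed as one multiply followed by one right shift."""
    return (x * p) >> shift


if __name__ == "__main__":
    p, shift = reciprocal_constants(5, 8)   # 8-bit x, divide by 5
    print(p, shift)                         # 410 11
    # Exhaustive check over every 8-bit input.
    assert all(div_by_constant(x, p, shift) == x // 5 for x in range(256))
```

Note that the product `x * p` is wider than `x` (up to `2 * bits + 1` bits with this choice of constants), so a hardware version would need a correspondingly wide multiplier or one of the refinements described in the reference above.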
For example, using the `ceil_div` function like `ceil_div(x, 4)`, we get this optimized Verilog:

but if we change the usage to `ceil_div(x, 5)` we get a divide:

Compilers for software have used various tricks to strength-reduce division by a known constant. I imagine that HW can use the same mathematical identities. Here are some articles on the subject: