tdene / adder_networks

Store of prefix tree adder HDL, diagrams, and implementation results
Apache License 2.0
6 stars 1 forks source link

Any cost rounding up to power of two? (question) #1

Open tcal-x opened 2 years ago

tcal-x commented 2 years ago

Would I expect a custom generated 13b adder for a given architecture (e.g. brent-kung) to be strictly better than using a 16b pregenerated adder from this library and hooking it up with the top 3 bits of output unconnected and the top 3 bits of each input wired to 1'b0?

tdene commented 2 years ago

If the synthesis tool does not optimize out the nodes corresponding to the top 3 bits of output, then yes, a custom generated 13b adder would be both smaller as well as faster than a 16b adder with shorted inputs and outputs.

tcal-x commented 2 years ago

Thanks @tdene , if we assume that the synth tool does do constant folding / simplification of the fixed-1'b0 inputs of a 16b imlpementation, and logic trimming of unused outputs, would we get the same topology as if we directly built the 13b adder?

What about in more extreme cases? Can we use a 32b adder implementation and expect to get 11b adder after trimming that's as good as if we directly build the 11b adder?

tdene commented 2 years ago

If the synthesis tool does correctly simplify constant logic, then yes, you would get the same topology as if directly building the 13b adder. This should apply in extreme cases as well.

The two main concerns are first, whether the synthesis tool actually behaves as desired, and second, the fact that a 13b / 11b adder that follows a given classic architecture (e.g brent-kung) is not optimal.

On the second point however, a 16b Brent-Kung is not optimal either. And sure, a 13b Brent-Kung might be slightly further away from the optimal 13b solution than a 16b Brent-Kung is from the optimal 16b solution, but in the end, this is all a problem of intelligent algorithmic exploration. Without such an algorithm, 13b Brent-Kung is about as good-enough as 16b Brent-Kung, so I would say there is not much point in concerning oneself with optimality in the absence of an optimization algorithm.

If such an optimization algorithm is being used though, I would advise going straight for the 11b / 13b adder. The space of useful 32b adders seems to be around 10^32, while the space of useful 11b adders is closer to 10^11.