Improve multiplier performance
by performing the partial product summation with a parallel Wallace tree.
See the function ReduceBitsParallel.
(Martin, I have marked this as a draft pull request because I am still cleaning up the multiplier and wanted a visible thread for the improvements.)
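
For readers unfamiliar with the technique, here is a minimal word-level sketch of Wallace-style reduction in Python. It is not the code in ReduceBitsParallel; the function names (`partial_products`, `compress_3_to_2`, `wallace_reduce`, `multiply`) and the unsigned 8-bit default are assumptions made for illustration only. The idea it shows is that every group of three rows is compressed to two in the same layer, so the number of layers grows roughly logarithmically with the number of partial products instead of linearly.

```python
def partial_products(a, b, width):
    """Return one row per bit of b: a shifted left by i if bit i is set, else 0."""
    return [(a << i) if (b >> i) & 1 else 0 for i in range(width)]


def compress_3_to_2(x, y, z):
    """Carry-save step: three rows in, a sum row and a carry row out.

    Acts like a full adder on every bit column at once, so no carry
    ripples along the word inside this step.
    """
    sum_row = x ^ y ^ z
    carry_row = ((x & y) | (x & z) | (y & z)) << 1
    return sum_row, carry_row


def wallace_reduce(rows):
    """Reduce rows layer by layer until at most two remain.

    Returns the remaining rows and the number of layers used. All
    compressors within one layer are independent of each other, which
    is what makes the reduction parallel in hardware.
    """
    rows = list(rows)
    layers = 0
    while len(rows) > 2:
        full = len(rows) - len(rows) % 3  # rows that fit into groups of three
        next_rows = []
        for i in range(0, full, 3):
            next_rows.extend(compress_3_to_2(*rows[i:i + 3]))
        next_rows.extend(rows[full:])  # leftover rows pass through unchanged
        rows = next_rows
        layers += 1
    return rows, layers


def multiply(a, b, width=8):
    """Unsigned multiply: reduce the partial products, then do one final add."""
    rows, _ = wallace_reduce(partial_products(a, b, width))
    return sum(rows)  # stands in for the final carry-propagate adder
```

With eight partial products the row count goes 8, 6, 4, 3, 2, so only the last addition needs a carry-propagate adder. A real netlist-level implementation would normally work per bit column with full and half adders rather than on whole words as this sketch does.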