Closed by jypark1257 1 month ago
Both the neural unit and the ibex_multdiv_fast module share the same 4x16-bit multipliers for their computations (the slower multiplication path is handled separately by the ibex_multdiv_slow unit). I placed the multipliers in the ex_block scope, which makes it easier to route the input and output signals to the appropriate block.
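To illustrate the sharing idea, here is a rough behavioral sketch in Python, not the repo's RTL. It assumes "4x16-bit multipliers" means four unsigned 16x16 multipliers, that a select signal muxes operands from one of two requesters, and that a 32x32 multiply is composed from four partial products. All names (`shared_multipliers`, `mul32_via_shared`, `dot4_via_shared`) are hypothetical and do not come from the code base.

```python
MASK16 = 0xFFFF

def mul16(a, b):
    """Stand-in for one hardware 16x16 -> 32-bit unsigned multiplier."""
    return (a & MASK16) * (b & MASK16)

def shared_multipliers(sel_neural, neural_ops, multdiv_ops):
    """Mux four operand pairs from one requester into the same four multipliers.

    sel_neural is a hypothetical select signal; the real design may arbitrate differently.
    """
    ops = neural_ops if sel_neural else multdiv_ops
    assert len(ops) == 4, "exactly four multipliers are modeled"
    return [mul16(a, b) for a, b in ops]

def mul32_via_shared(x, y):
    """Unsigned 32x32 -> 64-bit multiply built from four 16x16 partial products."""
    xl, xh = x & MASK16, (x >> 16) & MASK16
    yl, yh = y & MASK16, (y >> 16) & MASK16
    ll, lh, hl, hh = shared_multipliers(
        False, None, [(xl, yl), (xl, yh), (xh, yl), (xh, yh)]
    )
    # Recombine partial products at their bit weights.
    return ll + ((lh + hl) << 16) + (hh << 32)

def dot4_via_shared(a, b):
    """Four parallel 16-bit products summed, as a neural unit might use them."""
    return sum(shared_multipliers(True, list(zip(a, b)), None))
```

In this model only one requester drives the multipliers at a time, which is why keeping them one level up (in something like ex_block) makes the operand and result routing simple.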
OK, I found that ibex_multdiv_fast offloads its work to the 4x16-bit multipliers through the "ib*" ports.
Thank you.
In the paper, the key idea is to enable multiple parallel operations by reusing existing logic (Ibex's multipliers).
Looking through the RTL, however, the neural unit seems to use its own 4x16-bit multipliers rather than Ibex's multipliers (ibex_multdiv_fast or ibex_multdiv_slow).
Am I correct?