kuznia-rdzeni / coreblocks

RISC-V out-of-order core for education and research purposes
https://kuznia-rdzeni.github.io/coreblocks/
BSD 3-Clause "New" or "Revised" License

Fusing lower and upper multiplication into single operation #142

Open marek-bauer opened 1 year ago

marek-bauer commented 1 year ago

According to RISC-V documentation:

If both the high and low bits of the same product are required, then the recommended code sequence is: MULH[[S]U] rdh, rs1, rs2; MUL rdl, rs1, rs2 (source register specifiers must be in same order and rdh cannot be the same as rs1 or rs2). Microarchitectures can then fuse these into a single multiply operation instead of performing two separate multiplies.
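To make the quoted recommendation concrete, here is a minimal Python sketch of why the two instructions can share one multiply: a single double-width product contains both the word MULH[[S]U] returns and the word MUL returns. The names `XLEN`, `to_signed`, `full_product` and `mul_fused` are made up for illustration (assuming RV32, so `XLEN = 32`); this is a behavioural model, not code from mul_unit.py.

```python
XLEN = 32                      # assumption: RV32; use 64 for RV64
MASK = (1 << XLEN) - 1


def to_signed(x):
    """Interpret the low XLEN bits of x as a two's-complement integer."""
    x &= MASK
    return x - (1 << XLEN) if x >> (XLEN - 1) else x


def full_product(rs1, rs2, rs1_signed=True, rs2_signed=True):
    """Return the 2*XLEN-bit product as an unsigned bit pattern."""
    a = to_signed(rs1) if rs1_signed else rs1 & MASK
    b = to_signed(rs2) if rs2_signed else rs2 & MASK
    return (a * b) & ((1 << (2 * XLEN)) - 1)


def mul_fused(rs1, rs2, rs1_signed=True, rs2_signed=True):
    """One multiply, two results: (rdh, rdl).

    rdh is what MULH (signed, signed), MULHSU (signed, unsigned) or
    MULHU (unsigned, unsigned) would write; rdl is what MUL would write.
    """
    p = full_product(rs1, rs2, rs1_signed, rs2_signed)
    return p >> XLEN, p & MASK
```

For example, `mul_fused(0xFFFFFFFF, 2)` models the pair MULH/MUL, and `mul_fused(0xFFFFFFFF, 2, rs1_signed=False, rs2_signed=False)` models MULHU/MUL. The low word is the same in both cases; only the high word depends on signedness.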

It would be a good idea to implement this optimization in our mul_unit.py. It should be a fairly easy task, and a good one for getting familiar with Amaranth.

tilk commented 1 year ago

Integrating this functionality into the core might be nontrivial! And here is why.

If I understand the MulUnit correctly, it is already capable of wide multiplication, but it can't store both the low and high words of the same operation, as the FU interface doesn't allow this.

One might think that some support could be "hacked" into the MulUnit, for example by remembering the previous operation and not repeating the calculation when two fusible instructions enter the unit in sequence. But this is fragile, as the core is out-of-order and the fusible instructions might not be issued to the multiplication unit in the same order as they appear in the code.
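The fragility can be illustrated with a toy Python model of the "remember the previous operation" hack. Everything here is hypothetical: `NaiveFusingMul`, `count_multiplies` and the example issue orders are invented for this sketch, operands are compared by value for simplicity, and only the unsigned pair MULHU/MUL is modelled. It is not how the coreblocks MulUnit or its FU interface work.

```python
XLEN = 32                      # assumption: RV32
MASK = (1 << XLEN) - 1


class NaiveFusingMul:
    """Toy multiplier that remembers the low word of the last MULHU."""

    def __init__(self):
        self.saved = None      # (a, b, low_word) from the previous MULHU
        self.multiplies = 0    # number of real multiplications performed

    def issue(self, op, a, b):
        # Fused path: a MUL directly following a MULHU on the same operands.
        if op == "mul" and self.saved is not None and self.saved[:2] == (a, b):
            low = self.saved[2]
            self.saved = None
            return low
        # Slow path: perform the multiplication.
        self.multiplies += 1
        product = (a & MASK) * (b & MASK)
        if op == "mulhu":
            self.saved = (a, b, product & MASK)
            return product >> XLEN
        self.saved = None      # any other operation evicts the saved word
        return product & MASK


def count_multiplies(issue_order):
    """Run a sequence of (op, a, b) tuples and count real multiplications."""
    unit = NaiveFusingMul()
    for op, a, b in issue_order:
        unit.issue(op, a, b)
    return unit.multiplies


# Same three instructions, issued in program order and in a reordered way.
in_order = [("mulhu", 7, 9), ("mul", 7, 9), ("mul", 3, 5)]
reordered = [("mulhu", 7, 9), ("mul", 3, 5), ("mul", 7, 9)]
```

In this model `count_multiplies(in_order)` saves one multiplication, while `count_multiplies(reordered)` saves none, because an unrelated multiply slipped in between the fusible pair. The results stay correct here, but the optimization silently disappears whenever the scheduler separates the two instructions.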