The current interface carries the register file data as part of the instruction issue, although with valid bits that are apparently allowed to be asserted several cycles later.
I'm sure this will be fine for some micro-architectures, but when forwarding is needed it may cause unnecessary stalls.
For example, on the NOEL-V (stages: fetch, decode, register access, execute, memory, exception, write-back) we currently issue FPU instructions from the execute stage (this will move to an earlier point in the future), but results from the past few integer instructions may not be available until a couple of cycles later (besides memory accesses, there is also another set of ALUs in the exception stage). So waiting for that data to be forwarded before issuing would cause a couple of stall cycles.
Instead we treat the register accesses the same way as memory reads, and pass along the data after the exception stage (providing the instruction ID to identify it). That way we can issue an FPU instruction that requires data from the integer side every cycle - it will just not see the data until some cycles later (the FPU pipeline must accept it whenever it arrives, stalling at some point if necessary).
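To make the timing argument concrete, here is a rough behavioural sketch in Python. It is not the NOEL-V implementation and not CV-X-IF signalling: `FpuModel`, `simulate`, `latency` and `max_pending` are all names and parameters made up for illustration. It assumes one issue slot per cycle, a fixed delay between issue and operand delivery, and an FPU that can hold a limited number of instructions waiting for data.

```python
from collections import deque


class FpuModel:
    """Hypothetical FPU front end that accepts instructions before their
    integer operand is known, and matches late data by instruction ID."""

    def __init__(self, max_pending):
        self.max_pending = max_pending
        self.pending = {}    # instruction ID -> mnemonic, still waiting for data
        self.completed = []  # (instruction ID, mnemonic, operand value)

    def can_issue(self):
        # The FPU only stalls the issuer when it has no room left.
        return len(self.pending) < self.max_pending

    def issue(self, instr_id, op):
        self.pending[instr_id] = op

    def deliver(self, instr_id, value):
        # The ID tells the FPU which waiting instruction the data belongs to.
        op = self.pending.pop(instr_id)
        self.completed.append((instr_id, op, value))


def simulate(instrs, latency, max_pending):
    """instrs: list of (instruction ID, mnemonic, operand value).
    The operand reaches the FPU `latency` cycles after issue (assumed fixed)."""
    fpu = FpuModel(max_pending)
    todo = deque(instrs)
    in_flight = deque()  # (due cycle, instruction ID, value), in issue order
    issue_cycles = {}
    cycle = 0
    while todo or in_flight:
        # Late register data arrives, tagged with its instruction ID.
        while in_flight and in_flight[0][0] == cycle:
            _, instr_id, value = in_flight.popleft()
            fpu.deliver(instr_id, value)
        # At most one issue per cycle, without waiting for the operand.
        if todo and fpu.can_issue():
            instr_id, op, value = todo.popleft()
            fpu.issue(instr_id, op)
            issue_cycles[instr_id] = cycle
            in_flight.append((cycle + latency, instr_id, value))
        cycle += 1
    return issue_cycles, fpu.completed
```

With `latency=3` and room for four waiting instructions, five `fcvt.s.w` instructions issue in cycles 0 to 4 back to back. Shrinking `max_pending` to 2 makes the FPU side stall the issue instead (cycles 0, 1, 3, 4, 6), which is the "stalling at some point if necessary" case.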
Note that there is no requirement that the register data must be late. The exact same interface (thanks to the supplied ID) could be used to pass data earlier (but doing it from multiple points in the pipeline would make things more complicated, obviously).
In our case we use the exact same mechanism as for memory reads, since the FPU instructions never take more than one integer argument. For CV-X-IF there will need to be several "channels" for register data (well, I suppose it would be possible to allow multiple consecutive data returns on fewer "channels" if a register number were passed along - that might be useful for smaller implementations).
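The "fewer channels plus a register number" variant could look roughly like the following Python sketch. Again this is only an illustration under assumptions: `OperandCollector` and its methods are hypothetical names, and it assumes each data return carries the instruction ID, the source register number and the value, with returns allowed in any order.

```python
class OperandCollector:
    """Hypothetical coprocessor-side collector for a single data channel
    that returns one source register per transfer."""

    def __init__(self):
        # instruction ID -> {"need": outstanding register numbers,
        #                    "got": {register number: value}}
        self.waiting = {}

    def issue(self, instr_id, source_regs):
        # A set also covers the case where rs1 and rs2 name the same register.
        self.waiting[instr_id] = {"need": set(source_regs), "got": {}}

    def deliver(self, instr_id, reg, value):
        """Accept one data return. Gives back the full operand map once
        every source register has arrived, otherwise None."""
        entry = self.waiting[instr_id]
        entry["need"].discard(reg)
        entry["got"][reg] = value
        if entry["need"]:
            return None
        del self.waiting[instr_id]
        return entry["got"]
```

For an instruction with ID 7 reading x10 and x11, `issue(7, [10, 11])` followed by `deliver(7, 11, 5)` returns `None`, and the later `deliver(7, 10, 3)` returns both operands. The cost compared to separate channels is that a two-operand instruction needs two transfers.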