cvut / qtrvsim

RISC-V CPU simulator for education purposes
GNU General Public License v3.0

add fp32 instructions decode map #109

Closed Jingqing3948 closed 4 months ago

Jingqing3948 commented 5 months ago

Hello! I have recently been trying to implement the F extension for QtRVSim. The latest progress is in the decode phase, in F_inst_map: 32-bit floating-point instructions are now recognized correctly, but they cannot be executed yet. Sorry, this is my first attempt to contribute to an open source project, so I am not sure whether these changes are worth a pull request. If you have any suggestions for changes, corrections are welcome. Thank you very much for taking a look. [image] Here is an example of a load-fp instruction being identified successfully: [image]
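As a rough illustration of what recognizing these instructions involves, here is a minimal sketch in C++. It is not the code from this pull request and does not use the F_inst_map structure; the helper name `classify_fp32` is hypothetical. The opcode and funct values are the ones given in the RISC-V unprivileged specification for LOAD-FP, STORE-FP and OP-FP.

```cpp
#include <cstdint>
#include <string>

// Major opcodes (instruction bits 6..0) from the RISC-V unprivileged spec.
constexpr uint32_t OPCODE_LOAD_FP = 0x07;  // flw
constexpr uint32_t OPCODE_STORE_FP = 0x27; // fsw
constexpr uint32_t OPCODE_OP_FP = 0x53;    // fadd.s, fsub.s, ...

// Hypothetical helper (not part of QtRVSim): name a few single-precision
// instructions from their raw 32-bit encoding.
std::string classify_fp32(uint32_t inst) {
    const uint32_t opcode = inst & 0x7f;
    const uint32_t funct3 = (inst >> 12) & 0x7;  // width for loads/stores
    const uint32_t funct7 = (inst >> 25) & 0x7f; // operation for OP-FP

    if (opcode == OPCODE_LOAD_FP && funct3 == 2) return "flw";
    if (opcode == OPCODE_STORE_FP && funct3 == 2) return "fsw";
    if (opcode == OPCODE_OP_FP) {
        // For OP-FP, bits 14..12 hold the rounding mode, so only funct7
        // is checked here.
        if (funct7 == 0x00) return "fadd.s";
        if (funct7 == 0x04) return "fsub.s";
    }
    return "unknown";
}
```

For example, `0x00012087` encodes `flw f1, 0(x2)` and `0x003100d3` encodes `fadd.s f1, f2, f3` with rounding mode 0.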

ppisa commented 5 months ago

Thanks for the contribution. The decoding map would be useful, but I do not plan to include it until at least some basic implementation is available. There are more fundamental things to decide. If only the functional side of the floating-point instructions is considered, then it is relatively easy to implement them in the memory stage. That is the commit stage in our simple pipeline design, so there is no risk of running an operation speculatively with an unwanted permanent effect there. On the other hand, if the goal is only to execute the instructions, without demonstrating how they pass through the pipeline, you can use RARS. It has nice features, an interactive help system, etc., but it cannot demonstrate the principles of pipelined execution. We considered extending MARS (the MIPS predecessor of RARS) at the start of our QtMIPS/QtRVSim project, but analysis showed that extending MARS/RARS to demonstrate the pipeline at the intended level would require a deep re-architecture.

So I have no strong opinion or idea about whether and how to include floating-point support in QtRVSim. Ideas are welcome. Maybe implement a unit with some out-of-order execution, but then the main problem is how to fit such a subsystem into the graphical design while keeping the principles understandable.

I have an idea of how to extend the simulator to support more cores, virtual memory and other interesting options in a way that keeps it understandable. But even these steps are far off in the plan. One of our students has chosen branch prediction and its visualization as a thesis goal, and I have provided my idea of how to make it well understandable. I am waiting for the result to consider it for inclusion.

Jingqing3948 commented 5 months ago

Yes, I understand that after the CLI implementation, the GUI also needs to show the state of the 32 float registers. Anyway, I'll keep working on it and try not to affect other functions.

ppisa commented 5 months ago

The window with the floating-point register state is the easy part. The problem is whether and how to represent the floating-point path in the CPU diagram, and whether all floating-point instructions should be executed in a single cycle in the memory stage, which does not match real behavior at all. Another option is to keep the latency of the FPU functional units at a single cycle but add some pipelining and forwarding around them. This would mean introducing forwarding circuitry for the FPU path and some rules for passing data between the floating-point and general-purpose registers. All of that is doable, but it starts to complicate the design, and the attempt to represent it on the screen can be even more problematic and may confuse novices more than it helps. So an experimental QtRVSim version with FPU support is interesting to have, but bringing it to a state where it would be worth including in mainline can take a long time.

ppisa commented 4 months ago

I am opening issue #128 to point programmers making future attempts to this pull request, so that the invested effort is not lost. Thanks for this work; there is a chance that it will be reused as the basis for future enhancements.

Jingqing3948 commented 4 months ago

I have updated my instruction execution. Sorry, I made many changes. My logic is:

  1. In instruction.cpp, I added 3 new args to indicate whether a register is fp or gp. D, S, T mean fp and d, s, t mean gp.
  2. I added some new instruction flags to indicate whether an instruction uses fp registers and the FALU for float calculation. FLW and FSW don't need a new calculation method; only fadd.s and fsub.s need the IEEE 754 calculation method. So I think an instruction should use the FALU only when it uses 3 float registers (D, S, T).
  3. I modified the InterStage struct to pass information between the decode, execute and writeback stages about whether the float destination register needs to be used.
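The rule in points 1 and 2 can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the pull request: the flag names, the function `flags_from_args` and the plain-string operand descriptors are all hypothetical, and the real argument descriptions in instruction.cpp may look different.

```cpp
#include <cstdint>
#include <string>

// Hypothetical flag bits (illustrative names, not QtRVSim's real flags).
enum FpFlags : uint32_t {
    FLAG_NONE = 0,
    FLAG_FP_DEST = 1u << 0, // 'D': destination is an fp register
    FLAG_FP_SRC1 = 1u << 1, // 'S': first source is an fp register
    FLAG_FP_SRC2 = 1u << 2, // 'T': second source is an fp register
    FLAG_FALU = 1u << 3,    // route the operation through the float ALU
};

// Derive flags from operand letters: upper case D/S/T select the fp
// register file, lower case d/s/t (and anything else) leave gp selected.
uint32_t flags_from_args(const std::string &args) {
    uint32_t flags = FLAG_NONE;
    for (char c : args) {
        switch (c) {
        case 'D': flags |= FLAG_FP_DEST; break;
        case 'S': flags |= FLAG_FP_SRC1; break;
        case 'T': flags |= FLAG_FP_SRC2; break;
        default: break;
        }
    }
    // Rule from the comment above: use the FALU only when all three
    // operands are fp registers (fadd.s, fsub.s).
    const uint32_t all_fp = FLAG_FP_DEST | FLAG_FP_SRC1 | FLAG_FP_SRC2;
    if ((flags & all_fp) == all_fp) flags |= FLAG_FALU;
    return flags;
}
```

With this sketch, a descriptor such as "DST" (fadd.s) sets all three fp bits plus `FLAG_FALU`, while "Ds" (an flw-like form with an fp destination and a gp base) sets only `FLAG_FP_DEST`, so the address is still computed by the integer ALU.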

What do you think about it? This may cause problems when implementing further float instructions, for example when I want to transfer a float number from an fp register to a gp register.
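For the fp-to-gp transfer case, the F extension defines fmv.x.w and fmv.w.x, which move the raw bit pattern without any numeric conversion, so such an instruction mixes one fp and one gp operand and needs no float arithmetic. A minimal sketch of that bit-level move in C++ is shown below; the function names are hypothetical, and it assumes the host `float` is IEEE 754 binary32.

```cpp
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(uint32_t),
              "sketch assumes a 32-bit host float");

// fmv.x.w-style move (hypothetical helper): copy the raw bits of an fp
// register value into a gp register value, with no conversion.
uint32_t fp_bits_to_gp(float value) {
    uint32_t bits;
    std::memcpy(&bits, &value, sizeof bits);
    return bits;
}

// fmv.w.x-style move (hypothetical helper): the opposite direction.
float gp_bits_to_fp(uint32_t bits) {
    float value;
    std::memcpy(&value, &bits, sizeof value);
    return value;
}
```

Using `std::memcpy` avoids the undefined behavior of casting between pointer types; a numeric conversion such as fcvt.w.s would instead be an ordinary float-to-integer cast with rounding rules applied.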