Hazards handling - GitHub Issues

Hi @so1uit, yes the pipeline handles these hazards.

The core handles data hazards mainly through bypassing (forwarding) in the bypass unit and stalling (interlocking). For each of the two possible source registers, the bypass unit compares the source against the destination registers of the instructions in the next two stages; if either of them writes to that register, the source operand is replaced with the value from the youngest matching write. There are a few complications with this.
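As a rough illustration of that selection (a Python sketch, not the actual RTL), the priority can be modeled as below. The names `StageWrite`, `bypass`, `mem`, and `wb` are hypothetical, and the special case for register 0 assumes the usual RISC-V convention that x0 always reads as zero.

```python
from dataclasses import dataclass


@dataclass
class StageWrite:
    """Pending register write held in a later pipeline stage (hypothetical model)."""
    valid: bool
    rd: int
    value: int


def bypass(rs: int, rf_value: int, mem: StageWrite, wb: StageWrite) -> int:
    """Pick the operand for source register rs.

    mem is the younger write (the stage right after execute) and wb the
    older one; rf_value is what the register file returned.
    """
    if rs == 0:
        return 0  # assumption: x0 is hardwired to zero and never forwarded
    if mem.valid and mem.rd == rs:
        return mem.value  # youngest matching write wins
    if wb.valid and wb.rd == rs:
        return wb.value
    return rf_value  # no pending write, register file value is current
```

The same function would be applied once per source register.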

First, there are load-use hazards. Since load data is not available in the same stage as ALU results, it cannot be forwarded from the MEM stage, so the pipeline must be stalled, inserting bubbles at the output of the EXE stage until the load value is available.
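A minimal sketch of the stall condition, assuming the consumer sits in EXE while the load is in MEM. The function and parameter names are hypothetical, and the sketch assumes both source fields of the instruction in EXE are actually used.

```python
def load_use_stall(rs1: int, rs2: int,
                   mem_is_load: bool, mem_rd: int,
                   mem_load_done: bool) -> bool:
    """Return True when the instruction in EXE must wait for a load in MEM.

    rs1/rs2 are the source registers of the instruction in EXE; mem_rd is
    the destination of the instruction in MEM.
    """
    if not mem_is_load or mem_load_done:
        return False  # nothing to wait for, normal forwarding applies
    if mem_rd == 0:
        return False  # assumption: writes to x0 never create a dependency
    return mem_rd in (rs1, rs2)
```

While this returns True, EXE holds its instruction and sends a bubble downstream.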

Second, the execute stage can stall while a write is still in a later pipeline stage. This means the updated value can drain out of the pipeline while the operand held in execute is never updated. For this reason, there is special logic to capture retiring instructions that write to the source register while the execute stage is stalled. This logic captures writes to the source register and forwards them once the stall clears.
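One way to picture the capture logic is a held operand that snoops retiring writes during the stall. This is only a behavioral sketch with hypothetical names (`HeldOperand`, `observe_retire`), not how the core necessarily structures it.

```python
class HeldOperand:
    """Operand latched by the execute stage for one source register."""

    def __init__(self, rs: int, value: int):
        self.rs = rs
        self.value = value

    def observe_retire(self, exe_stalled: bool, rd: int, data: int) -> None:
        """Call once per retiring instruction that writes register rd."""
        # Only needed during a stall: otherwise the normal bypass path
        # already supplies the value while the write is still in flight.
        if exe_stalled and rd == self.rs and rd != 0:
            self.value = data


op = HeldOperand(rs=7, value=1)      # stale value read before the stall
op.observe_retire(True, rd=7, data=42)
print(op.value)                      # 42
```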

Control hazards are a bit more complicated. I find the Program Counter Target Synchronizer to be the least intuitive single component in the processor, as it exists entirely to address edge cases in the control flow. There are five main ways the control flow may move the program counter:

Normal instruction (PC+4)
Compressed instruction (PC+2)
Branch, return, or jump (target determined in the execute stage)
Trap return (mret, sret, etc.; target is the mepc, sepc, etc. register in the CSRs, resolved in the execute stage)
Exception or interrupt (target is mtvec, stvec, etc. from the CSRs, plus an offset based on mcause/scause/etc. if applicable; exceptions commit at the end of the pipeline)
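The five sources can be thought of as one prioritized next-PC selection. The sketch below is a simplification with hypothetical names; the ordering (trap first, then an execute-stage redirect, then sequential) is an assumption based on traps committing at the end of the pipeline and therefore belonging to older instructions.

```python
from typing import Optional


def next_pc(pc: int, is_compressed: bool,
            exe_redirect: Optional[int] = None,
            trap_target: Optional[int] = None) -> int:
    """Choose the next program counter value.

    exe_redirect: target of a taken branch, jump, return, or trap return
                  resolved in execute (None if there is none this cycle).
    trap_target:  exception or interrupt handler address (None if no trap).
    """
    if trap_target is not None:
        return trap_target          # assumed highest priority
    if exe_redirect is not None:
        return exe_redirect
    return pc + (2 if is_compressed else 4)
```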

Determining if an instruction is compressed or normal is done as soon as the fetched instruction is available. If a stall happens, the size of the last instruction must be remembered as the instruction in decode moves on down the pipeline.

A branch is determined in the execute stage. Since the pipeline is relatively short, the branch prediction is very primitive; the core assumes the branch is not taken, and if the branch is taken, then the instructions fetched after the branch are squashed (invalidated). There are some complications here as well.
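The not-taken assumption and the squash can be sketched as follows. `resolve_branch` and `younger_valid` are hypothetical names, and the number of younger instructions in flight is left open.

```python
from typing import List, Optional, Tuple


def resolve_branch(taken: bool, target: int,
                   younger_valid: List[bool]) -> Tuple[List[bool], Optional[int]]:
    """Resolve a branch in execute under a predict-not-taken policy.

    younger_valid holds the valid bits of instructions fetched after the
    branch. Returns the updated valid bits and the PC redirect (or None).
    """
    if taken:
        # Misprediction: everything fetched after the branch is squashed.
        return [False] * len(younger_valid), target
    # Prediction was right: keep the fetched instructions, no redirect.
    return list(younger_valid), None
```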

If the pipeline is stalled by a fetch stall, the branch target from execute must be captured before the branch instruction moves down the pipeline. When the stall clears, the program counter is redirected to the captured target.
There is an exception, however, when the pipeline is stalled by a load-use hazard and the dependent data is used to evaluate the branch condition. In this case, the branch target must not be captured until the load-use stall has cleared, because the condition is not valid before then.
A load-use stall may coincide with other stalls, so the target is captured once the load-use stall clears and is then preserved until all stalls have cleared.
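These three rules can be modeled as a small piece of state stepped once per cycle. This is a behavioral sketch under simplifying assumptions (a single outstanding branch, only two stall sources), with hypothetical names throughout.

```python
from typing import Optional


class BranchTargetHold:
    """Holds a taken-branch target across stalls (hypothetical model)."""

    def __init__(self) -> None:
        self.held: Optional[int] = None

    def step(self, branch_taken: bool, target: int,
             fetch_stall: bool, load_use_stall: bool) -> Optional[int]:
        """Advance one cycle; return the PC redirect to apply, or None."""
        # During a load-use stall the branch condition may be computed
        # from stale data, so the target is not captured yet.
        if branch_taken and not load_use_stall and self.held is None:
            self.held = target
        # Release the target only once every stall has cleared.
        if self.held is not None and not fetch_stall and not load_use_stall:
            redirect, self.held = self.held, None
            return redirect
        return None


hold = BranchTargetHold()
print(hold.step(True, 0x80, fetch_stall=True, load_use_stall=True))    # None
print(hold.step(True, 0x80, fetch_stall=True, load_use_stall=False))   # None
print(hold.step(False, 0, fetch_stall=True, load_use_stall=False))     # None
print(hold.step(False, 0, fetch_stall=False, load_use_stall=False))    # 128
```

In the trace, the target is ignored in the first cycle, captured in the second, kept while the branch leaves execute, and applied when the fetch stall ends.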

Trap returns are very similar to branches, except that they are not susceptible to load-use hazards.

Exceptions and interrupts are similar, except that they are handled at the end of the pipeline to ensure precise timing of the exception in program order. Because of this, there are fewer complications for this type of control flow.

Peter-Herrmann / Lucid64

Hazards handling #4