t-crest / patmos

Patmos is a time-predictable VLIW processor, and the processor for the T-CREST project
http://patmos.compute.dtu.dk
BSD 2-Clause "Simplified" License

ISA change proposal: split/deferred instructions #72

Open Emoun opened 4 years ago

Emoun commented 4 years ago

This issue tracks the discussion on changing the Patmos ISA to use either deferred or split instructions.

Motivation

Some types of instructions cannot be executed without incurring some delay or latency in the pipeline. One example is load instructions, which currently have a 1-cycle delay slot before the loaded value can be used. Another example is multiply or division instructions, which require multiple cycles to execute. Deferred/split instructions address the inefficiency of instructions with latency by letting the compiler decide how to manage that latency.

Split instructions

Split instructions "split" a given instruction into two parts: (1) issue the instruction and (2) get the result. E.g., a load could be split into issuing the load (lwc, load word from data-cache) and then putting the loaded value into a register (glw, get loaded word). The compiler can then schedule the two parts independently and try to hide the latency by issuing other instructions between them.

Example:

lwc t1 = [r1]    ; issue load of address in r1 to load-register t1
add r2 = r3, r4  ; do something else
add r2 = r2, r5  ; do something else
glw r1 = t1      ; get loaded value from load-register t1 into register r1
add r2 = r2, r1  ; use loaded value
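The scheduling effect of the example above can be sketched with a toy Python model. This is only an illustration, not a model of the real Patmos pipeline: LOAD_LATENCY, the one-instruction-per-cycle timing, and the assumption that glw stalls until the data has arrived are all hypothetical choices made for the sketch.

```python
# Toy model of split loads. Everything here is an illustrative assumption,
# not part of the Patmos ISA or its implementation.
LOAD_LATENCY = 2  # hypothetical: cycles between issuing lwc and the data arriving


def run_split(program, regs, mem):
    """Run (op, dst, a, b) tuples, one per cycle; return the stall cycles."""
    load_regs = {}  # load-register name -> (cycle the data is ready, value)
    cycle = 0
    stalls = 0
    for op, dst, a, b in program:
        if op == "lwc":
            # Issue: start the load; the result lands in a load-register.
            load_regs[dst] = (cycle + LOAD_LATENCY, mem[regs[a]])
        elif op == "glw":
            # Get: copy the load-register into a general-purpose register.
            ready, value = load_regs[a]
            if cycle < ready:
                # Assumed behaviour: wait until the data has arrived.
                stalls += ready - cycle
                cycle = ready
            regs[dst] = value
        elif op == "add":
            regs[dst] = regs[a] + regs[b]
        cycle += 1
    return stalls


initial = {"r1": 100, "r2": 0, "r3": 1, "r4": 2, "r5": 3}
memory = {100: 40}

# The sequence from the example: two adds are scheduled between lwc and glw.
scheduled = [
    ("lwc", "t1", "r1", None),
    ("add", "r2", "r3", "r4"),
    ("add", "r2", "r2", "r5"),
    ("glw", "r1", "t1", None),
    ("add", "r2", "r2", "r1"),
]
split_regs = dict(initial)
split_stalls = run_split(scheduled, split_regs, memory)

# The same load with nothing scheduled between the two halves.
tight = [
    ("lwc", "t1", "r1", None),
    ("glw", "r1", "t1", None),
]
tight_stalls = run_split(tight, dict(initial), memory)

print(split_stalls, tight_stalls, split_regs["r2"])
```

Under these assumptions the scheduled sequence never waits on glw, while the back-to-back version does, which is the latency the compiler is meant to hide.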

Deferred instructions

Deferred instructions address the same problem with a different approach. In addition to its usual operands, the instruction takes an immediate operand that specifies when the result is expected. The immediate defines after how many instruction words the result should be available in the target register. The compiler can then issue the instruction early and set the immediate to match the point in the instruction stream where it needs the result.

Example:

lwc r1 = [r1], 3 ; Issue a deferred load, with the value available to the third following instruction
add r2 = r3, r4  ; do something else
add r2 = r2, r5  ; do something else
add r2 = r2, r1  ; use loaded value

The deferral range has not been specified yet, but a suitable maximum deferral could be between 32 and 256 instruction words.
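The deferred semantics can be sketched in the same toy style. The model below is an assumption-laden illustration, not the proposed hardware behaviour: it assumes the address register is read and memory is accessed at issue time, and that the target register keeps its old value until exactly the instruction named by the immediate. The helper names (run_deferred, make_program) are hypothetical.

```python
# Toy model of deferred loads. The commit timing and the read-at-issue
# behaviour are illustrative assumptions, not part of the proposal.


def run_deferred(program, regs, mem):
    """Run (op, dst, a, b) tuples; for lwc, b is the deferral immediate."""
    pending = []  # (index of first instruction that sees the result, reg, value)
    for index, (op, dst, a, b) in enumerate(program):
        # Commit every deferred result that is due at this instruction.
        for due, reg, value in pending:
            if due <= index:
                regs[reg] = value
        pending = [p for p in pending if p[0] > index]

        if op == "lwc":
            # Assumed: address is read now, the register write happens later.
            pending.append((index + b, dst, mem[regs[a]]))
        elif op == "add":
            regs[dst] = regs[a] + regs[b]
    return regs


def make_program(deferral):
    """Build the example sequence with a given deferral immediate."""
    return [
        ("lwc", "r1", "r1", deferral),
        ("add", "r2", "r3", "r4"),
        ("add", "r2", "r2", "r5"),
        ("add", "r2", "r2", "r1"),
    ]


initial = {"r1": 100, "r2": 0, "r3": 1, "r4": 2, "r5": 3}
memory = {100: 40}

# Immediate 3: the third following instruction sees the loaded value.
deferred_ok = run_deferred(make_program(3), dict(initial), memory)

# Immediate 4: the final add still sees the old r1 (the address).
deferred_late = run_deferred(make_program(4), dict(initial), memory)

print(deferred_ok["r2"], deferred_late["r2"])
```

The second run shows why the immediate has to match the schedule: in this model, a value that is too large means the consumer reads the stale register contents rather than the loaded word.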