doe300 / VC4C

Compiler for the VC4CL OpenCL implementation
MIT License

Add instruction scheduler #19

Open doe300 opened 6 years ago

doe300 commented 6 years ago

Not sure how to implement this, but here are a few notes on an instruction scheduler:

Goal

Reorder instructions within a basic block so that the delay introduced by certain operations is filled with useful instructions, minimizing the number of cycles spent waiting (via nop or by stalling on periphery registers).

Target features

Implementation

See also the Wikipedia.

nomaddo commented 6 years ago

Can you assign me to the issue? I am currently trying to implement list scheduling. The algorithm is as follows:

This scheduling is applied within a single basic block, not across basic blocks. The implementation is relatively simple and easy to understand.

The open problem is the heuristic we use to choose an instruction from the ready queue. We probably need to take memory latency, the register file and so on into account. This will have to be worked out by trial and error...
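For illustration, here is a minimal C++ sketch of list scheduling for a single basic block, using one possible heuristic (the longest latency-weighted path to the end of the block). `Node`, `criticalPath`, `listSchedule`, `NOP` and all latency values are hypothetical names and numbers made up for this example and are not part of VC4C. The sketch assumes that one instruction is issued per cycle, that predecessor lists contain no duplicates, and that the original instruction order is a valid topological order.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical node of a per-basic-block dependency DAG (not a VC4C type).
struct Node {
    std::vector<std::size_t> preds; // instructions that must be issued first
    unsigned latency;               // assumed cycles until the result is usable
};

constexpr int NOP = -1; // marks a cycle in which nothing could be issued

// Priority of a node: longest latency-weighted path from it to the block end.
// Relies on every predecessor having a smaller index than its successor.
std::vector<unsigned> criticalPath(const std::vector<Node>& dag) {
    std::vector<unsigned> prio(dag.size(), 0);
    for (std::size_t i = dag.size(); i-- > 0;) {
        prio[i] = std::max(prio[i], dag[i].latency);
        for (std::size_t p : dag[i].preds)
            prio[p] = std::max(prio[p], dag[p].latency + prio[i]);
    }
    return prio;
}

// Returns one entry per cycle: the index of the issued instruction or NOP.
std::vector<int> listSchedule(const std::vector<Node>& dag) {
    const std::vector<unsigned> prio = criticalPath(dag);
    std::vector<std::size_t> pending(dag.size()); // unscheduled predecessors
    std::vector<std::vector<std::size_t>> succs(dag.size());
    std::vector<unsigned> earliest(dag.size(), 0); // first cycle without stall
    std::vector<std::size_t> candidates;           // all predecessors issued
    for (std::size_t i = 0; i < dag.size(); ++i) {
        pending[i] = dag[i].preds.size();
        for (std::size_t p : dag[i].preds)
            succs[p].push_back(i);
        if (pending[i] == 0)
            candidates.push_back(i);
    }

    std::vector<int> slots;
    for (unsigned cycle = 0; !candidates.empty(); ++cycle) {
        // Heuristic: among the candidates whose operands are available in
        // this cycle, take the one with the longest remaining path.
        auto best = candidates.end();
        for (auto it = candidates.begin(); it != candidates.end(); ++it)
            if (earliest[*it] <= cycle &&
                (best == candidates.end() || prio[*it] > prio[*best]))
                best = it;
        if (best == candidates.end()) {
            slots.push_back(NOP); // nothing is ready, so this cycle is a stall
            continue;
        }
        const std::size_t n = *best;
        candidates.erase(best);
        slots.push_back(static_cast<int>(n));
        for (std::size_t s : succs[n]) {
            earliest[s] = std::max(earliest[s], cycle + dag[n].latency);
            if (--pending[s] == 0)
                candidates.push_back(s);
        }
    }
    return slots;
}
```

With a 3-cycle instruction 0, an instruction 1 that depends on it and two independent single-cycle instructions 2 and 3, this sketch issues 0, 2, 3, 1 without any nop. With only instructions 0 and 1 it has to insert two nops between them. Other heuristics can be tried by changing only the comparison inside the selection loop.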

doe300 commented 6 years ago

Preparation: create DAG (Dependency Graph)

You could re-use the vc4c::Graph class, if it suits your needs. It is already used for the basic-block dependency graph and the colored graph for register mapping. Also, the class DebugGraph provides an easy way to print the contents of the graph into a Graphviz file.
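As a rough idea of what such a dependency DAG has to record, here is a minimal C++ sketch that derives read-after-write, write-after-write and write-after-read edges for one basic block. It deliberately does not use vc4c::Graph, since its interface is not shown in this thread. `Instr`, `DependencyGraph` and `buildDependencies` are hypothetical names for this example, and instructions are reduced to the names of the locals they read and write.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified instruction (not a VC4C type).
struct Instr {
    std::string text;                // only for readability
    std::vector<std::string> reads;  // locals read by the instruction
    std::vector<std::string> writes; // locals written by the instruction
};

// deps[i] holds the indices of all instructions that must stay before i.
using DependencyGraph = std::vector<std::set<std::size_t>>;

DependencyGraph buildDependencies(const std::vector<Instr>& block) {
    DependencyGraph deps(block.size());
    std::map<std::string, std::size_t> lastWriter;
    std::map<std::string, std::vector<std::size_t>> readersSinceWrite;

    for (std::size_t i = 0; i < block.size(); ++i) {
        for (const auto& local : block[i].reads) {
            auto it = lastWriter.find(local);
            if (it != lastWriter.end())
                deps[i].insert(it->second); // read-after-write
        }
        for (const auto& local : block[i].writes) {
            auto it = lastWriter.find(local);
            if (it != lastWriter.end())
                deps[i].insert(it->second); // write-after-write
            for (std::size_t reader : readersSinceWrite[local])
                deps[i].insert(reader);     // write-after-read
        }
        // Update the bookkeeping only after all edges of i are known, so an
        // instruction that reads and writes the same local gets no self-edge.
        for (const auto& local : block[i].reads)
            readersSinceWrite[local].push_back(i);
        for (const auto& local : block[i].writes) {
            lastWriter[local] = i;
            readersSinceWrite[local].clear();
        }
    }
    return deps;
}
```

For the block `a = load x; b = a + 1; c = load y; a = c`, the last instruction depends on all three earlier ones (it overwrites `a`, which instruction 1 still reads, and it reads `c`), while the two loads have no dependencies and can be moved freely. A real implementation would also need edges for side effects such as periphery register accesses, which this sketch ignores.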

nomaddo commented 6 years ago

Question: I am wondering at which point before register allocation VC4C translates SSA to non-SSA. Come to think of it, instruction scheduling fits a non-SSA form, one that is equivalent to assembly except that registers are not yet allocated.

It may be possible to implement the scheduler for an SSA-form language, but some room for improvement may remain.

doe300 commented 6 years ago

The SSA from the intermediate language (LLVM-IR or SPIR-V) is relaxed very early (before the optimization steps are run) to resolve phi-instructions (function eliminatePhiNodes). So the internal representation of the optimization steps (up to the register-mapping) is SSA-ish (most of the instructions are SSA, but it is not guaranteed that a local is only written once).
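To make the "SSA-ish" property concrete: a scheduler cannot assume a single writer per local, so it may want to find the locals that are written more than once. The following C++ sketch does this for a simplified representation in which each instruction is just the list of locals it writes. `multiplyWrittenLocals` and this representation are hypothetical and not taken from VC4C.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Returns the locals that are written by more than one instruction, i.e. the
// locals for which the single-assignment property does not hold.
// writesPerInstruction[i] lists the locals written by instruction i
// (a hypothetical, simplified representation).
std::set<std::string> multiplyWrittenLocals(
    const std::vector<std::vector<std::string>>& writesPerInstruction) {
    std::map<std::string, std::size_t> writeCount;
    for (const auto& writes : writesPerInstruction)
        for (const auto& local : writes)
            ++writeCount[local];

    std::set<std::string> result;
    for (const auto& entry : writeCount)
        if (entry.second > 1)
            result.insert(entry.first);
    return result;
}
```

For example, if a phi-node for `%x` is resolved into one move to `%x` per predecessor block, `%x` ends up with two writers and is reported, while locals such as `%a` and `%b` that are written once are not.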

doe300 commented 6 years ago

Intermediate status update:

Some statistics (based on TestVC4C --test-emulator):