Closed Elarnon closed 5 years ago
Code generation currently works in two steps: first, we generate a loop tree from the CSP solution; then, we walk over this loop tree, printing code on the fly. This keeps the design conceptually simple, but it has two notable issues. It leads to a high level of code duplication between backends, because the printing structure is not always exactly identical. It also makes it painful to materialise additional code, such as that required for convolutions or predicated loads.
To solve those issues, we propose to introduce a low-level IR that sits on top of the backends' printed code, adding an extra step to the code generation process. The loop tree is first converted to this new IR, which contains lowered instructions and implementation details. For instance, the IR should be explicit about loop indices and induction variables; in fact, the resulting code should be mostly independent from the high-level Telamon concepts. The IR is then converted to actual backend code.
This patch is a first step in that direction. It introduces a (very) bare-bones version of the concept in the `codegen::llir` module (low-level IR, in contrast to the existing higher-level `ir` module), which is meant to provide a more traditional IR defined for fixed candidates. Currently, the `llir` module defines abstract types to represent registers (named variables which can be written to) and instruction operands; registers and operands are passed as arguments to the `InstPrinter` methods instead of raw strings. Finally, it provides a `ScalarOrVector` type to represent vectors of either registers or operands.
Introducing these types decouples some parts of the code, notably:
* The `NameMap` no longer needs to know about the target's literal formatting in order to generate strings; instead, it generates registers or operands as appropriate. This makes it possible to completely remove the `get_const_float` and `get_const_int` methods from the `ValuePrinter` trait, passing that responsibility on to the `InstPrinter` when it sees the corresponding operands.
* In a similar vein, a trip through the `InstPrinter` is no longer needed for vectorization: the `NameMap` can directly return a vector of registers (or operands), which is then given to the `InstPrinter` and printed through the regular printing path.
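
To make the shape of these types concrete, here is a minimal sketch of what register, operand and `ScalarOrVector` types could look like. It is not the code from the patch: the field names, the variants, and the `lanes`, `map` and `vector_registers` helpers are all assumptions made for illustration. The scalar/vector split in `ScalarOrVector` is only inferred from the type's name.

```rust
/// Hypothetical stand-in for `llir::Register`: a named variable that can be written to.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Register<'a> {
    pub name: &'a str,
}

/// Hypothetical stand-in for an instruction operand: a register or a literal.
/// Literals stay unformatted here; the printer decides how to render them.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Operand<'a> {
    Register(Register<'a>),
    IntLiteral(i64),
    FloatLiteral(f64),
}

/// Either a single value or a vector of values (assumed shape).
#[derive(Debug, Clone, PartialEq)]
pub enum ScalarOrVector<T> {
    Scalar(T),
    Vector(Vec<T>),
}

impl<T> ScalarOrVector<T> {
    /// Number of lanes: 1 for a scalar, the vector length otherwise.
    pub fn lanes(&self) -> usize {
        match self {
            ScalarOrVector::Scalar(_) => 1,
            ScalarOrVector::Vector(values) => values.len(),
        }
    }

    /// Applies `f` to every lane, preserving the scalar/vector shape.
    pub fn map<U>(self, mut f: impl FnMut(T) -> U) -> ScalarOrVector<U> {
        match self {
            ScalarOrVector::Scalar(value) => ScalarOrVector::Scalar(f(value)),
            ScalarOrVector::Vector(values) => {
                ScalarOrVector::Vector(values.into_iter().map(f).collect())
            }
        }
    }
}

/// What a `NameMap`-like lookup could hand back for a vectorized access:
/// typed registers rather than pre-formatted strings (hypothetical helper).
pub fn vector_registers<'a>(names: &[&'a str]) -> ScalarOrVector<Register<'a>> {
    ScalarOrVector::Vector(names.iter().map(|&name| Register { name }).collect())
}
```

With types along these lines, a caller can turn a vector of registers into a vector of operands with `vector_registers(&["x0", "x1"]).map(Operand::Register)` and pass the result to the printer, without any target-specific string formatting on the way.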
To ease the transition, this patch also makes some additional changes; notably:
* The `ValuePrinter` is renamed to `NameGenerator`, since its only responsibility is now to generate names for variables and parameters. The `NameGenerator` currently still goes through a trait and is implemented separately by each backend; however, the restructuring should make it easy to use a unified `NameGenerator` type in a subsequent patch. This is made possible by the indirection through the `llir::Register` type for printing, which allows backends that need it to add (type-based) prefixes or suffixes at printing time.
* The `NameMap` now uses an arena instead of raw strings. This allows returning long-lived references to register names, which greatly simplifies usage of the `NameMap` by eliminating most of the mutable lifetime conflicts.
* The printing of registers and operands goes through the newly introduced `PTXDisplay` and `C99Display` traits, which are similar in spirit to the `fmt::Display` trait but format values according to PTX (resp. C99) syntax. They are used by the `InstPrinter` implementations.
* Finally, in preparation for future patches, the helper methods in the `InstPrinter` trait have been extracted to helper structures. This is a first step towards re-unifying most of the actual code generation between the different backends, to end up with a single list of (yet-to-be-introduced) `llir::Instruction`s to be given to the backend instead.
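
As an illustration of the display-trait idea, the sketch below shows one way per-target formatting traits "similar in spirit to `fmt::Display`" could be structured. The trait names come from this patch, but everything else is an assumption: the method names `fmt_c99` and `fmt_ptx`, the `AsC99` and `AsPtx` adapters, the simplified `Operand` type, and the choice to prefix PTX registers with `%`. The PTX float output uses the `0f` prefix followed by the eight hexadecimal digits of the IEEE 754 single-precision bit pattern.

```rust
use std::fmt;

/// Simplified, hypothetical operand type used only for this sketch.
#[derive(Debug, Clone, Copy)]
pub enum Operand<'a> {
    Register(&'a str),
    IntLiteral(i64),
    FloatLiteral(f32),
}

/// Formats a value using C99 syntax (method name is an assumption).
pub trait C99Display {
    fn fmt_c99(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result;
}

/// Formats a value using PTX syntax (method name is an assumption).
pub trait PTXDisplay {
    fn fmt_ptx(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result;
}

/// Hypothetical adapters so the traits plug into `format!` and `write!`.
pub struct AsC99<'a, T>(pub &'a T);
pub struct AsPtx<'a, T>(pub &'a T);

impl<T: C99Display> fmt::Display for AsC99<'_, T> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        self.0.fmt_c99(f)
    }
}

impl<T: PTXDisplay> fmt::Display for AsPtx<'_, T> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        self.0.fmt_ptx(f)
    }
}

impl C99Display for Operand<'_> {
    fn fmt_c99(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match *self {
            Operand::Register(name) => write!(f, "{}", name),
            Operand::IntLiteral(value) => write!(f, "{}", value),
            // Debug formatting keeps the decimal point ("1.0"); the `f`
            // suffix marks a C float literal. NaN and infinities are not
            // handled in this sketch.
            Operand::FloatLiteral(value) => write!(f, "{:?}f", value),
        }
    }
}

impl PTXDisplay for Operand<'_> {
    fn fmt_ptx(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match *self {
            // The `%` prefix is an illustrative choice of a prefix added
            // at printing time, not necessarily what the backend does.
            Operand::Register(name) => write!(f, "%{}", name),
            Operand::IntLiteral(value) => write!(f, "{}", value),
            // Exact single-precision literal: `0f` + 8 hex digits.
            Operand::FloatLiteral(value) => write!(f, "0f{:08X}", value.to_bits()),
        }
    }
}
```

The same operand can then be rendered for either target, for example `format!("{}", AsC99(&op))` or `format!("{}", AsPtx(&op))`, which keeps literal formatting out of the name-generation code.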
Conceptually, this patch doesn't do much except change various type representations. As such, the reader is encouraged to first look through the `codegen::llir` module to read about the new types being introduced; the changes to `codegen::name_map` and `codegen::printer` should then be fairly straightforward (but pay attention to `vector_operand` and `vector_inst`, which have been moved out of the `InstPrinter`). Finally, the changes to the backend printers should also be mostly straightforward, merely updating to the new API; the only real change is the introduction of the `C99Display` and `PTXDisplay` traits used to display registers and operands.