Open LPeter1997 opened 2 years ago
Idea: extending the assembly syntax
An odd place to start, but if we want a clean, readable syntax, we need to allow a bit more than just a name + arguments in parenthesis syntax. For example, in the initial post I've laid out an example syntax for a virtual call instruction extension: virtcall my_obj.foo(arg1, arg2, ...)
. This looks way better (and is way more readable) than a function-like macro, for example virtcall(foo, my_obj, arg1, arg2, ...)
.
Extending a parser can be problematic for efficiency, but I believe that a small compromise wouldn't hurt anyone to get rid of the efficiency problem: we could allow the user to register parsers for custom instruction names. From there on they are allowed to parse from the stream.
An example API usage for the above example:
public class VirtcallInstructionParser : IInstructionParser
{
// When the user registers this extension, this is the property that tells the parser, on which instruction name to hand over parsing
public string InstructionName => "virtcall";
public IInstruction Parse(InstructionTokenStream tokens)
{
// When this is called, the instruction name is already consumed, the user only has to parse the things after that
var obj = tokens.ParseValue(); // Parses a primitive value, like my_obj
tokens.Expect('.'); // Expect a dot separator and consume
var funcName = tokens.ExpectIdentifier(); // Expect the name of the called function
tokens.Expect('('); // Open parens for call
var args = new List<Value>();
// Comma separated args
if (!tokens.Peek(')'))
{
args.Add(tokens.ParseValue());
while (tokens.Matches(',')) args.Add(tokens.ParseValue());
}
tokens.Expect(')'); // Close parens for call
return new VirtcallInstruction(obj, funcName, args);
}
}
Overloading existing instructions
It might be a good idea to allow for overloading instructions with the same verb. For example, the add
should work both for integer and floating-point numbers, the backend should deal with the proper handling or translation. This is mentioned in the original issue already.
Another interesting use would be to allow overloading for custom types. A dynamic language would suffer, if it had to generate type-dispatch code each time an operation is performed. Allowing to define custom types and overloading instructions for these would solve this issue.
Note, that I'm not advising that this should be a mechanism for operator-overloading. I believe classical operator overloading requires a lot more sophisticated things than instruction overloading and should be handled by the compiler frontend/middleware. This would simply serve the purpose of making the IR generation for dynamic languages or languages with custom primitives simpler.
Another interesting use would be to define SIMD types for the IR. I believe that IR extensions should be strong enough that implementing SIMD operations on custom-defined SIMD types should be possible.
Okay, here goes my idea (part 1 of many).
The following ideas take inspiration in LLVM IR, Cranelift IR and CIL opcodes. The proposed IR language tries to achieve the following objetives, for which some design decisions are proposed:
%0 = load i32* %a.addr, align 4
but in Yoakke IR we should strive to keep the load operation simple and loyal to the core concept, like this:
%0 = load i32*, addr(a)
store i32 -> *i32
, where the arrow symbol is explicitly typed as part of the syntax. This makes it clear what value is being stored where.load
s and store
s.add
), and the specific small step semantics are defined based on the types of the arguments and the result. The following instruction variants are all different, but they are exposed as one instruction to the IR generator:
i64 = add i32, i32
i128 = add i64, i64
u64 = add u32, u32
I think the first step is defining a syntax that can be used to quickly and effectively share and talk about Yoakke IR. This syntax will be useful for discussions between collaborators and users of Yoakke, for bug reports, for community questions... This answers one of the original questions raised in the original comment: yes, we should have a canonical textual representation for Yoakke IR, and this textual representation should extend to the extensions made by users. This means, for example, that creating a macro would require the user to give it a name and then it would be printed following the same syntax rules that built-in basic instructions use. And, in order to enforce this canonical representation and be able to benefit from it, we should implement an utility capable of emitting this canonical textual representation automatically from at least one (but ideally all) of the execution-compatible forms of Yoakke IR (e.g. in-memory data structure, bytecode...).
This is an example function in C:
int sum(int a, int b) {
return a + b;
}
and this is a summary of the same function as unoptimized LLVM IR:
define i32 @sum(i32 %a, i32 %b) #0 {
entry:
%a.addr = alloca i32, align 4
%b.addr = alloca i32, align 4
store i32 %a, i32* %a.addr, align 4
store i32 %b, i32* %b.addr, align 4
%0 = load i32* %a.addr, align 4
%1 = load i32* %b.addr, align 4
%add = add nsw i32 %0, %1
ret i32 %add
}
attributes #0 = { nounwind ssp uwtable ... }
This is the same function in the proposed Yoakke IR language (also doing the same "useless" operations to be fair and to show syntax):
attributes #0 = [ foo, bar, baz ... ]
procedure @sum(a i32, b i32) -> i32 #0 {
entry(%0 i32, %1 i32):
store a i32 -> %0 i32*
store b i32 -> %1 i32*
va i32 = load %0 i32*
vb i32 = load %1 i32*
result i32 = add va i32, vb i32 [ overflow = saturate ]
ret result i32
}
Symbols (bindings) use no symbols, like for example result
. In case a name cannot be found or is not relevant, a symbol can be created with %
followed by a number (e.g. %0
). The symbol @sum
starts with @
because it is not local.
As defined in the original comment of this issue, attributes are added using brackets like this:
procedure @sum(i32 a, i32 b) -> i32 [ foo, bar, baz ... ] {
but in case the user wishes to reuse the same attributes in more than one place, a feature called attribute sets should be implemented that works like in LLVM:
attributes #0 = [ foo, bar, baz ... ]
procedure @sum(i32 a, i32 b) -> i32 #0 { ; equivalent, #0 expands to a valid list of attributes
The signature of the different instruction variants used above look like this:
i32 = add i32, i32 ; example: result i32 = add va i32, vb i32
store i32 -> i32* ; example: store i32 a -> i32* %0
i32 = load i32* ; example: va = load i32* %0
*i32 = slot type#i32 ; example: %0 = slot i32
Following the convention of CIL opcodes, we could use a dot and a suffix on mnemonics to indicate related instructions. For example, addition with an immediate could reuse the mnemonic add
or use add.i
:
i32 = add i32, imm#i32 ; example: result i32 = add va i32, 7 i32
; OR
i32 = add.i i32, imm#i32 ; example: result i32 = add.i va i32, 7 i32
bN
represents a boolean of N bits. For example, b1
and b8
represent a boolean value stored in 1 and 8 bits, respectively. The literals for compatible values are true
and false
. The equivalence 0 = false and 1 = true should not be assumed.iN
represents signed integer values of N bits. Examples: i8
, i16
, i32
, i64
...uN
represents unsigned integer values.f32
and f64
represent floating point values.TxM
represents an array of M
elements of type T
. Length probably not stored. Examples: i64x341
, u32x4
...simd{TxM}
a SIMD vector type. Example: simd{i64x4}
T*
represents a pointer to a value of type T
. Examples: i32*
, f64*
...imm#T
represent an immediate of type T
. Example: 5
is a imm#i32
.offsetN
represents an address offset of N bits. Example: offset32
.type
represents the textual representation of one of types (instead of a value) shown above. Example: i32
is a type
.nop
No operation.T = load T*
Load from memory into register.store T -> T*
Store from register into memory.iY = add iX, iX
AdditionbY = and bX, bX
Logical ANDret
, ret T
Return from a methodjump ~> offsetN
Jump (jump = inconditional branch) to target.br.false bX ~> offsetN
Branch (branch = conditional jump) if value is false.br.true bX ~> offsetN
Branch if value is true.br.zero iX ~> offsetN
, j.zero uX offsetN
Branch if value is zero.br iX == iX ~> offsetN
, br uX == uX ~> offsetN
Branch if values are equal.br iX < iX ~> offsetN
, br uX < uX ~> offsetN
Branch if...br iX > iX ~> offsetN
, br uX > uX ~> offsetN
br iX <= iX ~> offsetN
, br uX <= uX ~> offsetN
iY = add iX, iX
, uY = add uX, uX
Addition.iY = sub iX, iX
, uY = sub uX, uX
Subtraction.iY = mul iX, iX
, uY = mul uX, uX
Multiplication.iY = div iX, iX
, uY = div uX, uX
Division.iY = rem iX, iX
, uY = rem uX, uX
Remainder.bY = and bX, bX
Logical AND.bY = or bX, bX
Logical OR.bY = xor bX, bX
Logical XOR.bY = not bX
Logical NOT.T* = slot type#T
Reserve a local slot (i.e. stack local variable in x64)T* = alloc type#T
Allocate dynamic memory (i.e. heap)free T*
Free dynamic memory (i.e. heap)U = conv T, U
Convert types by running necessary algorithms on runtime.U = transmute T, U
Reinterpret types statically.V* = getfieldptr T, U
Get field of a compound type as pointer.V = getfield T, U
Get field of a compound type as value (I don't advise it: less granular than pointer version).T = ldelement TxM uX
Get element of array by index.stelement T -> TxM uX
Set element of array by index.call procedure(T, U...)
, R = call procedure(T, U...)
Call another procedure.??
Call a native function.dbg.break
Inform a debugger that a breakpoint has been reached.I am not entirely convinced by the API shown in the example, because it requires accessing the fields here:
var cmpResult = builder.Cmp(Comparison.Less, arg, i32.NewValue(2));
var lastBb = builder.CurrentBasicBlock; // <----
I think we could take inspiration from Microsoft's System.Reflection.Emit
API and define a diferent builder for each logic level using the Builder pattern, so the Fibonacci example could be like this instead:
var assembly = new AssemblyBuilder();
var fib = new ProcedureBuilder("fib").WithParameter(i32, out var arg);
var then = new BasicBlockBuilder("then").Ret(i32.NewValue(1)).Build();
var @else = new BasicBlockBuilder("else")
.Sub(arg, i32.NewValue(1), out var arg1)
.Call(fib, new[] { arg1 })
.Sub(arg, i32.NewValue(2), out var arg2)
.Call(fib, new[] { arg2 })
.Add(fib_1, fib_2, out var result)
.Ret(result)
.Build();
var start = new BasicBlockBuilder("entry").Br(Comparison.Less, arg, i32.NewValue(2), then, @else).Build(); // not compatible with branching instructions defined above, but ignore this detail for now
fib = fib.WithEntryAt(start).Build();
var assembly = assembly.WithProcedure(fib).Build();
My initial thoughts on the proposal. Overall: very nice, mentions a lot of important aspects. Well done!
[InstructionSyntax("{Target} <- {Source}")]
public record Load(Value Target, Value Source);
@sum
snippet, but I have a couple of nitpicks. Why would we need to annotate non-locals with @
(or even, why annotate them at all)? For the linker, we could just push that into the attribute list. Why differentiate "bindings" from other locals? They look like they serve the same purpose. Attribute sets are a good idea, nice convenience feature.br.true
and br.zero
? If so, this mostly feels like a syntactic thing, allowing dots in instruction verbs.br
variants. I think this is not a nice, orthogonal way. A separate comparison instruction would be more general: it doesn't have to be used for a jump argument, it can also be just a computation result. To me this feels like it was straight ripped out of the x86 jump variants section, a bit too CISC-like.getfield
, feels redundant (and looks like you do too).ldelement
and stelement
also feel redundant, I feel like all should be solved with getfieldptr
or some common instruction for them.call
.dbg.break
reminds me that we need to think about debug symbols too.
Just to get my hands dirty (and because I couldn't really just not work on it), I did a micro-experiment of how we would model types. For a challenge, I tried to model a type, that's a signed integer and can exactly store a pointer type on the given architecture. The construction code - without any builders - looks like so:
var voidPtr = new Types.Ptr(Model.Types.Void.Instance);
var voidPtrSize = new Values.SizeOf(voidPtr);
var intPtr = new Types.Int(voidPtrSize);
Console.WriteLine(intPtr);
This prints i{size_of(*void)}
, which means "a signed integer with as many bits as size_of(*void)
returns". This makes the model a bit more complex: we can't just have an integer argument as the bits width for the integer type, we need some IConstant
instead that represents compile-time constant values.
I'm not yet sure that this complexity is worth it, but might be fine, if we can erase it at a later stage, when we receive platform-specific information. This would mean that we can do platform-independent optimizations while not resolving things like this, but as soon as we desugar things like size_of
, more extreme optimizations can take place.
Maybe this also begs for aliases, reading i{N}
or i{size_of(...)}
can become pretty noisy, so type-aliases should be considered, so we could say type i32 = i{32}
for example.
Note, that this makes type-checking a bit too strict, but that could be considered a good thing. i{size_of(*void)}
cannot be known exactly until choosing a back-end, so it is only really compatible with the exact same signature, i{size_of(*void)}
. Again, we should consider, if we want/need this complexity.
I believe that a strong attribute system could help us in avoiding a lot of noise in the core IR itself. Attributes could contain debug information, as well as things, like the result of register allocation. It could be taken to the extreme, where the user could literally do manual register-allocation, if they wish so:
v2 = add v0, v1 [register = r0]
Many optimizations could be enforced, like [tail_call]
, [force_inline]
, ...
I'm sure there would be many-many uses for the attribute system, so we should make it rich, extensible and easy to read/write in the API.
The one thing we should avoid is having the attribute system modifying the semantics of the IR. Specifying platform-specific behavior should be fine, but attributes should be useful metadata, nothing more. The IR should execute the same on its theoretical machine without the attributes.
Since the symbols module is soon going to reach a state where it's relatively usable, I believe it's time to think about the next major step: compiling/executing. Since most compilers introduce some intermediate representation to help with multiple things (ease of compilation, easier optimization, ...), I believe this is the way to go.
We should come up with our own intermediate representation that could either be executed in a VM relatively efficiently, or be compiled to backends we want (like x86, WASM, MSIL, maybe even C, ...). This representation should be:
Level of abstraction I believe that we should stick close to the CPUs of today, but abstract away just enough not to make common operations a chore.
For example, on x86 there are different instructions for floating-point and integer arithmetic. Also, adding integers larger than a register means keeping track of the carry with the
adc
instruction. I believe that if we have a very simple type-system that can uniquely identify what we want, we could have way less instructions.I believe that we should have few instructions that do elementary operations. For anything fancier, we could add intrinsics or extensions later.
An initial, small set of instructions should be sufficient:
We will definitely need a few more when planning out the language more, like typecasting (reinterpret/transumte) or getting the pointer for a member in a structure (something like
getelementptr
in LLVM).We need to be able to represent everything that can be architecture-dependent as non-hardcoded constants. For example, the size of pointers are platform-dependent, so the IR shouldn't contain hardcoded
4
or8
values, but something likesizeof(ptr)
instead.Code structure I believe that a flat hierarchy should be enforced: no nested procedures or types. We don't need many levels either:
The instructions should be in SSA form, every temporary gets a unique name and only gets assigned once. Mutables could be achieved through memory indirection (with read and write instructions), similarly to LLVM.
Types I don't think we need a very sophisticated type-system. We need to define the built-in primitive types, and allow structs and untagged unions to be defined. All operations/instructions should be typed.
Behavior To avoid UB, we have to be very explicit about defining each operation to something sensible. For example, addition could overflow, but we need define exactly, what should happen in that case. We could either let it wrap around, panic, or do whatever the platform does.
Ideally, we would be strict and simply panic, unless we somehow specify that we want a specific behavior. We could introduce attributes for example, to essentially parameterize instructions:
local_3 = add local_2, local_1 [overflow=wrap]
Attributes would also help specifying things, that are required for certain platforms. For example, a common one would be the link name and calling convention for procedures:
Extensibility for abstractions I believe it's very important to let the user extend the language with abstractions they need. For example, it would be very painful to implement a dynamically typed language with just strongly typed elements. While I believe that this is a detail that should be specified by the user, we should still help them to be able to abstract it away.
I have thought of 2 ways of extending the IR so far:
For the former, a good example would be virtual calls. Let's say, that a virtual call would consist of these elementary instructions:
When looking at the IR, this is noise. Ideally, the user would just see
virtcall my_obj.foo(arg1, arg2, ...)
. Some kind of macros or instruction templates could solve this.For the latter, an example would be generators. To convert a procedure into a generator procedure, a lot of analysis and transformation has to be done, so it'd be better, if the user would just receive the entire thing for transformation. Usage example would be:
Here, the user would receive the model for
get_123
, so they can transform it however they want.I believe these two kinds of abstractions would fit many cases and are not too hard to implement.
Extensibility for code passes I think that many domains will come with their own code-generation patterns. While having optimization passes for every single existing situation would be neat, it would also be very much infeasible. Allowing to define custom code-passes would be great.
Static Analysis tools Since we plan on the users writing their custom extensions, manipulating the IR should be as simple as possible. We should provide all the tools we can, to be able to analyze and correctly transform the code.
Here I think about things like providing a Control-Flow graph and related algorithms, custom register allocation and so on. Essentially, we should open up as much of the analysis, as possible. Otherwise, the user would have to reimplement a lot of what optimizations and other transformations do internally.
Code building Building IR code should be supported by some builder utility, that would keep track of the current basic block, and such. For instructions that result in some temporary value, it could also return an object that can be used to reference that value. For example, building a recursive Fibonacci could look something like:
This might not be an ideal API, but looks decent enough in my opinion.
Very much open questions