Open Mesabloo opened 2 years ago
Note on boolean instructions: there are no direct equivalent in modern machine code (x86 does not have such concept of "boolean").
notb %r0
will therefore be compiled to code similar to (on x86)
CMP $0, %rax # Compare %r0 with 0, set ZF=1 if equal
SETE %al # set %r0[56:8] to ZF
MOVZX %al, %rax # Copy %ro[56:8] to %r0, filling %ro[0:56] with zeros
The last two operations could be replaced with
CMOVZ $1, %rax # set %r0 to 1 if ZF=1
CMOVNZ $0, %rax # set %r0 to 0 if ZF=0
although performance could be different (because CMOVcc
needs to look at the flags every time, whereas only SETE
looks at them above).
As there is no notion of "compare flag (CF)" and "zero flag (ZF), the cmp
operation cannot exist in N⋆. This explains the need for boolean instructions.
On ARM, we may have these opcodes
cmp r0, #0
ite eq /* if-then-else */
moveq r0, #1 /* r0 = 1 iff r0 = 0 */
movne r0, #0 /* r0 = 0 iff r0 = 1 */
uxtb r0 /* 0-extend %r0 */
On ARM64 (AArch64) we have
cmp w0, #0
cset w0, eq
and w0, w0, K /* where K depends on the size of w0, as long as it is 1-filled (e.g. 0xFFFF) */
The idea is that the and
is used to 0-fill the register.
On MIPS, we may have
sltu $0, $0, 1
andi $0, $0, K # same as above for ARM64
Conditional moves are also a very good part of the instruction set. They avoid most of branches.
See the example above for notb
, which uses conditional moves. If they were not available, one would have to define notb
as the following:
cmp %rax, $0
jz set1
set0:
mv $0, %rax
j end_not
set1:
mv $1, %rax
end_not:
# ...
This is very heavy and makes quite a lot of jumps… And it's also not quite readable compared to the earlier version.
In N though, this control flow is a lot* heavier:
# ...
jz %r0, set1, set0
set0: ...
mv %r0, 1;
jmp end_not<...>
set1: ...
mv %r0, 0;
jmp end_not<...>
end_not: ...
# ...
If we take a look at the omitted type, one will see that the types for both set0
and set1
are exactly the same (they require the continuation in the same space, the same registers, and will return the same registers.
end_not
on the other end is the same type as for set0
with the added constraint that it must take %r0: uN
(which we may not propagate in both set0
and set1
).
So there is quite a lot of redundancy for that few instructions, which conditional moves actually solve (by integrating them within the type inference rules).
As we have every kind of branching instructions (see above), we will also need all the conditional moves for the same “branchings”.
Currently, there are too few instructions included in N*.
Here's an exhaustive list:
jmp
unconditionnally jumps to the given label;call
is a restricted form ofjmp
which accounts for a "return address"ret
jumps to the address in the continuation spacesld
loads a value from the stack into a registersst
stores the value inside a register into the stack at the given indexsalloc
allocates some space on top of the stack (similarly toalloca
in C, though the size is restricted by the type instead of being a literal integer)sfree
removes the top-most stack cellsref
allows fetching a reference (pointer) to a given stack cellld
loads a value from a memory address into a registerst
writes a value inside a register at a memory addressmv
moves a register or immediate value into a registernop
is useless and does absolutely nothing more than take space in the resulting objectHowever, there are quite a lot more useful instructions to have in order to make N⋆ usable as both a programming assembly language and a compiler backend for Zilch. Here are some examples:
add
,sub
,mul
, etc. with floating point counterparts (addf
,subf
,mulf
, etc. where thef
suffix stands for “floating”)jz
,jnz
,je
,jne
,jl
,jle
,jg
,jge
, etc. which act as simple jumps (e.g.jz %r0, then, else
jumps to thethen
label if the content of%r0
is0
, else jumps to theelse
label)and
,or
,xor
,not
, etc. Boolean logic operators may also be of great use as specific instructions, in order not to fall into traps such asnot 5 != 0
(we could havenotb 5 = 0
, orandb 1 2 = 1
, where every non-zero value is considered the truth value)Of course this list is absolutely not exhaustive (although it should be sufficient in the beginning). Many more instructions could be added (even for specific things such as SIMD), as long as they can be typed properly inside N⋆. All the above instructions will need to have types, but this can be done while adding them to the language/compiler.
Roadmap
and
: bitwise logical ANDor
: bitwise logical ORxor
: bitwise logical XORnot
: bitwise logical NOTshiftl
: bitwise logical LEFT SHIFTshiftr
: bitwise logical RIGHT SHIFTcmvz
: conditionally move if zerocmvnz
: conditionally move if not zerocmvl
: conditionally move if less thancmvle
: conditionally move if less than or equalcmvg
: conditionally move if greater thancmvge
: conditionally move if greater than or equalcmve
: conditionally move if equalcmvne
: conditionally move if not equalcjz
: jump if zerocjnz
: jump if not zerocjl
: jump if less thancjle
: jump if less than or equalcjg
: jump if greater thancjge
: jump if greater than or equalcje
: jump if equalcjne
: jump if not equaladd
: add two integerssub
: subtract one integer from anothermul
: multiply two integers togetherdiv
: divide one integer by another