Closed tekknolagi closed 1 year ago
For now, I recommend doing what GAS does: run the C preprocessor over the file, assuming that works. I've done that for an entirely custom assembler, where I could guarantee no syntax conflicts (it was for microcode), but I don't know if it'll agree with customasm.
The only problem with that is that I don't think it will like customasm's #ruledef, etc, since they are not valid C preprocessor directives.
I agree, macros (and other directives like #define and #ifdef) are the only thing this assembler needs to be perfect.
I think maybe moonheart08 is suggesting doing something like this?
#define LOADI_MACRO(r, v) 13`4 @ r`3 @ v
#ruledef
{
loadi {reg}, {val} => LOADI_MACRO(reg, val)
zero {reg} => LOADI_MACRO(reg, 0)
}
...and running the C preprocessor on the file (but I haven't tested this code).
I think the idiomatic way of doing this right now would be to just copy the expression and change what's needed by hand. Do you have a lot of code that would benefit from macros like those?
I could see some syntax like the following for the feature:
#ruledef
{
loadi {reg}, {val} => 13`4 @ reg`3 @ val
zero {reg} => resolve(loadi reg, 0)
}
...but I haven't really thought about the problems that could arise here. Currently instruction invocations are parsed and treated very differently to general expressions.
So what I meant by "The only problem with that is that I don't think it will like customasm's #ruledef, etc, since they are not valid C preprocessor directives." is that this error happens:
/tmp/foo:3:2: error: invalid preprocessing directive #ruledef
#ruledef
since #ruledef starts with a #.
I think a resolve() would look pretty cool. I took a brief stab at adding it as a function, but quickly found out that that would not work in the existing framework.
Perhaps it would be worth adding some kind of preprocessor as part of customasm, perhaps not.
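To make the "built-in preprocessor" idea concrete, here is a toy Python sketch of a #define-only pass that expands function-like macros and leaves every other line, including #ruledef, untouched. It is not how customasm or cpp works; the names preprocess and _expand are hypothetical, nested calls and multi-line definitions are not handled, and macros must be defined before use.

```python
import re

# Matches single-line, function-like definitions: #define NAME(a, b) body
DEFINE = re.compile(r"^#define\s+(\w+)\(([^)]*)\)\s+(.*)$")


def _expand(line, name, params, body):
    """Replace every NAME(args) call in line with the macro body."""
    call = re.compile(r"\b" + re.escape(name) + r"\(([^()]*)\)")

    def substitute(match):
        args = [a.strip() for a in match.group(1).split(",") if a.strip()]
        if len(args) != len(params):
            return match.group(0)  # wrong arity: leave the text alone
        if not params:
            return body
        mapping = dict(zip(params, args))
        # Substitute all parameters in one pass so that an argument is
        # never re-substituted as if it were another parameter.
        names = re.compile(r"\b(" + "|".join(map(re.escape, params)) + r")\b")
        return names.sub(lambda t: mapping[t.group(1)], body)

    return call.sub(substitute, line)


def preprocess(source):
    """Expand #define macros; pass all other lines through unchanged."""
    macros = {}  # name -> (parameter names, body text)
    out = []
    for line in source.splitlines():
        definition = DEFINE.match(line)
        if definition:
            name, params, body = definition.groups()
            names = [p.strip() for p in params.split(",") if p.strip()]
            macros[name] = (names, body)
            continue  # the definition itself is not emitted
        for name, (params, body) in macros.items():
            line = _expand(line, name, params, body)
        out.append(line)  # unknown directives such as #ruledef survive
    return "\n".join(out)
```

Fed the LOADI_MACRO example from earlier in the thread, this should turn the zero rule into zero {reg} => 13`4 @ reg`3 @ 0 while keeping the #ruledef line as-is.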
"Do you have a lot of code that would benefit from macros like those?"
Not exactly. A friend and I are working on a hobby project. Part of the request comes from wanting an easy way to support two assembly syntaxes -- think AT&T / Intel syntaxes. Another part is a general wish for a way to compose instructions into "higher-order" instructions by macro substitution. Kind of a low-tech compiler.
I run my code through the C preprocessor, cpp. After some experimentation this is the set of command line options I came up with:
cpp -x assembler-with-cpp -nostdinc -CC -undef -P test.s > test.asm
It seems to ignore #ruledef etc. I have these defines for each instruction format:
#define ri(op, r, imm) \
r`4 @ op`4 @ \
imm`8
#define i8(op, imm) \
op`4 @ 0b0000 @ \
imm`8
#define rr3(op, rd, rs) \
rd`4 @ 0b0010 @ \
op`4 @ rs`4
#define rr4(op, rd, rs) \
rd`4 @ 0b0001 @ \
op`4 @ rs`4
#define r(op, rd) \
rd`4 @ 0b0001 @ \
0b0000 @ op`4
Not ideal though. When I saw #subruledef I got excited thinking I could chain rules not just in the parameter list, but also in the body, but diving into the code, I couldn't see a way to do that. That would be much better than cpp's somewhat clunky syntax though.
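For readers less familiar with the notation in those defines: as far as I understand customasm, value`n keeps the low n bits of a value and @ concatenates fields left to right. A small Python sketch of the ri format under that assumption (the helper names sized and concat are hypothetical, not part of any tool):

```python
def sized(value, bits):
    """Keep the low `bits` bits of value, mirroring value`bits."""
    return (value & ((1 << bits) - 1), bits)


def concat(*fields):
    """Join (value, bits) fields left to right, mirroring a @ b @ c."""
    out, width = 0, 0
    for value, bits in fields:
        out = (out << bits) | value
        width += bits
    return out, width


def ri(op, r, imm):
    # Mirrors the define above: r`4 @ op`4 @ imm`8
    return concat(sized(r, 4), sized(op, 4), sized(imm, 8))


word, width = ri(op=0x3, r=0x2, imm=0x7F)
print(hex(word), width)  # 0x237f 16
```

The field order in the output follows the order of the operands around @, so the register lands in the top nibble and the immediate in the low byte.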
+1 on some kind of #define-like preprocessor directive.
If you want to know how I did something similar in casmeleon:
.inline MAKE_SIB
.with ( base : Register, index : Register, scale : Ints ) -> {
.return (scale << 6) + ( index << 3 ) + base; // scale[7:6], index[5:3], base[2:0]
}
.inline SEGMENT_PREFIX
.with ( r : Segments ) -> {
.return r;
}
.inline MAKE_RM
.with ( r : Registers, d : Ints ) -> {
.return ( 3 << 6 ) + ( r << 3 ) + d;
}
.opcode move {{ dest, mod ptr [ segm:base + index * scaled ] }}
.with ( dest : Register, mod : x86Modifiers, ptr : PtrKeyword, segm : Segments, base : Register, index : Register, scaled : Ints ) -> {
.if scaled != 1 && scaled != 2 && scaled != 4 && scaled != 8 {
.error scaled, "Only 1 | 2 | 4 | 8 allowed as scale";
}
.out [ .expr SEGMENT_PREFIX(segm), 0x8B, .expr MAKE_RM( dest, + 0b100 ), .expr MAKE_SIB(base, index, scaled) ];
}
I've finally figured a sensible way for "calling" other instructions, so in v0.11.6 I'm introducing asm blocks! It's not exactly a full-fledged "macro assembler" as per the issue title, but your original problem can now be solved like this:
#ruledef
{
loadi {reg}, {val} => 13`4 @ reg`3 @ val
zero {reg} => asm { loadi reg, 0x0 }
}
Note the asm block is just like any other expression, so you can do things like concatenation:
#ruledef
{
loadi {reg}, {val} => 13`4 @ reg`3 @ val
zero_twice {reg} => asm { loadi reg, 0x0 } @ asm { loadi reg, 0x0 }
}
You can also specify multiple instructions inside a single block, one per line:
#ruledef
{
loadi {reg}, {val} => 13`4 @ reg`3 @ val
zero_thrice {reg} => asm
{
loadi reg, 0x0
loadi reg, 0x0
loadi reg, 0x0
}
}
The asm block will resolve instructions as if you were invoking them as regular source code, so for example, it will take all #ruledefs into consideration, and it can't call #subruledefs directly.
I think the asm block should help with a lot of use cases, and you can kind of abuse it into a (weak?) macro feature.
This is very cool! Thank you!
It's nice to finally have macro support, good job!
However, I don't know whether I'm using the asm block wrong or there's a bug with subruledefs.
Your example assembles just fine, but this slightly modified version doesn't:
#subruledef REG {
r0 => 0
}
#ruledef {
loadi {reg: REG}, {val} => 13`4 @ reg`3 @ val
zero {reg: REG} => asm { loadi reg, 0x0 }
}
zero r0
Output:
error: failed to resolve instruction
--> temp.asm:8:1:
6 | zero {reg: REG} => asm { loadi reg, 0x0 }
7 | }
8 | zero r0
| ^^^^^^^
9 |
=== error: no match for instruction found
--> temp.asm:6:28:
4 | #ruledef {
5 | loadi {reg: REG}, {val} => 13`4 @ reg`3 @ val
6 | zero {reg: REG} => asm { loadi reg, 0x0 }
| ^^^^^^^^^^^^^^
7 | }
8 | zero r0
Edit: actually the fix is this:
#subruledef REG {
r0 => 0
}
#ruledef {
loadi {reg}, {val:s9} => 13`4 @ reg`3 @ val
zero {reg:REG} => asm { loadi reg, 0x0 }
}
zero r0
Seems like there is potentially a bug with the subruledef in an asm block.
Also thanks for this! I will make good use of it :-)
Yes, I also realised that removing the :REG fixes the issue, but in that case the original instruction (in this case loadi) would no longer work.
Hmm, I knew I was gonna miss something in my quick hacking-away session! I've gotta think on how I'll go about passing syntax tokens into the inner asm block, as opposed to passing evaluated arguments. This might actually turn the feature into a non-hygienic macro system, so I'll need everyone's input when it's done. Everyone's feedback is always invaluable to me!
Alright... I managed to do it. In v0.11.7, you can now have token substitutions in asm blocks! I'm not totally satisfied with the non-hygienics or the juggling that's going on in the code, but it should work! The syntax is:
#subruledef reg
{
r0 => 0xaa
}
#ruledef
{
loadi {r: reg}, {val} => r`8 @ val
zero {r: reg} => asm { loadi {r}, 0x0 }
}
zero r0
The only difference here is that the argument is enclosed in braces { } inside the asm block, to trigger token substitution. It'll be replaced with the r0 token used in the invocation site.
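A toy Python model of why the braces matter, based only on the description in this thread: without them the inner instruction sees the evaluated argument, which the reg subrule cannot match; with them it sees the original r0 token. The matcher and function names below are hypothetical and are not customasm internals.

```python
# Stand-in for: #subruledef reg { r0 => 0xaa }
REG = {"r0": 0xAA}


def match_loadi(tokens):
    """Match `loadi <reg>, <val>`; return (reg value, immediate) or None."""
    if len(tokens) != 4 or tokens[0] != "loadi" or tokens[2] != ",":
        return None
    if tokens[1] not in REG:
        return None  # the reg subrule only matches register names
    return (REG[tokens[1]], int(tokens[3], 0))


def zero_by_value(reg_token):
    # Inner block receives the evaluated argument (0xaa), not `r0`.
    value = REG[reg_token]
    return match_loadi(["loadi", hex(value), ",", "0x0"])


def zero_by_token(reg_token):
    # Inner block receives the token from the invocation site, as with {r}.
    return match_loadi(["loadi", reg_token, ",", "0x0"])


print(zero_by_value("r0"))  # None: "0xaa" is not a register name
print(zero_by_token("r0"))  # (170, 0)
```

In this model the by-value case corresponds to the "no match for instruction found" error reported above, and the by-token case to the v0.11.7 behaviour.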
Thank you for your hard work! It works perfectly now! I have a couple of suggestions:
#ruledef {
macro => asm { macro }
}
macro
For me it makes sense to use { } on the parameters that get passed to the instruction in the asm block, but currently they're only used if the parameter is part of a subruledef (not for immediates). I believe it makes more sense if all parameters use { }, both subruledefs and constants.
So, for example, here's some code that currently works fine:
#subruledef REG {
r0 => 0
}
#ruledef {
loadi {reg: REG}, {val: u8} => 13`4 @ reg`3 @ val
macro1 {reg: REG} => asm { loadi {reg}, 0x0 }
macro2 {val: u8} => asm { loadi r0, val }
}
macro1 r0
macro2 123
But I think it makes more sense this way:
#subruledef REG {
r0 => 0
}
#ruledef {
loadi {reg: REG}, {val: u8} => 13`4 @ reg`3 @ val
macro1 {reg: REG} => asm { loadi {reg}, 0x0 }
macro2 {val: u8} => asm { loadi r0, {val} } ; Notice the { } around val
}
macro1 r0
macro2 123
What do you think?
An easy way to forbid cycles is to have a maximum substitution depth, which also handles a bunch of other related issues, so it's what I'd probably go with.
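A minimal Python sketch of the maximum-substitution-depth idea, assuming a toy model in which each macro maps a mnemonic to a list of instruction lines. MAX_DEPTH, ExpansionError and expand are hypothetical names, and the limit of 16 is arbitrary.

```python
MAX_DEPTH = 16  # arbitrary limit chosen for this sketch


class ExpansionError(Exception):
    pass


def expand(line, macros, depth=0):
    """Expand a non-empty instruction line into primitive instructions.

    macros maps a mnemonic to the list of lines it expands to.
    """
    if depth > MAX_DEPTH:
        raise ExpansionError(
            f"macro expansion exceeded depth {MAX_DEPTH} at {line!r}"
        )
    mnemonic = line.split()[0]
    if mnemonic not in macros:
        return [line]  # a primitive instruction: emit as-is
    out = []
    for inner in macros[mnemonic]:
        out.extend(expand(inner, macros, depth + 1))
    return out
```

With this, a self-referential definition such as macro => asm { macro } fails with an error after a bounded number of steps, while legitimately nested macros still expand as long as they stay under the limit.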
Why can't you forbid the user to expand not yet defined macros or recursive macros? This would guarantee termination.
Recursive behavior and deep substitution have their genuine uses, so forbidding them would be a bit heavy-handed to start.
While true, when I had to make a similar decision I took the easy route, not allowing recursion of any kind, as I would end up with another full-blown Turing-complete language. If deep recursion is needed, maybe it would be better to use the assembler as a library and set it up from a real and well-maintained programming language. Of course hlorenzi is free to tackle the problem in another way in his own macro language.
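For comparison, the "no recursion of any kind" route can be checked up front, before any expansion happens. A rough Python sketch, again assuming the toy model where macros map a mnemonic to a list of instruction lines (find_cycle is a hypothetical name, not taken from casmeleon or customasm):

```python
def find_cycle(macros):
    """Return a list of mnemonics forming a cycle, or None if there is none."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    state = {name: WHITE for name in macros}
    path = []

    def visit(name):
        state[name] = GREY
        path.append(name)
        for line in macros[name]:
            callee = line.split()[0]
            if callee not in macros:
                continue  # primitive instruction, nothing to follow
            if state[callee] == GREY:
                # callee is already on the current path: recursion found
                return path[path.index(callee):] + [callee]
            if state[callee] == WHITE:
                cycle = visit(callee)
                if cycle:
                    return cycle
        path.pop()
        state[name] = BLACK
        return None

    for name in macros:
        if state[name] == WHITE:
            cycle = visit(name)
            if cycle:
                return cycle
    return None
```

Rejecting any rule set for which this returns a cycle guarantees that expansion terminates, at the cost of ruling out the deliberate recursive uses mentioned above.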
@p-rivero If we use { } for immediate arguments, would they also be token-substituted, or still passed by value? Passing by value would be more hygienic, which is what the non-enclosed version does. Do you just wish the syntax would be more consistent?
@hlorenzi Yes, it's just a syntax suggestion. I still haven't had time to test it much, but it seems to me that it works just fine being passed by value. (Also, this is just my personal opinion. If I'm the only one who thinks this then there's no reason to change the syntax.)
You know what, I don't like what I said, I hope you don't mind if I take it back.
For my use cases, it doesn't matter to me if the macros are hygienic or not because I am not maintaining thousands of lines of them. But it seems like that's important and I can totally respect that.
I like the implementation; it is better than the C preprocessor I have been using and I will certainly switch. Simple and consistent syntax is a worthy goal, but the current design already fits what I need: I can reduce duplication, I can have a set of macros for each instruction format, I can implement pseudoinstructions in terms of other instructions, and I can verify the assertions just a few times rather than all over the place.
It will clean up my code quite a lot and I am pretty excited about that, so thank you! Thank you for spending your free time making such an awesome and useful tool!
First: awesome project! This is so helpful for writing terse assemblers.
@tchebb and I are writing a ruleset for the UM (ICFP 2006) and we would like to be able to define macros, for example:
But this does not seem to be supported.