Closed ISSOtm closed 3 years ago
This is actually progressing—who would've guessed.
I currently have the following left to implement (at least, that I'm aware of):
We do have a problem, though. Some features need to be axed in order for this to be possible..!
- `OPT b` and `g`, and relevant CLI flags. Those relied on modifying the lexer's algorithm at run-time, which I don't know how to do with flex, if at all possible.
- `EQUS` (`EVAL`?).
- `YY_USER_ACTION`.

Side note: if the above two `OPT`s are removed, this only leaves `OPT p`, which I believe is made redundant by `ds cnt, val`. So then `OPT` could probably be removed entirely.
How will this affect existing code? Can you give examples of what won't work afterwards?
How this affects existing code is the question I always ask myself about each feature change. We don't have any usage statistics beyond pret's disassemblies, and this doesn't even include any hacks based on them.
- `OPT b` is used in a single place to "draw" some graphic using bits. It has been suggested to shim around this by supporting two binary / "gfx" digit sets: the current default and the one presented in the documentation.
- `{sym}` is strictly equivalent to `"{sym}"`, except in macro args (because). I discuss my stance on implementing this below.

The reason why I do not want to make "naked" interpolations work like macro args is that parsing such interpolations is very complex, especially as they can be nested. Thus, it's OK to handle them inside a string, since we're already working with a blob of data, but outside of one, an interpolation is expected to span multiple tokens (like macro args).
Macro args are not handled in the lexer rules proper, since they happen beyond the concept of tokens; instead, the code responsible for filling the lexer's text buffer (`YY_INPUT`) is overridden to look for macro args, and "paste" their contents in. The complication is that flex puts a cap on the number of characters that it expects from the function, so the entire contents can't be plainly dumped.
To avoid unnecessary copies/allocations, and thus keep the lexer—one of the two most performance-critical pieces of RGBASM—efficient, additional logic is present to "spread" the pasting across buffer refills.
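To make the "spreading" concrete, here is a minimal sketch in plain C of a refill function that honours a size cap the way a `YY_INPUT` override has to. It is not the actual RGBASM code: `struct Source`, `fill_buffer`, and the `pending` field are names made up for this illustration, and only `\1` through `\9` are recognised.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical state for this sketch; none of these names come from RGBASM. */
struct Source {
    const char *text;        /* raw input, NUL-terminated */
    size_t pos;              /* index of the next unread byte of `text` */
    const char *const *args; /* macro args; args[0] is the value of \1 */
    size_t nargs;
    const char *pending;     /* rest of an arg still being pasted, or NULL */
};

/*
 * Write at most `max_size` bytes into `buf`, as a YY_INPUT-style hook would.
 * Returns the number of bytes written; 0 means end of input.
 */
size_t fill_buffer(struct Source *src, char *buf, size_t max_size)
{
    size_t n = 0;

    assert(src && buf);
    while (n < max_size) {
        if (src->pending) {
            /* Start or resume pasting an arg straight from where it lives */
            size_t len = strlen(src->pending);
            size_t room = max_size - n;
            size_t chunk = len < room ? len : room;

            memcpy(buf + n, src->pending, chunk);
            n += chunk;
            /* If the arg did not fit, remember where to resume next refill */
            src->pending = chunk < len ? src->pending + chunk : NULL;
            continue;
        }

        char c = src->text[src->pos];
        if (c == '\0')
            break;

        /* `c` is not NUL here, so reading one byte further stays in bounds */
        char d = src->text[src->pos + 1];
        if (c == '\\' && d >= '1' && d <= '9' && (size_t)(d - '1') < src->nargs) {
            src->pending = src->args[d - '1'];
            src->pos += 2;
            continue;
        }

        buf[n++] = c;
        src->pos++;
    }
    return n;
}
```

The arg's contents are never copied into an intermediate buffer: the only state carried between refills is a pointer into the arg, which is what keeps the paste cheap even when the cap is much smaller than the arg.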
The problem that "naked" symbol interpolations introduce is that they are really complicated to process. The hardest part is that they can be nested, which opens the can of worms of where to write the expanded results to, and so on. (It worked in the old lexer because the target buffer was large.)
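As a rough illustration of the "where to write the results" problem, the following C sketch expands nested interpolations recursively over a complete string. Everything in it is an assumption made for the example: the `struct Sym` table, the `expand` and `interpolate` helpers, and the 64-byte name limit are hypothetical and not taken from RGBASM.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical string symbol table standing in for EQUS symbols. */
struct Sym { const char *name; const char *value; };

static const char *lookup(const struct Sym *syms, size_t nsyms, const char *name)
{
    for (size_t i = 0; i < nsyms; i++)
        if (strcmp(syms[i].name, name) == 0)
            return syms[i].value;
    return NULL;
}

/*
 * Copy *in to out[0..cap), expanding `{name}` on the way, until `stop` is
 * reached ('}' inside braces, '\0' at top level). Returns the number of
 * bytes written, or (size_t)-1 on unknown symbol, missing '}', or overflow.
 */
static size_t expand(const char **in, char stop, char *out, size_t cap,
                     const struct Sym *syms, size_t nsyms)
{
    size_t n = 0;

    while (**in != stop) {
        if (**in == '\0')
            return (size_t)-1; /* input ended inside braces */
        if (**in == '{') {
            char name[64]; /* each nesting level needs its own scratch space */

            ++*in; /* skip the opening brace */
            size_t len = expand(in, '}', name, sizeof(name) - 1, syms, nsyms);
            if (len == (size_t)-1)
                return len;
            name[len] = '\0';
            ++*in; /* skip the closing brace */

            const char *value = lookup(syms, nsyms, name);
            if (!value || strlen(value) > cap - n)
                return (size_t)-1;
            memcpy(out + n, value, strlen(value));
            n += strlen(value);
        } else {
            if (n == cap)
                return (size_t)-1;
            out[n++] = *(*in)++;
        }
    }
    return n;
}

/* Expand a whole string into `out`; returns 0 on success, -1 on error. */
int interpolate(const char *in, char *out, size_t cap,
                const struct Sym *syms, size_t nsyms)
{
    assert(in && out && cap > 0);
    size_t n = expand(&in, '\0', out, cap - 1, syms, nsyms);

    if (n == (size_t)-1)
        return -1;
    out[n] = '\0';
    return 0;
}
```

This is only comfortable because the whole input is already in memory as one blob, as it is inside a string literal. Doing the same from a capped buffer-refill hook would mean suspending and resuming this recursion across refills, with one scratch buffer alive per nesting level.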
The only issue I see is that some auto-generation tools (particularly sdcc) generate an `.optsdcc -mgbz80` directive.
This is irrelevant, because SDCC uses a custom weird assembler as its back-end instead of RGBDS.
@ISSOtm And it also has an option to generate RGBDS assembly, which is where I found that example.
But that wouldn't be valid RGBDS syntax; where is the code generating that?
My best guess would be in `sdcc-4.0.0/sdas/asgb/gbpst.c`, and then maybe `sdcc-4.0.0/src/z80/mappings.i`.
The lexer is currently a 1 kloc (+ 800 kloc if you count `globlex`) monstrosity of a modified auto-generated file. This should really be changed back to a flex definition somehow. The modifications made to it might require a custom program skeleton (probably, I think, but eh), but that would certainly be much better than what we currently have.