Open Mause opened 11 years ago
/me whistles
Looks like it's going to be based on some pseudocode for a two-pass assembler... pretty much the only documentation I could find.
At this stage I'm not sure whether or not we should include preprocessing (in the macros and includes sense) in the assembler. I can think of several reasons why and why not:
pyexpander, textassembler, etc.
.define jmp set pc,
.ORG anyway, then it might not be much work to expand that to include a limited macro engine like most assemblers. On the other hand, supporting the syntax .ORG 0x1234 as a directive to get sent to the linker while also supporting the syntax .DEFINE JMP SET PC, will almost inevitably make the assembler more complicated - and as a result, more bug-prone. We can always provide a --m4 option, after all, which would pre-process the input with M4. Alternatively, we could provide a --no-m4 option if we wanted to make preprocessing with M4 the default.
Feel free to replace all instances of 'M4' in the above text with a preprocessor of your choice if you really loathe M4.
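For a sense of scale, here is roughly what a limited built-in .DEFINE engine might look like. This is only a sketch, not Jupiter code: the `expand_defines` name, the case-insensitive matching and the "only the first token on a line gets substituted" rule are all assumptions made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: upper-case a copy of s so matching is case-insensitive.
static std::string upper(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
        return static_cast<char>(std::toupper(c));
    });
    return s;
}

// Expands ".DEFINE NAME replacement..." macros (assumed syntax).
// Only the first token of a line is ever substituted; every other
// directive (e.g. .ORG) passes through untouched for later stages.
std::vector<std::string> expand_defines(const std::vector<std::string>& lines) {
    std::map<std::string, std::string> macros;
    std::vector<std::string> out;
    for (const std::string& line : lines) {
        std::istringstream in(line);
        std::string first;
        if (!(in >> first)) {      // blank line: keep as-is
            out.push_back(line);
            continue;
        }
        std::string rest;
        std::getline(in, rest);    // remainder, including its leading space
        if (upper(first) == ".DEFINE") {
            std::istringstream def(rest);
            std::string name, body;
            def >> name;
            std::getline(def >> std::ws, body);
            macros[upper(name)] = body;
            continue;              // a definition emits no output line
        }
        auto it = macros.find(upper(first));
        out.push_back(it == macros.end() ? line : it->second + rest);
    }
    return out;
}
```

Feeding it `.define jmp set pc,` followed by `jmp loop` turns the second line into `set pc, loop`, while a `.ORG 0x1234` line is passed along unchanged. Even this small version has to decide how it coexists with real directives, which is the complexity trade-off described above.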
Whether or not we include a preprocessor in the distribution of Jupiter, the pipeline will presumably look something like this:
<text.asm> -> [preprocessor] => [lexer] => [parser] => [code generator] -> <object.o>
We need to define the interface between these stages properly, including how this will be extensible in the future and now: opcodes may be added, but we also want to support multiple variants.
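One way to pin those interfaces down would be to give each arrow in the pipeline a concrete type. The sketch below is purely illustrative: `Token`, `Instruction`, `OpcodeTable` and the toy `lex`/`parse` functions are invented for this example, and the opcode values a caller passes in are up to the variant being targeted.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stage boundaries: text -> tokens -> instructions.
struct Token {
    std::string text;
};

struct Instruction {
    std::uint16_t opcode;
    std::vector<std::string> operands;
};

// A variant is just a different mnemonic -> opcode table, passed in as data.
using OpcodeTable = std::map<std::string, std::uint16_t>;

// Lexer stage: split one line on whitespace and commas.
std::vector<Token> lex(const std::string& line) {
    std::vector<Token> tokens;
    std::string cur;
    for (char c : line) {
        if (std::isspace(static_cast<unsigned char>(c)) || c == ',') {
            if (!cur.empty()) {
                tokens.push_back(Token{cur});
                cur.clear();
            }
        } else {
            cur += c;
        }
    }
    if (!cur.empty()) tokens.push_back(Token{cur});
    return tokens;
}

// Parser stage: the first token is the mnemonic, the rest are operands.
// Mnemonics are matched exactly (case-sensitive) in this sketch.
Instruction parse(const std::vector<Token>& tokens, const OpcodeTable& table) {
    if (tokens.empty()) throw std::invalid_argument("empty line");
    auto it = table.find(tokens[0].text);
    if (it == table.end())
        throw std::invalid_argument("unknown mnemonic: " + tokens[0].text);
    Instruction ins{it->second, {}};
    for (std::size_t i = 1; i < tokens.size(); ++i)
        ins.operands.push_back(tokens[i].text);
    return ins;
}
```

Because the parser only sees an `OpcodeTable`, adding an opcode or supporting another variant means supplying different data rather than touching the stage itself. Whether that is the right seam for Jupiter is an open question.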
We need to finalise an appropriate object code format. Currently we have asteroid, but it's extremely immature and doesn't support much. We need the object code format to support pretty much everything a modern object code/executable format supports: sections, symbols, debugging information, relocatable object code, shared libraries, core dumps, etc.
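To make "sections, symbols, relocatable object code" a bit more concrete, here is a minimal in-memory model with a relocation pass. It is a sketch under assumptions, not a description of asteroid: every struct, field and function name below is hypothetical, and it covers none of the debugging, shared-library or core-dump requirements.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical in-memory model; this is NOT the asteroid format.
struct Symbol {
    std::string name;
    std::size_t section;   // index into ObjectFile::sections
    std::uint16_t offset;  // word offset within that section
};

struct Relocation {
    std::uint16_t offset;  // word to patch, relative to the section start
    std::size_t symbol;    // index into ObjectFile::symbols
};

struct Section {
    std::string name;
    std::vector<std::uint16_t> words;  // 16-bit words, as on the DCPU-16
    std::vector<Relocation> relocations;
};

struct ObjectFile {
    std::vector<Section> sections;
    std::vector<Symbol> symbols;
};

// Patches every relocation once each section has a load address.
// bases[i] is the address at which sections[i] is placed.
void relocate(ObjectFile& obj, const std::vector<std::uint16_t>& bases) {
    for (Section& sec : obj.sections) {
        for (const Relocation& r : sec.relocations) {
            const Symbol& sym = obj.symbols.at(r.symbol);
            sec.words.at(r.offset) =
                static_cast<std::uint16_t>(bases.at(sym.section) + sym.offset);
        }
    }
}
```

The point of the sketch is only that relocatable code needs the symbol table and the patch list to travel with the section data. An on-disk layout would still have to be designed on top of something like this.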
Should we support only DCPU-16 and DCPU-16 derivatives? That's the 'core mission' of the project, should we go beyond it? Should we include enough abstraction to allow any sort of target? What about assembly language variants? Should we include support for optimisation phases, or should that be left to compilers and higher-level programs?
Extensibility is good, but too much extensibility isn't. Compare git and bzr. Cloning the GNU Emacs bzr repository took HOURS when I did it the other day. In comparison it took 10 minutes to clone the GNU Emacs git repository. Sure, bzr has about 3 abstraction layers so you could completely rework the underlying layers and everything else would still work, but as a result it's really slow. In comparison git doesn't really abstract anything from the user except through the porcelain commands - an intentionally leaky abstraction. Linus chose the core architecture from day one, and he doesn't need an abstraction layer around it because it doesn't need to change in the future. The index, the object store, SHA-1 hashes, etc.: if you changed them it wouldn't be git anyway, so why abstract them away?
At the same time we want some level of extensibility, because we will want to extend it.
At the moment, if you want to add a new opcode (SET, ADD), you have to add its implementation to opcodes.hpp and opcodes.cpp. You then need to add handling code to three functions in assembler.cpp. Not really what I would call extensible.
Not to mention adding the handlers to the parser, some of which should probably be refactored into a macro.
inb4 macros are evil. IMO dynamic_cast<> is more evil than macros.
Ideally if you wanted to add a new opcode, you would only have to add it to three things:
unary_opcode or binary_opcode) in one place, with no duplication.
ooooo.
Ideally, you would be able to write "ADD=1" somewhere and have this all take care of itself.
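One macro-based way to get close to the "write it once" idea is an X-macro table. This is a sketch, not a proposal for the actual layout of opcodes.hpp: `OPCODE_LIST`, `OpcodeInfo`, `kOpcodes` and `find_opcode` are made-up names, and the numeric values are placeholders rather than real encodings.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Single source of truth: X(mnemonic, numeric value, operand count).
// The values are placeholders for illustration only.
#define OPCODE_LIST(X) \
    X(SET, 1, 2)       \
    X(ADD, 2, 2)       \
    X(JSR, 3, 1)

// Expansion 1: an enum with one enumerator per opcode.
enum class Opcode : std::uint16_t {
#define X(name, value, arity) name = value,
    OPCODE_LIST(X)
#undef X
};

// Expansion 2: a table the parser and code generator can both search.
struct OpcodeInfo {
    const char* mnemonic;
    Opcode opcode;
    int arity;  // 1 = unary, 2 = binary
};

static const OpcodeInfo kOpcodes[] = {
#define X(name, value, arity) {#name, Opcode::name, arity},
    OPCODE_LIST(X)
#undef X
};

// Returns the table entry for a mnemonic, or nullptr if it is unknown.
const OpcodeInfo* find_opcode(const char* mnemonic) {
    for (const OpcodeInfo& info : kOpcodes)
        if (std::strcmp(info.mnemonic, mnemonic) == 0) return &info;
    return nullptr;
}
```

With this shape, adding an opcode is one new `X(...)` line; the enum and the lookup table pick it up on the next build. The cost is the usual one with macros: the expansions are harder to read and debug than hand-written code.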
(just trying to get some conversation started here)
@r4d2 ?