Closed nww02 closed 4 years ago
This feature sounds easy, but raises many questions that should be answered during the design phase:
I went through similar questions many times when designing the compiler.
Hmm.. Yes, relative addressing would be tricky, especially with self-modifying code.. hmmm ok.. *sips extra strong coffee for the brain boost*
I guess in a simple implementation, to prevent the code breaking, the RELATIVE locations of Functions would need to be preserved, as you say, just in case someone does a "jr" between them, but that poses the problems you describe.
I guess the simplest method is to throw the problem at the coder and say "Don't use self-modifying code, or relative offsets into a .function block. If you attempt to do relative addressing into or out of a function block, it might not work unless you're clever. A function must be self contained. The compiler will let you do it, but your mileage may vary. same as ld (00),0 .. it's syntactically valid... just unwise ;-)"...
I suppose a low optimisation solution might be to include the whole file if a function within it is used. But that defeats some of the purpose of the .function block. It'd let people have a large library, but might include sections of it unnecessarily unless every function is in its own file... Could still be useful, sorta.
Slightly harder might be to check all jr, and ld (nn), or ld ,(nn)'s as you're going through and, if any of their targets fall within a function block, to throw a warning that a relative address is crossing a function boundary.
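To make the warning idea concrete, here is a rough Python sketch of the check. It is not how the compiler works; the names `crossing_warnings`, `function_ranges` and `references` are made up for illustration, and it assumes the assembler already knows the address range of each function block and the source and target address of each jr / ld (nn).

```python
def crossing_warnings(function_ranges, references):
    """List references whose source and target lie in different regions.

    function_ranges: dict name -> (start, end) address range, end exclusive
                     (hypothetical structure, assumed for this sketch).
    references: iterable of (source_address, target_address, mnemonic).
    """
    def region(address):
        # Return the name of the function block containing the address.
        for name, (start, end) in function_ranges.items():
            if start <= address < end:
                return name
        return None  # outside every function block

    warnings = []
    for source, target, mnemonic in references:
        src, dst = region(source), region(target)
        if src != dst:
            warnings.append(
                f"{mnemonic} at {source:#06x} -> {target:#06x} crosses a "
                f"function boundary ({src or 'main code'} -> {dst or 'main code'})"
            )
    return warnings


ranges = {"add_b_c": (0x8010, 0x8020)}
refs = [
    (0x8000, 0x8012, "jr"),  # main code into the block: warn
    (0x8012, 0x8018, "jr"),  # stays inside the block: fine
    (0x8014, 0x8030, "ld"),  # block out to main code: warn
]
for line in crossing_warnings(ranges, refs):
    print(line)
```

A plain call to a function's entry label would not be fed into this check, since that is the intended way into a block.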
Possibly slightly more elegant might be to replace any targeted memory locations with internal references, do the .function pass, then go through and see where all the references ended up being relocated to, and then change the placeholders accordingly, same as with "$". For jr's that'd mean re-inserting the functions back at the same point in the code so that they don't wander too far away, I suppose.
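A minimal sketch of that placeholder fix-up for jr, in Python for illustration only. The helper names and data shapes (`resolve_jr`, a list of placeholders, a symbol table) are assumptions, not the compiler's actual design. The one hard fact used is that a Z80 jr displacement is a signed byte measured from the address after the two-byte instruction, which is why functions must not wander too far.

```python
def jr_displacement(instruction_address, target_address):
    """Return the signed 8-bit displacement for a Z80 `jr` instruction.

    The displacement is relative to the address following the two-byte
    instruction, so it equals target - (instruction + 2).
    """
    displacement = target_address - (instruction_address + 2)
    if not -128 <= displacement <= 127:
        raise ValueError(
            f"jr at {instruction_address:#06x} cannot reach "
            f"{target_address:#06x} (displacement {displacement})"
        )
    return displacement


def resolve_jr(placeholders, symbol_addresses):
    """Patch jr placeholders once the final layout is known.

    placeholders: list of (instruction_address, target_symbol) pairs
                  recorded before the .function pass (hypothetical shape).
    symbol_addresses: dict symbol -> final address after relocation.
    Returns a dict instruction_address -> displacement.
    """
    return {
        address: jr_displacement(address, symbol_addresses[symbol])
        for address, symbol in placeholders
    }


print(resolve_jr([(0x8000, "loop")], {"loop": 0x8010}))
```

If a block ended up too far from a jr that targets it, `jr_displacement` raises instead of silently emitting a wrong byte.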
But none of these help if the coder is using a dynamic jump table... hmmm. tricky..
For the contradictions, you could also kick the problem back to the coder, as it's a matter of optimisation. If the first symbol is referenced in anything other than a comment, I'd always err on the side of caution and compile the .function section in. As the compiler can't know the runtime state of variables, include the function as its symbol is referenced :) If the code becomes more bloated, then it's up to the coder to fix their redundant reference :)
For a first attempt at .functions I'd try to keep the functions at the same place in the code, just to be on the safe side..... and go with "Just tell people not to use jr into, out of, or between function blocks, and if you need self-modifying code ONLY do so with relative offsets inside the same function block". Later, I'd add warnings if they attempt it, but wouldn't bother trying to optimise it much further unless people get really vocal.
:)
:-) I'm thinking about ways it can be solved decently.
@nww02, in the last three weeks I carried out a couple of experiments, but did not manage to find a decent solution that is based only on the Z80 Assembler. I tried several alternative solutions, but each of them had its drawbacks. So, I decided to allow compiling and linking libraries. In my current design, you can mark a Z80 code file as a library and compile it to a specific format. In a Z80 assembly file, you can include a library. At the end of the compilation, the included library will be merged with your code. I'm still working on the details, as I need to implement a few Z80 Assembly features that make it straightforward to carry out the linking. It's a lot of work to do, so I definitely need weeks to implement this set of features.
I don't know how good the optimiser is, but was wondering if something like this might be a simple method for letting people create libraries:-
Create a directive .FUNCTION ... .ENDFUNCTION which wraps around code. The code within the block is by definition not included in the output UNLESS one or more labels within the block are referenced from outside the block.
By doing this, you can have large library source files but the output is much tighter before the optimiser even begins because only the actually referenced functions are included.
Is something like this already implemented in the compiler? Is it done automatically? With the below, the add_b_c code would make it into the output files, but add_d_e would not as it is not referenced outside the block, only inside.
The lexer/compiler would need to keep going back and inserting function blocks to satisfy unresolved symbols until it can't locate any more, if (for example) functions call other functions in a cascade. Unless the compiler pre-calculates the dependency graph of function blocks, I suppose.
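Roughly what I mean by the cascade, as a Python sketch rather than anything from the actual compiler. It assumes each .FUNCTION block has already been scanned for the labels it defines and the symbols it references; `select_blocks` and `carry_fix` are made-up names for illustration.

```python
from collections import deque


def select_blocks(blocks, root_refs):
    """Return the names of the function blocks that must be emitted.

    blocks: dict mapping a block name to a (labels, refs) pair, where
            labels is the set of labels defined inside the block and
            refs is the set of symbols the block's code refers to.
    root_refs: symbols referenced by code outside any function block.
    """
    # Map each label to the block that defines it.
    owner = {
        label: name
        for name, (labels, _refs) in blocks.items()
        for label in labels
    }

    included = set()
    pending = deque(root_refs)
    while pending:
        symbol = pending.popleft()
        name = owner.get(symbol)
        if name is None or name in included:
            continue  # not defined in a block, or block already pulled in
        included.add(name)
        # The block's own references may pull in further blocks.
        pending.extend(blocks[name][1])
    return included


blocks = {
    "add_b_c": ({"add_b_c"}, {"carry_fix"}),  # calls a helper block
    "carry_fix": ({"carry_fix"}, set()),      # hypothetical helper
    "add_d_e": ({"add_d_e"}, {"add_d_e"}),    # only refers to itself
}
print(sorted(select_blocks(blocks, {"add_b_c"})))
```

Here add_b_c drags in carry_fix, while add_d_e stays out because nothing outside its own block references it. The work queue does the job of "going back and inserting blocks until no more can be found" in a single pass over the references.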