Closed deinonychus closed 3 months ago
I haven't used segments much in Ophis, so I can't say offhand what the expected behaviour is here, but it does seem like it deserves a testcase at least.
@deinonychus, if you add more .checkpc directives to your example source (for example, after B1, testbyte, and TESTARRAY2), do they pass/fail as you would expect them to?
As a matter of fact, only two of these three .checkpc commands throw errors:
after B1: .checkpc assertion failed: $202 > $ac
after TESTARRAY2: .checkpc assertion failed: $208 > $ac
This means the 2nd .text zp segment is actually being observed internally, but obviously not taken care of when the binary output gets processed.
OK, I see the problem (I think.) Consider just this part of the code:
.text
.advance $0200
Just before this, the (virtual) PC is at $0003. The .text switches to the unnamed text segment; the $0003 is stashed away and the PC is set to $0000. Then .advance advances to $0200 from $0000 -- which is a distance of $0200 -- so it writes 512 zero bytes... which is 3 too many.
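The arithmetic can be checked with a small Python model. This is only a sketch of the behaviour described above, not the code in Passes.py: the function name advance_padding and the variable names are made up for illustration, and it assumes the three bytes emitted before the .text directive are already in the output file.

```python
def advance_padding(pc: int, target: int) -> int:
    """Number of filler bytes needed to move the program counter from pc to target."""
    if target < pc:
        raise ValueError("cannot advance backwards")
    return target - pc


bytes_already_written = 0x0003  # assumed: output produced before the .text directive
pc_after_text = 0x0000          # PC that the unnamed text segment resumes at

padding = advance_padding(pc_after_text, 0x0200)
file_size = bytes_already_written + padding

print(padding)         # 512
print(hex(file_size))  # 0x203, so whatever follows lands 3 bytes past offset 0x200
```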
This is all happening in Assembler.visitAdvance in src/Ophis/Passes.py.
The root problem seems to be that the Assembler pass isn't treating the PC as being relative to the current segment. But it inherits visitTextSegment from Pass, which does treat it that way. It should probably override that method -- or track the PC in a different way. (Sorry if this is not a very coherent analysis -- it's a bit late where I am. I'll try to look at it again soon.)
Attempt at a better explanation:
The Assembler pass outputs bytes sequentially, and the intermediate representation is not re-ordered before the Assembler pass. Therefore the binary output always follows the order of the input source code. This means the following does not do what one would expect either:
.text foo
.org $0001
A: .byte $02
.text bar
.org $0000
B: .byte $01
In the output file, the $02 comes before the $01.
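A rough Python model of the difference, assuming each emitted byte can be described as a (segment, address, value) triple; the triples and variable names are illustrative and are not Ophis data structures:

```python
# (segment, address, value) in source order, mirroring the example above
emitted = [
    ("foo", 0x0001, 0x02),  # A: .byte $02
    ("bar", 0x0000, 0x01),  # B: .byte $01
]

# Sequential output: bytes appear in source order; addresses play no role.
sequential = bytes(value for _, _, value in emitted)

# What a reader expecting a flat memory image would get instead.
flat = bytearray(2)
for _, address, value in emitted:
    flat[address] = value

print(sequential.hex())  # 0201
print(flat.hex())        # 0102
```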
So Ophis doesn't really seem to support multiple text segments, except for trivial cases where there would be little point in using them anyway.
I think the simplest way to fix this would be to have the Assembler pass generate its bytes into a random access buffer instead of a sequential stream, and then dump that buffer to the output file at the end. This would be a fairly major change, but if there is a simpler way to support this feature, I can't see it.
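To make the idea concrete, here is a minimal sketch of such a buffer in Python. The class ImageBuffer and its methods are hypothetical and do not exist in Ophis; the choice to start the dump at the lowest written address and to pad gaps with a fill byte is also an assumption.

```python
class ImageBuffer:
    """Hypothetical random-access output buffer keyed by target address."""

    def __init__(self, fill=0x00):
        self.fill = fill
        self.cells = {}  # address -> byte value

    def write(self, address, data):
        """Store data starting at address, refusing to overwrite earlier output."""
        for offset, value in enumerate(data):
            target = address + offset
            if target in self.cells:
                raise ValueError(f"overlapping output at ${target:04x}")
            self.cells[target] = value

    def dump(self):
        """Return one contiguous image from the lowest to the highest written address."""
        if not self.cells:
            return b""
        low, high = min(self.cells), max(self.cells)
        return bytes(self.cells.get(a, self.fill) for a in range(low, high + 1))


buf = ImageBuffer()
buf.write(0x0001, b"\x02")  # segment foo
buf.write(0x0000, b"\x01")  # segment bar, written later but placed first
print(buf.dump().hex())     # 0102
```

Rejecting overlapping writes is one possible policy; letting a later write win would be another.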
It is intentional that the output is sequential, though this is admittedly a strange interaction between two features. The assembler puts out bytes sequentially for the express purpose of being able to directly produce output runnable by emulators without a platform-aware linker.
The C64 has a trivial example - the .word $801 that starts each example actually has no representation in the memory dump. The UNIF example has considerably more metadata within it, scattered throughout the file.
The relationship between the mapfile and the output is a little strained when producing single-file output representing bank-switched program code (common for cartridge-based systems) or single-file code that copies pieces of itself into different locations at start (common for tape- and disk-based systems). It is completely legal - and sometimes even desirable - to drop .org statements in the middle of a .text segment so that the output stays sequential and straight-line while its apparent address, for the purposes of jumps and such, is different.
I don't have good bankswitching or relocation examples in the documentation yet, however. I'm going to keep this issue open as a suggestion to create one.
(Code relocation in particular could be made easier as well, but that's not the purpose of this issue)
The change suggested in the previous comment is not only a major change but a breaking one - it would require most of the sample code to be paired with a linker script to produce the final binary. I'm loath to actually do that, but if it turns out to be a necessary change, it would be part of an Ophis 3 or possibly a full product rename.
OK -- if the current behaviour is the intended behaviour, all I can say is that I certainly did not get that impression when re-reading the "General Segmentation Simulation" section of the manual last night; my conception of how it was supposed to work after reading it was the same as @deinonychus's.
I'd suggest adding some notes in that section about how interleaving multiple named text segments does not create a flat memory file, but rather one that is intended to be consumed by a linker (and that the linker will need to be informed of the structure of the file, either through explicit inclusion of metadata markers in the output, or by reading the map file, or some other method.)
fwiw I did think of a simpler way to get the "overlay the segments" behaviour that @deinonychus and I were thinking of: run the Assembler pass multiple times, each time for a single text segment only, and produce multiple output files. Then merge those output files. (Not spectacularly efficient of course, but simple, and less invasive to the current behaviour.)
I believe that if there are multiple .data segments which are switched back and forth, as in the "Appendix B General Segmentation Simulation" example, everything is fine, since they don't result in any output.
I was led by the introduction of "Advanced Memory Segments" in chapter 7, where the need for "two separate program counters" is identified. This gave me the impression that multiple independent PCs are maintained and that every time I change segments the assembler will continue counting up the PC of the corresponding one. For me it is a bummer that the output routine unfortunately doesn't stick to this rule...
I am writing a main program which calls different subroutines kept in separate files. These subs should include their own definitions of variables both inside and outside zp. I need them to reside in .text segments because the output should be a single gap-less file: as a second step I convert it to an ASCII text file which I can download to an AIM65 over a serial line. (My conversion tool takes care of avoiding the stack page.) My workaround for now is of course declaring all zp variables in the main program.
I second cpressey's suggestion to clarify this in the manual in some way.
@deinonychus If this behaviour is something you really need, I thought of an even less invasive way to accomplish it, if you're interested:
In visitTextSegment in the Assembler pass, have it write out the current PC and the current file position to a log file of some kind.
Neither part seems very difficult, and I'm tempted to implement it myself because it seems like it would be kind of fun (the script would be, in essence, a very crude linker, and not limited to Ophis or 6502) but I have rather a lot of other things on my TODO list so it's unlikely I'll find the time.
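One possible shape for such a script, sketched in Python. The log format (a list of (pc, file_position) pairs, one per segment switch) and the function relink are assumptions made for illustration; nothing like this exists in Ophis. The sketch also assumes the logged PC is the address of the first byte written after the switch, so a .org that follows the switch would have to be logged as well.

```python
def relink(output, log, fill=0x00):
    """Rearrange sequentially written chunks into a flat image.

    log holds (pc, file_position) pairs in the order they were recorded.
    Bytes before the first logged position are ignored in this sketch.
    """
    chunks = []
    for index, (pc, start) in enumerate(log):
        # A chunk runs until the next logged position, or to the end of the file.
        end = log[index + 1][1] if index + 1 < len(log) else len(output)
        if end > start:
            chunks.append((pc, output[start:end]))
    if not chunks:
        return b""
    base = min(pc for pc, _ in chunks)
    top = max(pc + len(data) for pc, data in chunks)
    image = bytearray([fill]) * (top - base)
    for pc, data in chunks:
        image[pc - base:pc - base + len(data)] = data
    return bytes(image)


# Two one-byte chunks written in source order: $02 for address 1, then $01 for address 0.
sequential_output = bytes([0x02, 0x01])
switch_log = [(0x0001, 0), (0x0000, 1)]
print(relink(sequential_output, switch_log).hex())  # 0102
```

If two chunks overlap, the later one simply overwrites the earlier one here; a real tool would probably want to report that.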
Thank you for your workaround suggestion!
I've been thinking this over some, and I think the solution I'll take is twofold:
Ophis 3 is still very much a hypothetical at this point, existing mostly as a collection of it-would-be-nice-if plans and lists of mistakes in the Ophis 2 design (like how scoped labels work) where fixing them would complicate backwards compatibility.
Closing this as not directly part of any particular release, though related stuff may end up in the later wishlist.
When trying to interweave different .text segments, the Ophis object file output doesn't seem quite right, in contrast to what the map file states. The latter sticks to the given source code. Example:
Code:
Mapfile:
Ophis output:
Is there an explanation for this behaviour, for I would have expected this binary output:
Thank you.