z00m128 / sjasmplus

Command-line cross-compiler of assembly language for Z80 CPU.
http://z00m128.github.io/sjasmplus/
BSD 3-Clause "New" or "Revised" License

LABELSLIST behavior #111

Closed ammehet closed 4 years ago

ammehet commented 4 years ago

Why does LABELSLIST simply truncate label addresses to #0000..#3FFF range regardless of PAGE? Isn't it more consistent to process them according to current MMU settings? Is personal ORG for labels the only way to avoid this behavior?

For example,

    device zxspectrum48
ADDR_0     equ 0        ; expecting :0000
ADDR_8000  equ #8000    ; expecting 02:0000
ADDR_C000  equ #C000    ; expecting 00:0000

    device zxspectrum128
ADDR0_8000 equ #8000    ; expecting 02:0000
ADDR0_0    equ 0        ; expecting :0000
ADDR0_C000 equ #C000    ; expecting 00:0000

    page 3

ADDR3_8000 equ #8000    ; expecting 02:0000
ADDR3_0    equ 0        ; expecting :0000
ADDR3_C000 equ #C000    ; expecting 03:0000

resulting in

00:0000 ADDR_0      ; expected :0000
00:0000 ADDR_8000   ; expected 02:0000
00:0000 ADDR_C000   ; expected 00:0000

07:0000 ADDR0_8000  ; expected 02:0000
07:0000 ADDR0_0     ; expected :0000
07:0000 ADDR0_C000  ; expected 00:0000

07:0000 ADDR3_8000  ; expected 02:0000
07:0000 ADDR3_0     ; expected :0000
07:0000 ADDR3_C000  ; expected 03:0000

ped7g commented 4 years ago

The equ is more like "define constant", not trying to guess if you are defining memory address $8000 or value $8000.

To define a memory-type label, use the classic memory definition:

    device zxspectrum128
    mmu 2, 2, $8000     ; map page 2 to $8000..BFFF, also org $8000 (v1.15.0 feature of MMU)
label:

I don't think it would be more consistent to process equ according to the current memory mapping (at least not by default), as equ is not usually meant to hold a code/data address. Maybe some new extra syntax for equ, like labelPage2 equ $8000,2, would be handy? (Just an idea I got right now; I didn't check whether it's possible to add this.)

In a case like this, it helps to explain why you have equ labels instead of regular code, so I can think about your use case in the future. Maybe you have good reasons not to define the labels in the code while you are building it, and this feature is really missing, but I usually define all labels after the particular mmu + org while assembling the code itself, so I'm not aware of any common situation that has this sort of problem.

ammehet commented 4 years ago

Yes, I do tend to include standalone definitions (ones that don't overlap the code), like

    org #0000   ; Constants and ROM pointers
INK
.BLACK          equ #00
.BLUE           equ #01
TRDOS           equ #3D13

    org #4000   ; Addresses
SCREEN          equ #4000           ; main screen
ATTR            equ SCREEN+#1800    ; attributes
BANK            equ #5B5C           ; (23388) system variable that holds the last value output to 7FFDh
LAST_K          equ #5C08           ; (23560) Stores newly pressed key.

    org #C000   ; Buffers and variables
                ds  #2F00           ; code
CAT_BUF         ds  2048            ; #0800 (2048)  disk catalogue buffer

and labelslist looks just fine then. But if I place SCREEN EQU #4000 under org #8000, I get something odd: it is still passed to the code as #4000, but labelslist handles it as #8000 (02:0000).

To me, there is an inconsistency between equ and labelslist. When assembled, all the equ values land in the code exactly by their values, regardless of the «personal org». But labelslist truncates both the «address» and the «page», so the constant loses its value and the label appears on the wrong page in the debugger. This behavior seems to make labelslist almost useless in such cases.

I think it makes no sense to pin equ to a specific page with extended syntax, when during assembly its value is uniquely determined and injected into the code correctly, even with no org or a «wrong» one. labelslist should do just the same, but it doesn't.

ped7g commented 4 years ago

I agree the current status is not good, but I'm still a bit confused how to fix it, so I will keep asking you questions to better understand the problem... :)

Let's clear up one thing: labelslist specifically targets the Unreal emulator, so whatever illogical, inconsistent behaviour is needed to support Unreal, I'm willing to add it to sjasmplus. But I'm not using Unreal myself, so I can't easily try/verify this.

You are using labelslist with Unreal debugger? And it works ok for regular labels?

About specific page for equ - I still think this is kinda needed, because let's imagine code like this.

code_routine    equ $c000
enemy_hp      equ $c000  ; just hit-points value, not memory address
  ...
  ld  de,enemy_hp
  call code_routine
  ...
  org $c000
  ex de,hl

What do you expect from debugger here? Probably ld de,$c000 : call code_routine and when at $c000, the code_routine: ex de,hl, right?

But if the export file contains both code_routine and enemy_hp equal to $c000, how should the debugger know that the ex de,hl is at "code_routine", but not at "enemy_hp"?

I don't know how Unreal uses the file, but general equ values probably shouldn't be exported at all if the labelslist file is meant to hold memory addresses, because then things like enemy_hp don't belong there.

Having extra syntax for equ to specify explicit page of label would resolve this, giving the symbol memory page => giving info to sjasmplus that the symbol is not general value, but memory address => exporting it to labelslist (and removing other equ without page info from labelslist) (or adding new keyword like memequ or label for these)

(the sjasmplus already knows which symbol is defined by equ/defl/= and which symbol is defined as simple label assigning address of current pointer $, this info is already used in export files for different debuggers like cspectmap for #CSpect emulator or SLD data, so I can filter out all general equ from the labelslist)

The syntax can even be like ADDRESS3_C000 equ $C000, ;comment, leaving the page number empty to let sjasmplus know it should search the current memory mapping and assign the page of that address in the current setup (like page 3). (sjasmplus also has the $$ operator to get the current page number, but that is for the $ address, i.e. ADR equ $8000,$$ in the page 3 + org c000 area would still define ADR as a page 3 label.)

And a regular equ like INK.BLACK equ $00 should then be removed from the labelslist export completely, as it is not useful for anything in the Unreal debugger? It may just confuse things when you look at ROM address $0000?

The "correct" current way for your example file would be more like:

  ; org shouldn't matter for `equ` of constants
INK
.BLACK          equ #00
.BLUE           equ #01

    org #3D13   ; ROM pointers
TRDOS:

    org #4000   ; Addresses
SCREEN:         ds #1800    ; main screen
ATTR:           ds #300     ; attributes
    org #5B5C
BANK:           ; (23388) system variable that holds the last value output to 7FFDh
    org #5C08
LAST_K:         ; (23560) Stores newly pressed key.

And this will still produce some random label INK with some "current address and page" because of the way you are using it; interesting use case.

So the org + label seems very cumbersome to me, but just adding current page-mapping to every equ feels wrong too (in my own projects I have very few memory addresses defined by equ, 99% of my equ are general values, not memory addresses).

So I'm not sure how to fix this, any further thought/examples? And thank you for the input so far!

ammehet commented 4 years ago

I'm using Xpeccy, which works with Unreal's labelslist format. As for your example with code_routine, both equs would be listed as :0000 with no «page», yet still be compiled as #C000. In this case the debugger will not show any label, which is expected given these labelslist contents.

With this example, I expect from labelslist to use logic like «hmmm, at this (compile) time #C000 must be at slot 3 and it is mapped to page 0, so generate 00:0000». So I modified generated labelslist manually following this logic to look like this:

00:0000 code_routine
00:0000 enemy_hp

The result looks good enough for me: [image]

It is quite obvious that the debugger will display the first defined name, so it is the programmer's choice which one to define first. And if I type either «code_routine» or «enemy_hp» in the debugger's disasm window, it navigates me directly to #C000.
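
A minimal sketch of the lookup proposed in this comment (find the slot an address falls in, then take the page mapped there at that moment), written in Python purely for illustration. It is not sjasmplus code: the function name `labelslist_entry`, the `slot_pages` list and the use of `None` for "no page" are assumptions made for this sketch, and the two-digit page field simply mirrors the listings in this thread.

```python
SLOT_SIZE = 0x4000  # 16 KiB slots, as on the ZX Spectrum 128


def labelslist_entry(address, slot_pages, name):
    """Format one "PP:OOOO name" line for a 16-bit address.

    slot_pages[i] is the page mapped into slot i at definition time,
    or None when the slot is exported with no page (a hypothetical
    convention used only in this sketch).
    """
    slot, offset = divmod(address & 0xFFFF, SLOT_SIZE)
    page = slot_pages[slot]
    prefix = "" if page is None else f"{page:02X}"
    return f"{prefix}:{offset:04X} {name}"


# Default ZX128-style mapping assumed here: no page, 5, 2, 0.
default_map = [None, 5, 2, 0]
print(labelslist_entry(0xC000, default_map, "code_routine"))  # 00:0000 code_routine
print(labelslist_entry(0x8008, default_map, "VALUE"))         # 02:0008 VALUE
```

With page 3 mapped into slot 3, the same #C000 address would come out as 03:0000, which is what the original report expected for ADDR3_C000.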

Consider one more example of using equ outside the code:

    org #8000
some_routine
    ld   a,#33
    ld   (VALUE),a
    ld   b,3
.loop
    ld   c,#00      ; ld c,value
VALUE equ $-1
    ld   a,b
    ld   (VALUE),a
    djnz .loop
    ret

This will expectedly break local naming and produce

test.asm(13): error: Label not found: VALUE.loop
test.asm(13): error: [DJNZ] Target out of range (-32783)

But external definition works quite well:

    org #8000
some_routine
    ld   a,#33
    ld   (VALUE),a
    ld   b,3
.loop
    ld   c,#00      ; ld c,value
    ld   a,b
    ld   (VALUE),a
    djnz .loop
    ret

VALUE equ some_routine.loop+1

If for some reason I put the definition of VALUE before org, labelslist generates :0008 VALUE instead of 02:0008 VALUE, and VALUE becomes unusable in the debugger as it is not displayed. There may be some reason for this behavior, as org defaults to 0, but label equ another_label should define label in the same page as another_label regardless of org. I think labelslist behavior should follow the compiler's.

Another downside of the extended syntax is that if I decide to move part of the code to another page, I will also have to change all of the external definitions instead of simply changing the page in the original code block.

> (the sjasmplus already knows which symbol is defined by equ/defl/= and which symbol is defined as simple label assigning address of current pointer $, this info is already used in export files for different debuggers like cspectmap for #CSpect emulator or SLD data, so I can filter out all general equ from the labelslist)

So why not use the known values as is instead of truncating the page part?

> And this will still produce some random label INK with some "current address and page" because the way how you are using it, interesting use case.

I specifically use org #0000 for constants to avoid random definitions that might have been included earlier. That way, labelslist processes them with «no page» (:001A enemy_hp), and they are of no use (and not displayed) in the debugger in general, since several names may be defined with the same value. That's OK. This is just for source readability: consider ld a,PAPER.BLUE|INK.WHITE instead of ld a,#0F.

My suggestion is to use the value determined by the compiler, without truncating the page part. And there's no need to remove «unpaged» entries (e.g. defined with org 0), as the debugger does not display them.

ped7g commented 4 years ago

So if you edit the output file and put enemy_hp first, the Xpeccy will show:

    ld de,enemy_hp
enemy_hp   ex de,hl

I don't like this, it just confirms my worries the Unreal format is very limited (also it doesn't account for advanced ZX models like ZX128+3 in "allram" mode or ZX Next where code in $0000..3FFF area is quite common).

I checked the sjasmplus source code and it seems I actually do have the page number of equ, so I can export that... maybe I will modify it as you wish, and wait until somebody else reports it as a bug... :)


But this doesn't resolve the issue in general way. And I don't see any perfect solution either.

Even the new SLD data doesn't cover the use case where equ is used to define a memory address; this is a shortcoming I was not aware of before.

ped7g commented 4 years ago

This is somewhat fixed now.


One issue is: do NOT mix ZXSPECTRUM48 and ZXSPECTRUM128 (or more than 128) devices together in the same source code, if you need correct labels in the Unreal export file.

The ZX48 memory mapping is { 0, 1, 2, 3 }, while the Unreal export creates { ROM, 5, 2, 0 } even for ZX48 by translating the page numbers during export (if the currently active device is ZX48). So when you mix devices and do LABELSLIST, you will get either ZX128 page numbers wrongly translated, or ZX48 ones left untranslated (depending on which device is active at the end of the source).
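
To make that failure mode concrete, here is a small Python model of the translation. Only the two mappings come from this comment; `ZX48_TO_UNREAL`, `export_page` and the device-name strings are hypothetical names for the sketch and do not reflect sjasmplus internals.

```python
# ZX48 pages { 0, 1, 2, 3 } are exported as { ROM, 5, 2, 0 }; None stands for ROM.
ZX48_TO_UNREAL = {0: None, 1: 5, 2: 2, 3: 0}


def export_page(stored_page, device_active_at_end):
    """Return the page written to the export for a label's stored page.

    The translation depends only on the device active when the export
    is produced, not on the device under which the label was defined.
    """
    if device_active_at_end == "ZXSPECTRUM48":
        return ZX48_TO_UNREAL.get(stored_page, stored_page)
    return stored_page


# Label defined under ZX48 at $C000 (page 3), ZX48 still active: correct.
print(export_page(3, "ZXSPECTRUM48"))   # 0
# Same label, but a ZX128 device is active at the end: left untranslated.
print(export_page(3, "ZXSPECTRUM128"))  # 3
# ZX128 label in page 1, ZX48 active at the end: wrongly translated.
print(export_page(1, "ZXSPECTRUM48"))   # 5
```

Because the decision is made once for the whole export, a source that mixes both device families always gets one of the two groups wrong in this model.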


Second issue is: other devices like ZX128/ZX256/... and ZXNEXT map "ROM" as page 7. So in the modified version of your example, tests/devices/extra/Issue111_LABELSLIST_EQU.asm, the result is:

:1234 NONE_EQU
:1EF0 NONE_EQU2
02:0000 ADDR0_8000
07:0000 ADDR0_0
00:0000 ADDR0_C000
02:0000 ADDR3_8000
07:0000 ADDR3_0
03:0000 ADDR3_C000
:0000 OTHER_EQU
07:0000 PagesTab
03:0000 ORG_ADR
02:0000 ORG_ADR_EQU
:0000 OTHER_EQU2

Note the "07:" in front of "ROM" labels, as sjasmplus doesn't have "ROM" page yet, so there must be some RAM page mapped into region $0000..$3FFF.

.... I'm thinking about adding "ROM" page to the total pool of memory available for PAGE directive (maybe like PAGE -1), but that's like "enhancement" stuff, for new ticket, I will probably add one today, or add it yourself if I get distracted.

Anyway, the current state is like "fixed" and I'm closing this issue; feel free to comment/reopen if you think this is a mistake. But I don't want to spend too much time on the Unreal dump, as it's the most limited option currently available in sjasmplus.

It would be better if Xpeccy supported at least CSpect map files, or ideally SLD files, both of which provide more info than LABELSLIST (or maybe Xpeccy has its own native format with better information about symbols, in which case I may try to add that to sjasmplus).