Megatokio / zasm

Z80 / 8080 / Z180 assembler (for unix-style OS)
https://k1.spdns.de/Develop/Projects/zasm/Distributions/
BSD 2-Clause "Simplified" License

ORG cannot be set back to lower address #33

Closed pdr0663 closed 8 months ago

pdr0663 commented 10 months ago

Hi there.

I'm trying to port a Forth implementation to the ZX81. The Forth names are stored in a linked list which grows downwards in memory, but is linked forwards, so the last item entered is the lowest in memory, and the first item is the highest.

I'm using #code NameSpace to identify this part of the code, and I'm trying to use .phase to move the current code location so that each entry begins in the correct place. The code uses some variables to calculate the correct starting point of each entry, and .phase is meant to apply the new starting point.

The problem is, it seems that the assembler is ignoring the .phase instruction. Is it a feature of the assembler that the code origin must always increase, and never decrease using .phase?

Paul

Megatokio commented 10 months ago

Hi Paul, your understanding of .phase is wrong. See the documentation for .phase: it changes the logical code address, not where the code is stored. A possible way to go for you is to use macros to create labels with a serial number so you can link them (see .macro, esp. the last example) and move the .org. But storing code backwards is a challenge. Kio !

Megatokio commented 10 months ago

Hi Paul, thinking over it, either i have misunderstood your approach or it is a bad idea. I suggest you write the words in sequence as they are stored in the rom, starting with the latest word. Then you have one block of code which you want to end at the predefined top of the library. Now the only challenge is to get the start address for this block, because you end up subtracting two not-yet-defined labels, though a human can tell that the difference could be calculated. I think the best is to put it in a #code segment, as you already do. Then you can get the total size from a label defined by the assembler: NameSpace_size which you can use in the calculation of the segment start address. please tell me if this worked for you. Kio !
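The arithmetic behind this suggestion can be sketched in a few lines of Python (this is not zasm syntax). The entry sizes and the top address are illustrative assumptions, and `block_start` and `layout` are hypothetical helper names; in zasm the total size would come from the assembler-defined NameSpace_size label mentioned above.

```python
def block_start(top, sizes):
    """Start address of a block that must end exactly at `top` (exclusive)."""
    return top - sum(sizes)


def layout(top, entries):
    """Place entries in ascending memory order, latest word first.

    `entries` is a list of (name, size_in_bytes) pairs.
    Returns a dict mapping each name to its start address.
    """
    addr = block_start(top, [size for _, size in entries])
    placed = {}
    for name, size in entries:
        placed[name] = addr
        addr += size
    return placed


# Assumed figures: two name entries of 8 and 10 bytes, ending at $3FFF.
for name, addr in layout(0x3FFF, [("Joe", 8), ("Fred", 10)]).items():
    print(f"{name}: ${addr:04X}")
```

With these assumed figures the latest word (Joe) starts at $3FED and Fred at $3FF5, so writing the words in storage order gives the same addresses as growing the list downward one entry at a time.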

pdr0663 commented 10 months ago

Kio,

Firstly, thanks for your prompt and detailed reply.

The code is a legacy of eForth, and I'm reluctant to change it. Here is the macro to create the header for each Forth word. It contains the code I mentioned in my question above:

CODE    MACRO   LEX,NAME,LABEL
LABEL:                              ;;assembly label
    CODE_   = $$                    ;;save code pointer

#code NameSpace

    _LEN    = (LEX AND 01FH)/CELLL  ;;string cell count, round down
    NAME_   = NAME_-((_LEN+3)*CELLL);;cell boundary, downward
    .phase  NAME_                   ;;set name pointer
    DW      CODE_,LINK_             ;;token pointer and link
    LINK_   = $$                    ;;link points to a name string
    DB      LEX,NAME                ;;name string

#code CodeSpace

    ;.phase  CODE_                  ;;restore code pointer
    ENDM

CodeSpace and NameSpace are each initialized elsewhere with another #code statement, which sets their initial location, before this macro is called. As mentioned above, the code and names live in separate areas of memory, and grow towards each other, like two opposed stacks. The code grows upwards in the normal sense, and the names grow downwards from a starting point higher in memory. It is a little unusual in Forth, but it's an internal feature of eForth.
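The "two opposed stacks" arrangement can be modelled roughly as follows. This is a Python sketch of the idea only, not eForth or zasm code; the class name `TwoEndedSpace`, its methods, and the start addresses are assumptions made for illustration.

```python
class TwoEndedSpace:
    """Code grows upward from the bottom, names grow downward from the top."""

    def __init__(self, code_start, name_top):
        self.code = code_start   # next free code address
        self.name = name_top     # lowest address used by names so far

    def add_code(self, size):
        """Reserve `size` bytes of code; returns the start address."""
        if self.code + size > self.name:
            raise MemoryError("code would run into the name area")
        start = self.code
        self.code += size
        return start

    def add_name(self, size):
        """Reserve `size` bytes for a name entry; returns its start address."""
        if self.name - size < self.code:
            raise MemoryError("names would run into the code area")
        self.name -= size
        return self.name

    def free(self):
        """Bytes still unused between the two areas."""
        return self.name - self.code


space = TwoEndedSpace(code_start=0x0000, name_top=0x3FFF)
print(hex(space.add_code(4)), hex(space.add_name(10)), space.free())
```

The gap reported by `free()` is the working space that both areas share; every new word takes a little from each end.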

I'm not familiar with the concept of multiple #code and #data segments, although I see segments are a concept from the Unix C world. I do understand that concept. I have only ever used ORG as the means of setting the current location, and the code I'm porting was written for Microsoft's MASM. I chose ZASM as I could run it under Windows (with CygWin), and it supports macros in a similar syntax to MASM. MASM allows ORG to change the current execution address at will, forwards or backwards without limitation, it seems.

Here is the original legacy code:

$CODE   MACRO   LEX,NAME,LABEL
        $ALIGN                            ;;force to cell boundary
LABEL:                                    ;;assembly label
        _CODE   = $                       ;;save code pointer
        _LEN    = (LEX AND 01FH)/CELLL    ;;string cell count, round down
        _NAME   = _NAME-((_LEN+3)*CELLL)  ;;cell boundary, downward
ORG     _NAME                             ;;set name pointer
        DW      _CODE,_LINK               ;;token pointer and link
        _LINK   = $                       ;;link points to a name string
        DB      LEX,NAME                  ;;name string
ORG     _CODE                             ;;restore code pointer
        ENDM

Sorry for the ugly code formatting, it was pasted from NotePad++ and seems to be line-terminated in the Windows way, but appears strangely in the <> section of this comment.

Note that ALIGN is used in the code, as this Z80 version was originally ported from an 8086 version. I understand ALIGN is redundant on the Z80, as loads can occur at any address, please correct me if I'm wrong.

Can I do something similar to the legacy code with ZASM, in a way you may know?

Paul

pdr0663 commented 10 months ago

And here's a file with the same code.... code.txt

Megatokio commented 10 months ago

What is the error with the original code? the only thing i can see is that ORG should not start in column 1. I assume that way it is interpreted as a label definition? this depends on --reqcolon setting. with this option i cannot see why it didn't work. Kio !

Megatokio commented 10 months ago

$ALIGN: this seems to be a macro. it depends on what this macro does.

pdr0663 commented 10 months ago

The problem is, ORG fails with valid addresses.

Here is the original code with minor edits to get it to assemble:

;; Initialize assembly variables

CELLL   = 2
_LINK   = 0                                     ;force a null link
_NAME   = $3FFF                                 ;initialize name pointer
_CODE   = 0                                     ;initialize code pointer
_USER   = 4*CELLL                               ;first user variable offset
CALLL   = $1234
LISTL   = $5678

;; Define assembly macros

;       Adjust an address to the next cell boundary.

ALIGN  MACRO
        ;EVEN                                    ;;for 16bit systems
        ENDM

;       Compile a code definition header.

CODE   MACRO   LEX,NAME,LABEL
        ALIGN                            ;;force to cell boundary
LABEL:                                    ;;assembly label
        _CODE   = $                       ;;save code pointer
        _LEN    = (LEX AND 01FH)/CELLL    ;;string cell count, round down
        _NAME   = _NAME-((_LEN+3)*CELLL)  ;;cell boundary, downward

        DW      _CODE   ; debug
        DW      _NAME   ; debug

ORG     _NAME                             ;;set name pointer
        DW       _CODE,_LINK              ;;token pointer and link
        _LINK   = $                       ;;link points to a name string

        DW      _LINK   ; debug

        DB      LEX,NAME                  ;;name string
ORG     _CODE                             ;;restore code pointer
        ENDM

;       Compile a colon definition header

COLON  MACRO   LEX,NAME,LABEL
        CODE   LEX,NAME,LABEL
        DW      CALLL                     ;;align to cell boundary******
        DW      LISTT                     ;;include CALL doLIST******
        ENDM

;       Compile a user variable header.

USER   MACRO   LEX,NAME,LABEL
        CODE   LEX,NAME,LABEL
        DW      CALLL                     ;;align to cell boundary******
        DW      LISTT                     ;;include CALL doLIST******
        DW      DOUSE,_USER               ;;followed by doUSER and offset
        _USER   = _USER+CELLL             ;;update user area offset
        ENDM

;       Compile an inline string.

_D_      MACRO   FUNCT,STRNG
        DW      FUNCT                     ;;function
        _LEN    = $                       ;;save address of count byte
        DB      0,STRNG                   ;;count byte and string
        _CODE   = $                       ;;save code pointer

        DW      _CODE   ; debug

ORG     _LEN                              ;;point to count byte
        DB      _CODE-_LEN-1              ;;set count
ORG     _CODE                             ;;restore code pointer
        ALIGN
        ENDM

ORG     0

        CODE    4, "Fred", FRED

        CODE    3, "Joe", JOE

        _D_     StringHandler, "Hello World"

StringHandler:
        NOP

I've thrown in some DW to view the value of computed variables to demonstrate that the values are ok.
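The pointer arithmetic of the CODE macro can be cross-checked outside the assembler. The Python sketch below mirrors the _LEN, _NAME and _LINK updates from the macro; the function name `name_headers` is hypothetical and the code only models the address calculation, not the emitted bytes.

```python
CELLL = 2  # cell size in bytes, as in the test file


def name_headers(words, name_top=0x3FFF):
    """Mirror the pointer arithmetic of the CODE macro.

    `words` is a list of (lex, name) pairs in definition order.
    Returns (name, header_address, link_field_value) triples.
    """
    name_ptr = name_top   # _NAME
    link = 0              # _LINK, null for the first word
    result = []
    for lex, name in words:
        length = (lex & 0x1F) // CELLL       # _LEN
        name_ptr -= (length + 3) * CELLL     # _NAME moves downward
        result.append((name, name_ptr, link))
        link = name_ptr + 2 * CELLL          # _LINK = $ after DW _CODE,_LINK
    return result


for name, header, link in name_headers([(4, "Fred"), (3, "Joe")]):
    print(f"{name}: header at ${header:04X}, link field = ${link:04X}")
```

This gives $3FF5 for Fred and $3FED for Joe, with Joe's link field holding $3FF9. Those are the values the debug words show in the listing (stored low byte first, so $3FF5 appears as F53F), which supports the claim that the ORG arguments themselves are in range.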

Here is the output from the assembler:

C:\Users\Paul.Riley\Downloads\zasm>zasm -uwy --reqcolon efz80_test.asm

in file efz80_test.asm:
80: ORG     _CODE                             ;;restore code pointer
                 ^ size: value not in range[0 .. 65536]
82: ORG     _NAME                             ;;set name pointer
                 ^ size: value not in range[0 .. 65536]
82: ORG     _CODE                             ;;restore code pointer
                 ^ size: value not in range[0 .. 65536]
84: ORG     _LEN                              ;;point to count byte
                ^ size: value not in range[0 .. 65536]
84: ORG     _CODE                             ;;restore code pointer
                 ^ size: value not in range[0 .. 65536]
assembled file: efz80_test.asm
    136 lines, 1 pass, 0.0084 sec.
    5 errors

And here is the listing. You can see that the values of these variables are indeed valid (between 0 .. 65536) (do you mean 65535?).

                        ; --------------------------------------
                        ; zasm: assemble "efz80_test.asm"
                        ; opts: --reqcolon
                        ; date: 2023-11-08 09:42:34
                        ; --------------------------------------

.
.
.

0000:                   ORG     0

                                CODE    4, "Fred", FRED
                                ALIGN                            ;;force to cell boundary
                                ;EVEN                                    ;;for 16bit systems
0000:                   FRED:                                    ;;assembly label
                                _CODE   = $                       ;;save code pointer
                                _LEN    = (4 AND 01FH)/CELLL    ;;string cell count, round down
                                _NAME   = _NAME-((_LEN+3)*CELLL)  ;;cell boundary, downward

0000: 0000                      DW      _CODE   ; debug
0002: F53F                      DW      _NAME   ; debug

0004: FFFFFFFF          ORG     _NAME                             ;;set name pointer
0008: FF...             
3FF5: 00000000                  DW       _CODE,_LINK              ;;token pointer and link
                                _LINK   = $                       ;;link points to a name string

3FF9: F93F                      DW      _LINK   ; debug

3FFB: 04467265                  DB      4,"Fred"                  ;;name string
3FFF: 64                
                        ORG     _CODE                             ;;restore code pointer
***ERROR***                          ^ size: value not in range[0 .. 65536]

                                CODE    3, "Joe", JOE
                                ALIGN                            ;;force to cell boundary
                                ;EVEN                                    ;;for 16bit systems
4000:                   JOE:                                    ;;assembly label
                                _CODE   = $                       ;;save code pointer
                                _LEN    = (3 AND 01FH)/CELLL    ;;string cell count, round down
                                _NAME   = _NAME-((_LEN+3)*CELLL)  ;;cell boundary, downward

4000: 0040                      DW      _CODE   ; debug
4002: ED3F                      DW      _NAME   ; debug

                        ORG     _NAME                             ;;set name pointer
***ERROR***                          ^ size: value not in range[0 .. 65536]
4004: 0040F93F                  DW       _CODE,_LINK              ;;token pointer and link
                                _LINK   = $                       ;;link points to a name string

4008: 0840                      DW      _LINK   ; debug

400A: 034A6F65                  DB      3,"Joe"                  ;;name string
                        ORG     _CODE                             ;;restore code pointer
***ERROR***                          ^ size: value not in range[0 .. 65536]

                                _D_     StringHandler, "Hello World"
400E: 0000                      DW      StringHandler                     ;;function
                                _LEN    = $                       ;;save address of count byte
4010: 0048656C                  DB      0,"Hello World"                   ;;count byte and string
4014: 6C6F2057          
4018: 6F726C64          
                                _CODE   = $                       ;;save code pointer

401C: 1C40                      DW      _CODE   ; debug

                        ORG     _LEN                              ;;point to count byte
***ERROR***                         ^ size: value not in range[0 .. 65536]
401E: 0B                        DB      _CODE-_LEN-1              ;;set count
                        ORG     _CODE                             ;;restore code pointer
***ERROR***                          ^ size: value not in range[0 .. 65536]
                                ALIGN
                                ;EVEN                                    ;;for 16bit systems

401F:                   StringHandler:
401F: 00       [ 4]             NOP

pdr0663 commented 10 months ago

I like your assembler, and I very much want to integrate my code into your ZX81 tape template (.p). Can the template function with just ORGs or will I need to come to terms with #code and .phase to do it?

Paul

Megatokio commented 10 months ago

Hi Paul, now i see what the problem is. ORG is implemented by inserting space up to the new ORG. i didn't anticipate a use case where it is set backwards, because this normally results in overwriting already created code. setting it backwards results in inserting a negative number of spaces, which doesn't work. I'll work on a solution. Kio !
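The failure mode described here can be reproduced with a small model. The Python sketch below is an assumption-laden illustration, not zasm's actual implementation: `org_by_padding` treats ORG as "append fill bytes up to the new address", and the hypothetical `FlatImage` class shows one alternative where ORG only moves a write pointer inside a fixed 64K image.

```python
def org_by_padding(image, origin, new_addr, fill=0xFF):
    """Model ORG as padding: append fill bytes until `new_addr` is reached."""
    current = origin + len(image)
    gap = new_addr - current
    if gap < 0:
        # Going back would mean inserting a negative number of bytes.
        raise ValueError(f"cannot insert {gap} bytes")
    return image + bytes([fill]) * gap


class FlatImage:
    """Alternative model: a fixed-size image with a movable write pointer."""

    def __init__(self, size=0x10000, fill=0xFF):
        self.mem = bytearray([fill]) * size
        self.pc = 0

    def org(self, addr):
        if not 0 <= addr < len(self.mem):
            raise ValueError("address outside the image")
        self.pc = addr   # forwards or backwards, no padding involved

    def emit(self, data):
        end = self.pc + len(data)
        if end > len(self.mem):
            raise ValueError("data runs past the end of the image")
        self.mem[self.pc:end] = data
        self.pc = end


padded = org_by_padding(bytes(4), origin=0, new_addr=0x3FF5)  # forward works
flat = FlatImage()
flat.org(0x3FF5)
flat.emit(b"\x00\x00\x00\x00")
flat.org(0x0004)  # backward is just a pointer move in this model
```

In the padding model the forward ORG from $0004 to $3FF5 fills the gap with $FF bytes, as the listing shows, while any attempt to go back raises an error. The flat-image model accepts both directions, at the cost of silently overwriting earlier output if the programmer gets an address wrong.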

Megatokio commented 10 months ago

@ .p file: i think it can all go into one segment and you don't need .phase.

Megatokio commented 10 months ago

I have changed some code and think that setting ORG back to lower address is now possible. As i cannot compile it for Windows for myself, can you test it with the online assembler? k1.spdns.de/cgi-bin/zasm.cgi

If it works i will alert a helpful hand to create the Windows version. btw. can you compile the source? (branch 'allow_org_going_back') Kio !

Megatokio commented 10 months ago

can you test it with the online assembler? k1.spdns.de/cgi-bin/zasm.cgi thanks, Kio !

pdr0663 commented 10 months ago

Sorry for my slow reply.

I have used a workaround, suggested by you above. I've used serially numbered #code areas, which happily jump back and forth.

> can you test it with the online assembler? k1.spdns.de/cgi-bin/zasm.cgi thanks, Kio !

Won't assemble, gives this result:

tempmem: single-threaded ../Source/listfile.cpp:425: assert failed: !segment || !segment->size.is_valid() || sourceline.bytecount <= 0x10000 aborted.

pdr0663 commented 10 months ago

I managed to work around the problem by avoiding ORGs and using serial #code labels.

I'm using the ZX81 "tape" template, and I now get this issue:

in file efz80-zx81.asm:
2682:         ; (RAMTOP)              End of memory (address of last byte (incl.))
      ^ segment SYSVARS: E_LINE must match ram end address $4074 (E_LINE=$5B5D)
assembled file: efz80-zx81.asm
    5398 lines, 6 passes, 0.1669 sec.
    1 error

Is the assembler aware of the ZX81 system variables, and is not happy with the value of E_LINE?

Paul

pdr0663 commented 10 months ago

> I have changed some code and think that setting ORG back to lower address is now possible. As i cannot compile it for Windows for myself, can you test it with the online assembler? k1.spdns.de/cgi-bin/zasm.cgi
>
> If it works i will alert a helpful hand to create the Windows version. btw. can you compile the source? (branch 'allow_org_going_back') Kio !

Thanks, much appreciated. However, I'm using the ZX81 tape template, which uses #code for the various ZX81 spaces, and I assume I'll need to integrate into that using #code for my stuff. Correct me if I'm wrong.

Megatokio commented 10 months ago

> Sorry for my slow reply. I have used a workaround, suggested by you above. I've used serially numbered #code areas, which happily jump back and forth.
>
> > can you test it with the online assembler? k1.spdns.de/cgi-bin/zasm.cgi thanks, Kio !
>
> Won't assemble, gives this result:
>
> tempmem: single-threaded ../Source/listfile.cpp:425: assert failed: !segment || !segment->size.is_valid() || sourceline.bytecount <= 0x10000 aborted.

ok, there is a test for the number of bytes added by one instruction for including in the listing, and setting the ORG back makes a negative number. even for positive offset the listing is a little bit silly. it should compact the huge number of FFs. this can be fixed.

Megatokio commented 10 months ago

> I managed to work around the problem by avoiding ORGs and using serial #code labels. I'm using the ZX81 "tape" template, and I now get this issue:
>
> 2682:         ; (RAMTOP)              End of memory (address of last byte (incl.))
>       ^ segment SYSVARS: E_LINE must match ram end address $4074 (E_LINE=$5B5D)
> assembled file: efz80-zx81.asm
>
> Is the assembler aware of the ZX81 system variables, and is not happy with the value of E_LINE?

actually yes. by setting #target=p81 zasm knows about some peculiarities which it cares about. and without setting E_LINE to the correct address your program won't load successfully. This is required by the ZX81 tape loading routine, not by zasm.

Megatokio commented 10 months ago

> Thanks much appreciated. However, I'm using the ZX81 tape template, which uses #code for the various ZX81 spaces, and I assume I'll need to integrate into that using #code for my stuff. Correct me if I'm wrong.

no, that is not required. you can just set #target or not even that. The template simply contains the definitions for the various system variables at the right place. Starting with ORG $4009 and jumping around with subsequent ORGs will be ok. segments are just for organizing.

Because the error is in the list file generation and i assume only if the object code is included, you can already use the online assembler without listing the object code. I'll fix the problem in the list file generation so you can do another test with object code. Kio !

pdr0663 commented 10 months ago

> actually yes. by setting #target=p81 zasm knows about some peculiarities which it cares about. and without setting E_LINE to the correct address your program won't load successfully. This is required by the ZX81 tape loading routine, not by zasm.

2682:         ; (RAMTOP)              End of memory (address of last byte (incl.))
      ^ segment SYSVARS: E_LINE must match ram end address $4074 (E_LINE=$5B5D)
assembled file: efz80-zx81.asm

I still don't understand the error. E_LINE is set to a label located in the source where E_LINE should point (E_LINE=$5B5D) as I understand. E_LINE is after the REM statement, display file etc, according to the template file. $4074 seems to me to refer to an undocumented area in the ZX81 system variables. Is the error indicating that $5B5D should be stored in location $4074?

Paul

Megatokio commented 10 months ago

hm, i think this is a result of setting the ORG back. zasm probably compares E_LINE with the current code pointer which is presumably pointing somewhere inside the code. I'll check this. Kio ! edit: actually because of this and because the ORG was moved around and because there is an error in the message... oh dear!

Megatokio commented 10 months ago

I have updated the online assembler. please give it another try. Kio !

Megatokio commented 10 months ago

btw., if the library headers are at the end of ram and you save this as a .p tape file, isn't then a whole lot of empty space included in the tape file? Kio !

pdr0663 commented 10 months ago

> btw., if the library headers are at the end of ram and you save this as a .p tape file, isn't then a whole lot of empty space included in the tape file? Kio !

I'm a little ignorant about what actually gets saved onto the tape file. I assume that it is everything from $4009 up to but not including E_LINE, as what follows in the BASIC system is stacks etc which are dynamic. That's consistent with the .p template. I understand the BASIC system upon boot-up scans available RAM and sets RAMTOP to the first byte above the top. The GOSUB and machine stacks are placed immediately under RAMTOP, and there is a large space down to the STKEND which is the upper end of the calculator stack. I've verified this by POKEing the system variables on a 16k system.
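If that assumption holds (everything from $4009 up to, but not including, E_LINE is saved), the size of the tape image follows directly from E_LINE. The Python sketch below only restates that assumption; `p_file_size` is a hypothetical helper, and the E_LINE value is the one from the error message earlier in the thread.

```python
SAVE_START = 0x4009  # first saved address, per the assumption above


def p_file_size(e_line):
    """Bytes saved if the image covers SAVE_START up to, excluding, E_LINE."""
    if e_line < SAVE_START:
        raise ValueError("E_LINE lies below the start of the saved area")
    return e_line - SAVE_START


size = p_file_size(0x5B5D)
print(f"{size} bytes (${size:04X})")
```

For E_LINE=$5B5D this comes to 6996 bytes, and that figure includes any free space left between the code and the names, because it lies below E_LINE.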

BTW if by "library headers" you're referring to the names in the example I've provided, they are located near the end of the REM statement, and grow downwards towards the compiled Forth code, which is growing upwards. This is an oddity of eForth, most other Forths have a unified name and code structure, and grow in one direction. In eForth, there is necessarily some free space between the name and code spaces, to allow for new names and new code. This is effectively the working space of the eForth system, and I assume would need to be present in the tape file.

Paul

pdr0663 commented 10 months ago

Ok, it seems to assemble just fine now. For the Eighty-One emulator, I guess I target "p", right? Can I please download the source or a CygWin-compiled executable?

Megatokio commented 10 months ago

.p or .81 both will be ok.

i'll get the cygwin binary compiled, but can't say in advance how long it will take. one day i'll dive into this yaml stuff to make an action to compile it on github, but not today. download the source: you are at the source. at the zasm start page click on the CODE button. if you clone with git, take care that the Libraries submodule is cloned too. If you download the zip, download the zip for Libraries too and place its contents in the Libraries folder inside the zasm source. The Makefile works out of the box on my Linux, i expect it to work on Cygwin too. Kio !

Megatokio commented 10 months ago

btw. the approach with one #code segment per word works fine but has an interesting catch: zasm allocates 64k per segment, which is normally no problem, but if you have 100 words this means 6.4MB. :-) no problem nowadays though. Kio !
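As a quick check of that estimate, here is the same arithmetic in Python. It assumes "64k" means 64 × 1024 bytes per segment; `segment_memory` is a hypothetical helper name.

```python
SEGMENT_BYTES = 64 * 1024  # assumed allocation per #code segment


def segment_memory(segments):
    """Total bytes allocated for the given number of segments."""
    return segments * SEGMENT_BYTES


total = segment_memory(100)
print(total, "bytes =", total // 1024, "KiB =", total / 2**20, "MiB")
```

For 100 segments that is 6,553,600 bytes, i.e. 6400 KiB or 6.25 MiB, which is where the 6.4MB figure comes from.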

pdr0663 commented 10 months ago

BTW this is wrong:

; (RAMTOP)              End of memory (address of last byte (incl.))
;
; value of RAMTOP:
;       $43FF = 17407 For 1k internal ram (ZX81)
;       $47FF = 18431 For 2k internal ram (TS1000)
;       $7fff = 32767 for 16k Ram Pack
;       $bfff = 49151 for 32k Ram Pack
;       $ffff = 65535 for 64k Ram Pack

RAMTOP is set by the ZX81 system to the first byte AFTER the last physical memory address.

E.g. in a 16k system, RAMTOP is set to 32768, not 32767, so all the values above are off by one. In that case, I don't know how a straight 64k system would be supported, as RAMTOP would need a value that does not fit in 16 bits.
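The off-by-one can be worked through for every row of the table. In the Python sketch below the "last byte" addresses are taken from the comment block above, and the rule RAMTOP = last byte + 1 is the one described here; only the 16k result (32768) is confirmed in this thread, the other values simply follow from that rule. The helper names are hypothetical.

```python
# Last physical RAM address per configuration, from the template comment.
LAST_BYTE = {
    "1k": 0x43FF,
    "2k": 0x47FF,
    "16k": 0x7FFF,
    "32k": 0xBFFF,
    "64k": 0xFFFF,
}


def ramtop(last_byte):
    """RAMTOP as the first address after the last physical byte."""
    return last_byte + 1


def fits_in_16_bits(value):
    return 0 <= value <= 0xFFFF


for label, last in LAST_BYTE.items():
    top = ramtop(last)
    note = "" if fits_in_16_bits(top) else "  (does not fit in 16 bits)"
    print(f"{label}: RAMTOP = ${top:04X} = {top}{note}")
```

Only the last row overflows: one past $FFFF is $10000, which is the 64k case questioned above.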

Megatokio commented 10 months ago

Thank you, i didn't know this. i'll correct it! :-) Kio !

Megatokio commented 10 months ago

Hi Paul, i replied to one (now deleted?) comment directly. please tell me whether you received this email. Kio !

pdr0663 commented 10 months ago

Kio,

This is a reply directly from my email. If you want to use email directly, my address is

@.***

Paul

On Fri, 17 Nov 2023 at 10:06 pm, Kio @.***> wrote:

> Hi Paul, i replied to one (now deleted?) comment directly. please tell me whether you received this email. Kio !


Megatokio commented 10 months ago

> Kio, This is a reply directly from my email. If you want to use email directly, my address is @.*** Paul

oh yes, github protects us from any direct communication... i have set my email address to publicly visible, maybe it is shown somewhere in my profile.

from my email:

unfortunately i don't know very much about the ZX81, i grew up with a ZX Spectrum. One first impression is that all the system variables are still set to 0. this probably is wrong as some must be set properly.

i suggest you save the very same program in EightyOne and compare the .p files with a hex editor and make sense of the difference.

Kio !