SamCoVT / TaliForth2

A Subroutine Threaded Code (STC) ANSI-like Forth for the 65c02

Add forward labels for assembler #61

Open SamCoVT opened 8 months ago

SamCoVT commented 8 months ago

It would be nice to have forward labels for the assembler. Currently only anonymous backward labels are provided, using -->, <j and <b. (Note that --> reuses the name of the old FIG-Forth "next screen" block word, so I might consider renaming it.) Forward labels are more difficult because they have to go back and modify already-compiled code.

They also might be referenced in multiple places (which can be a real hassle), but I'd be happy even with a forward label that supports only one reference, like Tali's current backward labels do, or "up to x references" with an error if there are more. The reference locations will need to be stored separately in the case of branches, but could be stored in the compiled routines themselves (as a null-terminated linked list) for opcodes that compile a two-byte operand.

I'm thinking that these labels would need to be pre-declared before use, and it would be nice to be able to re-initialize them for re-use.

I don't expect people to be using Tali's assembler for large bits of assembly - tass64 is better for that - so this really is just to support small branches/jumps within a word being written in assembly.
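To make the "up to x references" idea concrete, here is a minimal sketch of a pre-declared label that keeps its reference locations in a small separate table and can be reset for re-use. Every name in it (max-refs, fwd-label, label-ref, label-resolve and the helpers) is an assumption for illustration and not part of Tali's assembler. It decides between branch and jmp patching by looking at the opcode byte, and it does no range checking on branch offsets.

```forth
\ Sketch only: all names below are hypothetical, not Tali assembler words.
4 constant max-refs                 \ the "x" in "up to x references"

\ A label is a count cell followed by max-refs cells of opcode addresses.
: fwd-label ( "name" -- )  create 0 , max-refs cells allot ;
: label-reset ( label -- )  0 swap ! ;          \ re-initialize for re-use

\ Remember the address of a jmp/branch opcode whose operand is still unknown.
: label-ref ( opcode-addr label -- )
   dup @ max-refs < 0= abort" too many references to forward label"
   dup @ 1+ cells over +            \ address of the next free slot
   swap >r !  1 r> +! ;

\ True for the 65c02 relative branches: bpl bmi bvc bvs bcc bcs bne beq, bra.
: branch? ( opcode -- flag )  dup 31 and 16 =  swap 128 = or ;

\ Patch one reference so that it lands on target.
: patch-ref ( target opcode-addr -- )
   dup c@ branch? if
      tuck - 2 -  swap 1+ c!        \ 8-bit offset from the next instruction
   else
      1+ over 255 and over c!       \ low byte of the absolute address
      swap 8 rshift 255 and swap 1+ c!   \ high byte
   then ;

\ Patch every recorded reference, then clear the label.
: label-resolve ( target label -- )
   dup @ 0 ?do
      2dup i 1+ cells + @ patch-ref
   loop
   nip label-reset ;
```

Inside a word this might be used as something like `[ here mylabel label-ref 0 ] jmp ... [ here mylabel label-resolve ]`, with `fwd-label mylabel` declared beforehand; the bracket placement is a guess at how it would sit alongside the immediate assembler words.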

SamCoVT commented 7 months ago

Here is the Forth version that supports a single reference:

\ PROGRAMMER  : Sam Colwell
\ FILE        : assembler_forward_labels.fs
\ DATE        : 2024-04-12
\ DESCRIPTION : This implements forward labels for the assembler in TaliForth2.
\ Example Usage : j> jmp ( some assembly) <--
\                 b> bmi ( some assembly ) <--

\ <-- ( addr flag -- )  Anonymous forward label - patches jmp/branch reference
\                       at addr.  Flag is true for jmp and false for branch.
\                       addr is address of jmp or branch opcode.
: <--
  if ( patch jmp ) here swap 1+ !
  else ( patch branch ) dup   here swap - 2 -   swap 1+ c!
  then ; immediate

\ j> ( -- addr true 0 ) Used to compile a jmp when address is forward.
\                       Will be resolved later with <--
: j>   here true 0 ; immediate

\ b> ( -- addr false 0 ) Used to compile a branch when address is forward.
\                       Will be resolved later with <--
: b>   here false 0 ; immediate

and some simple tests:

assembler-wordlist >order
: testwordj j> jmp inx dex <-- iny dey ;
: testwordb iny b> beq inx dex <-- dey ;

using see on the test words shows correct behavior for those tests.
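One thing the single-reference version does not do is check that a branch target is reachable. A possible variant is sketched below; branch-offset is a hypothetical helper name, and the redefinition of <-- is only an illustration of where the check could go, not the code that was tested above.

```forth
\ Sketch only: branch-offset is a hypothetical helper.
\ The offset is relative to the byte after the branch's operand.
: branch-offset ( target opcode-addr -- offset )
   - 2 -
   dup -128 128 within 0= abort" branch target out of range" ;

\ Same stack effect as the original <-- ( addr flag -- ), with the check added.
: <--
   if ( patch jmp ) here swap 1+ !
   else ( patch branch ) here over branch-offset  swap 1+ c!
   then ; immediate
```

For testwordb, the beq opcode sits one byte into the word and the target is five bytes in, so the patched offset works out to 5 - 1 - 2 = 2.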

leepivonka commented 5 months ago

Here are some ideas for assembly labels:

Structured flow control for assembly:------------------------------
The trailing ',' distinguishes these from the Forth versions.
These generate 8bit relative addresses, assuming that is enough.

Begin, xxxx Again,
Begin, xxxx UntilEq,
Begin, xxxx WhileEq, xxxx Repeat,

IfEq, xxxx Then,
IfCc, xxxx Else, xxxx Then,

Local labels:---------------------------------------------------
Definitions & references are accumulated in an in-memory database.
It's OK to reference a label not defined yet.
At the end of the word definition, LblFinish patches all the references.

LEqu ( n "name" -- ) define a local label = n
LDef ( "name" -- ) define a local label = Here
LRefW ( "name" -- n ) reference a local label, word
LRefA ( "name" -- n ) reference a local label, absolute instruction
LRefR ( "name" -- n ) reference a local label, 8bit relative instruction
LblInit ( -- ) Initialize local label database
LblFinish ( -- ) patch all references

Example:--------------------------------------------------------

\ http://forum.6502.org/viewtopic.php?f=2&t=7720 ok
ok
\ This was developed in a FORTH on a 65816 in native mode. ok
ok
\ ------------------------------------------------------------------------------ ok
: uint32_div8 ( ud_dividend u_divisor -- ud_quotient u_remainder ) compiled
\ --- Divide a 32-bit unsigned integer by a 8-bit uint. compiled
\ --- Returns: 32-bit uint quotient compiled
\ --- 8-bit uint remainder compiled
\ compiled
\ ON ENTRY parameter stack contains compiled
\ - UINT32 - dividend compiled
\ - UINT16 - divisor (only lower 8 bits used) compiled
\ ON EXIT parameter stack contains compiled
\ - UINT32 - quotient compiled
\ - UINT16 - remainder (only lower 8 bits used) compiled
\ compiled
\ Reducing the divisor & remainder to 8bits allows remainder to compiled
\ be resident in A & simplifies the compare & subtract. compiled
\ Inputs & outputs have been moved to the FORTH parameter stack. compiled
compiled
[ $20 ## sep, \ M to 8bit like 6502 ok
ok
0 ## lda, \ init remainder ok
32 ## ldy, Begin, \ for each dividend bit ok
4 d,x asl, 5 d,x rol, 2 d,x rol, 3 d,x rol, \ shift dividend/quotient ok
rola, \ shift remainder ok
LRefR @15 bcs, \ if remainder overflowed, remainder > divisor ok
0 d,x cmp, \ remainder >= divisor? ok
LRefR @18 bcc, ok
LDef @15 ok
0 d,x sbc, \ remainder -= divisor ok
4 d,x inc, \ set quotient bit ok
LDef @18 ok
dey, UntilEq, ok
0 d,x sta, \ store remainder ok
ok
$20 ## rep, \ M back to 16bit ok
] ; SeeLatest
05BF E220 SEP #$20 {Local}
05C1 A900 LDA #$00
05C3 A02000 LDY #$0020 {' SOutIndx0}
05C6 1604 ASL $04,x
05C8 3605 ROL $05,x
05CA 3602 ROL $02,x
05CC 3603 ROL $03,x
05CE 2A ROLA
05CF B004 BCS $05D5 {uint32_div8+0016}
05D1 D500 CMP $00,x
05D3 9004 BCC $05D9 {uint32_div8+001A}
05D5 F500 SBC $00,x
05D7 F604 INC $04,x
05D9 88 DEY
05DA D0EA BNE $05C6 {uint32_div8+0007}
05DC 9500 STA $00,x
05DE C220 REP #$20 {Local}
05E0 60 RTS
ok
ok
: Test8 ( ud1 u2 -- ) \ test case processor compiled
CC@ 2>R uint32_div8 CC@ 2R> D- D. ." cycles, " \ run & time it compiled
." rem=" U. ." quo=" UD. \ display outputs compiled
; ok
ok
\ +-- dividend ok
\ | +-- divisor ok
\ | | +-- run the test case processor ok
\ V V V ok
1. 0 Test8 1609 cycles, rem=10 quo=4294967295 ok
0. 5 Test8 1321 cycles, rem=0 quo=0 ok
10. 3 Test8 1339 cycles, rem=1 quo=3 ok
40003. 40 Test8 1375 cycles, rem=3 quo=1000 ok
123456789. 236 Test8 1377 cycles, rem=233 quo=523121 ok
3987654321. 1 Test8 1492 cycles, rem=0 quo=3987654321 ok
3987654321. 2 Test8 1483 cycles, rem=1 quo=1993827160 ok
3987654321. 255 Test8 1377 cycles, rem=21 quo=15637860 ok
ok
\ Type 32bit 1MSec counter as HMS time ok
3987654321. \ 1MSec 32bit binary counter value ok
10 uint32_div8 U. \ thousandths of second 1 ok
100 uint32_div8 U. \ hundredths of second 32 ok
60 uint32_div8 U. \ seconds 54 ok
60 uint32_div8 U. \ minutes 40 ok
24 uint32_div8 U. \ hours 3 ok
UD. \ days 46 ok
eof ok

A=00FE X=00C6 Y=EF74 S=0478 envMxdIZC D=0000 B=00 2BB00728 00ef81 pld .

patricksurry commented 4 months ago

I'd find it more intuitive to have the placeholder after the branch/jump instruction.

For the targets I like the look of beq b> ... <| and |> ... jmp j< to mimic a fence line | and a direction arrow.

Or maybe you could find a way to mimic anonymous +/- assembler labels somehow, e.g. b+ ... +| and |- ... j- ??

Examples:

jmp j> inx dex <| iny dey
iny beq b> inx dex <| dey

inx |> dex bne b<
dex |> inx jmp j<

patricksurry commented 4 months ago

reusable backward labels would work if the current label target were kept in a variable rather than on the stack. e.g. --> just sets the target to here, and b<, j< use that to calculate the offset or address.
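A minimal sketch of that variable-based approach might look like the following. The variable back-target and these definitions of |>, j< and b< are assumptions for illustration only, and the sketch assumes the operand word comes before the mnemonic (as in `b< bne`), since the immediate assembler words consume their operand from the stack.

```forth
\ Sketch only: back-target and these definitions of |>, j< and b< are
\ assumptions for illustration; they are not in Tali's assembler.
variable back-target                \ address of the most recent |> label

: |>  ( -- )  here back-target ! ; immediate
: j<  ( -- addr )  back-target @ ; immediate           \ operand for a jmp
: b<  ( -- offset )                                    \ operand for a branch
   back-target @ here - 2 -         \ relative to the end of the branch
   dup -128 < abort" backward branch out of range" ; immediate
```

Because the target lives in a variable and is never consumed, something like `|> dex b< bne ... j< jmp` can refer to the same label more than once.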

reusable forward labels could work like leave and use the placeholder offset (b>) or addr (j>) to link to the previous reference that needs updating. again you'd have an associated variable that tracks the previous reference to the forward label; it starts at 0 and resets to 0 when the forward reference is resolved.

e.g. bne b> inx bmi b> asl jmp j> dex <| would initially generate code like

b0:  
  bne 0   \ store 0 for first reference (previous is 0), update previous to here=b0
  inx 
b1: 
  bmi (b1-b0) \ store here - previous (single byte is enough for legal branches), update previous to here=b1
  asl  
j0: 
  jmp b1 \ store previous as two bytes, update previous to here=j0
  dex
target: \ walk the linked list of references backward thru previous, updating the target address and calculating the next previous until we get to 0 again.

I guess you have a pathological case like jmp j> ... 128+ bytes ... b> ... <| where the last branch can't chain back to the jmp, so maybe you track both jprev and bprev?

could even generalize to named labels where you track the previous value(s) and target address for each, or you flip state when the label is first defined.

It'd be an error if any label remains undefined when you exit compile state, or if you try to chain sequential branch addresses more than 128 bytes apart, or if the label is defined more than 128 bytes past the last branch.
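The chained scheme with a single previous-reference variable can be sketched as below. All of the names are hypothetical; these b> and j> have different stack effects from the single-reference versions earlier in the thread, and they are written operand-first (`b> bne`, `j> jmp`) on the assumption that the assembler mnemonics take their operand from the stack. A branch placeholder holds the byte distance back to the previous reference, a jmp placeholder holds the previous reference's address, and 0 ends the chain.

```forth
\ Sketch only: fwd-prev and every word below are hypothetical.
variable fwd-prev   0 fwd-prev !    \ newest unresolved reference, 0 = none

: branch? ( opcode -- flag )  dup 31 and 16 =  swap 128 = or ;
: le@ ( addr -- u )  dup c@ swap 1+ c@ 8 lshift or ;
: le! ( u addr -- )  over 255 and over c!  swap 8 rshift 255 and swap 1+ c! ;

\ Placeholder operands: each one links back to the previous reference.
: b> ( -- link )                    \ distance back to previous ref, 0 if first
   fwd-prev @ dup if here swap - then
   dup 255 > abort" previous reference too far away to chain"
   here fwd-prev ! ; immediate
: j> ( -- link )                    \ address of previous ref, 0 if first
   fwd-prev @  here fwd-prev ! ; immediate

\ Follow one link of the chain; 0 marks the end.
: next-ref ( ref -- prev )
   dup c@ branch? if
      dup 1+ c@ dup if - else nip then
   else
      1+ le@
   then ;

\ Overwrite the link in one reference with the real target.
: patch-ref ( target ref -- )
   dup c@ branch? if
      tuck - 2 -  dup 127 > abort" branch out of range"  swap 1+ c!
   else
      1+ le!
   then ;

\ Define the label here: walk the chain backward, patching as we go.
: <| ( -- )
   fwd-prev @
   begin dup while
      dup next-ref                  \ read the link before overwriting it
      here rot patch-ref
   repeat
   drop  0 fwd-prev ! ; immediate
```

With one chain, a branch that follows a distant jmp cannot link back to it and hits the first abort, which is the pathological case mentioned above; separate jprev and bprev variables would avoid that at the cost of walking two chains.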

patricksurry commented 4 months ago

maybe you could avoid distinguishing fwd/backward altogether and have one or more temp labels like :_ :1 :2 ... that you could reference with words like @_ @1 @2 ....?

for example: jmp @_ ... bpl @_ inx :_ dey ... bmi @_ .. jmp @_

Having the label reference after the mnemonic gives the j/b sense without needing explicit tag: is the previous byte a branch opcode or not?

each label like _ would have a struct like .bit defined? / .bits7 bprev / .word jprev | addr. defined? stays false while bprev and jprev chain the fwd references, until :_ occurs and sets addr, which is then used to resolve subsequent back refs. you could either undefine labels automatically on exiting compile state, or have some explicit reset word like \_ or _;