SamCoVT / TaliForth2

A Subroutine Threaded Code (STC) ANSI-like Forth for the 65c02

Compile streamlining; more flexible native vs non-native compile #78

Closed patricksurry closed 2 months ago

patricksurry commented 3 months ago

I simplified a few things, preferring simpler and smaller over faster for compile-time words, and split out compile.asm, which holds compile, along with supporting routines from taliforth.asm.

Notable changes:

patricksurry commented 3 months ago

Hold off on this one for a bit. I'm going to split it into two parts, since I have some more changes for literals.

SamCoVT commented 3 months ago

No problem - just let me know when you'd like me to take a look at it.

patricksurry commented 3 months ago

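Example showing everything compiled inline (nc-limit 257, excluding sliteral) versus everything compiled as subroutine calls (nc-limit 0). The disassembler distinguishes 2literal from sliteral, and labels the 1 to 4 cell stack depth checks.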

257 nc-limit !  ok
: bar s" hello" 12345678. ;    ok
see bar 
...
size (decimal): 33 
...
80B    813 jmp
813   A189 jsr     SLITERAL 80E 5 
81A     61 ldy.#
81C     4E lda.#
81E        dex
81F        dex
820      0 sta.zx
822      1 sty.zx
824     BC lda.#
826        dex
827        dex
828      0 sta.zx
82A      1 stz.zx
 ok
: fib 0 1 rot 0 ?do over + swap loop drop ;  ok
see fib 
...
size (decimal): 159 
...
838        dex
839        dex
83A      0 stz.zx
83C      1 stz.zx
83E        dex
83F        dex
840      1 lda.#
842      0 sta.zx
844      1 stz.zx
846   D7A7 jsr     3 STACK DEPTH CHECK
849      5 ldy.zx
84B      3 lda.zx
84D      5 sta.zx
84F      1 lda.zx
851      3 sta.zx
853      1 sty.zx
855      4 ldy.zx
857      2 lda.zx
859      4 sta.zx
85B      0 lda.zx
85D      2 sta.zx
85F      0 sty.zx
861        dex
862        dex
863      0 stz.zx
865      1 stz.zx
867      0 lda.zx
869      2 cmp.zx
86B      D bne      87A v
86D      1 lda.zx
86F      3 cmp.zx
871      7 bne      87A v
873        inx
874        inx
875        inx
876        inx
877    8D2 jmp
87A   85BB jsr     DO 
87D   D7A2 jsr     2 STACK DEPTH CHECK
880        dex
881        dex
882      4 lda.zx
884      0 sta.zx
886      5 lda.zx
888      1 sta.zx
88A   D7A2 jsr     2 STACK DEPTH CHECK
88D        clc
88E      0 lda.zx
890      2 adc.zx
892      2 sta.zx
894      1 lda.zx
896      3 adc.zx
898      3 sta.zx
89A        inx
89B        inx
89C   D7A2 jsr     2 STACK DEPTH CHECK
89F      0 lda.zx
8A1      2 ldy.zx
8A3      2 sta.zx
8A5      0 sty.zx
8A7      1 lda.zx
8A9      3 ldy.zx
8AB      3 sta.zx
8AD      1 sty.zx
8AF     20 inc.z
8B1      D bne      8C0 v
8B3     1F ldy.z
8B5    101 lda.y
8B8        inc.a
8B9     80 cmp.#
8BB      6 beq      8C3 v
8BD    101 sta.y
8C0    87D jmp
8C3     1F ldy.z
8C5        dey
8C6        dey
8C7        dey
8C8        dey
8C9     1F sty.z
8CB      5 bmi      8D2 v
8CD    100 lda.y
8D0     20 sta.z
8D2   D79D jsr     1 STACK DEPTH CHECK
8D5        inx
8D6        inx
 ok
0 nc-limit !  ok
: bar s" hello" 12345678. ;   redefined bar  ok
see bar 
...
size (decimal): 22 
...
8E3    8EB jmp
8EB   A189 jsr     SLITERAL 8E6 5 
8F2   A189 jsr     2LITERAL BC614E 
 ok
: fib 0 1 rot 0 ?do over + swap loop drop ; redefined fib  ok
see fib 
...
size (decimal): 40 
...
905   9E04 jsr     0
908   9D9E jsr     1
90B   8F83 jsr     rot
90E   9E04 jsr     0
911   85A3 jsr     ?DO 92A 
916   85BB jsr     DO 
919   8CFD jsr     over
91C   8E4B jsr     +
91F   9242 jsr     swap
922   8ACB jsr     LOOP 919 
927   95D6 jsr     unloop
92A   8699 jsr     drop
 ok
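For anyone reading the first bar listing: the inline code pushes the double 12345678. as two 16-bit cells, low cell first (ldy.# 61 / lda.# 4E) and high cell on top (lda.# BC with stz for the upper byte). That matches the BC614E operand shown for 2LITERAL in the subroutine version. A minimal Python sketch of that arithmetic follows; the helper names split_double and join_double are made up for illustration and are not part of Tali Forth 2.

```python
def split_double(value):
    """Split a 32-bit double into (low_cell, high_cell), 16 bits each."""
    return value & 0xFFFF, (value >> 16) & 0xFFFF


def join_double(low_cell, high_cell):
    """Rebuild the 32-bit value from its two 16-bit cells."""
    return (high_cell << 16) | low_cell


low, high = split_double(12345678)
# The low cell is pushed first, so the high cell ends up on top of the stack.
print(f"low={low:04X} high={high:04X}")  # low=614E high=00BC
```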
patricksurry commented 3 months ago

This should be ready for a look now. It saves more space, with some speed improvements, and gives more flexibility in choosing between native and subroutine compilation.

It can native-compile all conditional branches and jumps without any jsr plus two-byte payload, which is much faster.

Setting nc-limit to zero now generates (almost?) pure subroutine threading with no native code. This saves more space in that mode, and opens the possibility of a 'token-threaded' mode where a sequence of three-byte JSR xxyy instructions could be replaced by (mostly) single-byte tokens. I've mocked something up for that, hopefully ready to share in a week or so.
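To put a rough number on the token-threaded idea, here is a back-of-the-envelope Python sketch based on the 40-byte subroutine-threaded fib listing above. It assumes every word gets a one-byte token and that the two-byte payloads after ?DO and LOOP stay as they are. Both are assumptions for illustration, not a description of the actual mock-up, and the names FIB, stc_size and token_size are hypothetical.

```python
# Each entry: (word name, inline payload bytes following the call),
# transcribed from the `0 nc-limit !` disassembly of fib.
FIB = [
    ("0", 0), ("1", 0), ("rot", 0), ("0", 0),
    ("?DO", 2), ("DO", 0), ("over", 0), ("+", 0),
    ("swap", 0), ("LOOP", 2), ("unloop", 0), ("drop", 0),
]

JSR_BYTES = 3    # 65c02 JSR: opcode plus 16-bit address
TOKEN_BYTES = 1  # assumption: every word fits in a one-byte token


def stc_size(words):
    """Bytes used when each word is a JSR followed by its payload."""
    return sum(JSR_BYTES + payload for _, payload in words)


def token_size(words):
    """Bytes used when each JSR is replaced by a one-byte token."""
    return sum(TOKEN_BYTES + payload for _, payload in words)


print(stc_size(FIB), token_size(FIB))  # 40 16
```

The first figure agrees with the size that see reports for fib. The second is a best case, since only "(mostly)" single-byte tokens are expected and the token interpreter itself costs space too.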