SamCoVT / TaliForth2

A Subroutine Threaded Code (STC) ANSI-like Forth for the 65c02

Support single byte literals #46

Closed patricksurry closed 5 months ago

patricksurry commented 5 months ago

I noticed that the Forth code I'm working on includes many uchar-sized constants, each of which compiles to five bytes (a 3-byte jsr literal_runtime followed by the LSB and an MSB of 0).

This change saves a byte for every such compiled constant and runs slightly faster. It adds about 60 bytes to the taliforth binary size.

At compile time it checks if the literal has MSB=0 and if so switches from , to c, with a new byte_runtime.
The disassembler is modified accordingly, and I added a special-handler table to reduce boilerplate.
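
The compile-time choice described above can be sketched roughly as follows. This is a Python model for illustration only, not Tali's 65c02 source: the function names are hypothetical, and the two runtime addresses are simply the ones that appear in the see foo disassembly in this comment, so they would differ in another build.

```python
# Hypothetical model of the PR's compile-time choice; not Tali's actual source.
# Addresses are copied from the `see foo` output in this thread (build-specific).
LITERAL_RUNTIME = 0x93A4   # 16-bit literal handler (disassembled as LITERAL)
BYTE_RUNTIME = 0x93C3      # 8-bit literal handler (disassembled as BLITERAL)
JSR = 0x20                 # 65c02 JSR absolute opcode


def compile_literal(value: int) -> bytes:
    """Return the bytes compiled for one 16-bit literal."""
    value &= 0xFFFF
    lsb, msb = value & 0xFF, value >> 8
    if msb == 0:
        # MSB is zero: JSR byte_runtime followed by a single byte (like c,)
        return bytes([JSR, BYTE_RUNTIME & 0xFF, BYTE_RUNTIME >> 8, lsb])
    # Otherwise: JSR literal_runtime followed by a little-endian cell (like ,)
    return bytes([JSR, LITERAL_RUNTIME & 0xFF, LITERAL_RUNTIME >> 8, lsb, msb])


def break_even(extra_binary_bytes: int = 60) -> int:
    """Single-byte literals needed to recoup the added binary size."""
    saved_per_literal = 5 - 4  # five bytes before the change, four after
    return extra_binary_bytes // saved_per_literal
```

Under these assumptions, compile_literal(45) yields the four bytes 20 C3 93 2D and compile_literal(6789) yields the five bytes 20 A4 93 85 1A, matching the hex dump in the example.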

Example:

0 nc-limit !  ok     
: foo 1 2 3 45 6789 + + + + ;  ok
see foo 
nt: 800  xt: 80B 
flags (CO AN IM NN UF HC): 0 0 0 1 0 1 
size (decimal): 31 

080B  20 2B 98 20 9B A3 20 C3  93 03 20 C3 93 2D 20 A4   +. .. . .. ..- .
081B  93 85 1A 20 06 9A 20 06  9A 20 06 9A 20 06 9A  ... .. . . .. ..

80B   982B jsr     1
80E   A39B jsr     2
811   93C3 jsr     BLITERAL 3 
815   93C3 jsr     BLITERAL 2D 
819   93A4 jsr     LITERAL 1A85 
81E   9A06 jsr     +
821   9A06 jsr     +
824   9A06 jsr     +
827   9A06 jsr     +
 ok
foo . 6840  ok

SamCoVT commented 5 months ago

I had thought about doing this, but came to the conclusion that it wasn't worth it for saving one byte at a time (you'd have to have 60 character variables to get the size increase back, if Tali were RAM resident). You are welcome to try to change my mind, but I'm unlikely to merge this into Tali at this time.

You can certainly keep it in your repo, even if I don't merge it here, but you'll probably want to keep it in a branch so it will be easier to generate pull requests for other things without having it in the commit history.

patricksurry commented 5 months ago

sure, i'm happy to keep it as a local change in one of my other branches.

out of curiosity, why do you say you'd need 60 character variables? wouldn't you just need 60 occurrences of integer literals <256 in your compiled code?

my code has zillions of single byte constants that i've been inlining with a preprocessor so I end up with one- to three-digit integers sprinkled everywhere which actually pays off here.

(aside: would love a scheme where I could get to 2 or 3 bytes rather than 4 or 5 for a literal, but imagine that would require messing with the fundamental subroutine threading mechanism which sounds serious... :)

SamCoVT commented 5 months ago

Yes, you are right about it just being literals - I had confounded it with variables. I think the closest you would get to 2-byte 8-bit literals would be to use BRK and put the byte following the BRK instruction (as the "signature byte"). Tali currently doesn't have any support for BRK (and py65mon stops on BRK, by default). You'd also need an ISR (or to modify an existing ISR) to handle BRK as an interrupt source. Because Tali is subroutine threaded, it's running code natively most of the time and there isn't a good way to "notice" an upcoming data byte.

Does it make sense at all to use Tali as an assembler with a very powerful macro capability? When I need to work entirely in 8-bit, I usually just write it in assembly. The power of Tali is that you can then mix and match your assembly words with Forth words.

SamCoVT commented 5 months ago

Also, just in case it might be good for you to know, literals, values, variables, and constants are all handled by Tali the same way, but with different "handlers", so the JSR address before the value is different. Only literals don't have a name, but for the others, you can modify all of them (even named constants) using to (the companion word for value). It's not proper Forth, but rather a side effect of how Tali implements them.

5 value x  ok
x .  5  ok
12 to x  ok
x . 12  ok
variable y  ok
y @ . 0  ok
55 to y  ok
y @ . 55  ok
33 constant z  ok
z . 33  ok
22 to z  ok
z . 22  ok

patricksurry commented 5 months ago

Oh, that's cool. Need a little taliforth tips & tricks list somewhere. The ctrl-n/p stuff is pretty nice too, but I took a long time to realize it existed :)


patricksurry commented 5 months ago

btw, do you want me to put that disassembler refactoring (excluding the single byte literal stuff) in a separate PR? it just checks special handlers in a loop over a table to save some boilerplate and make it easier to add new ones down the road. (no functional change)

SamCoVT commented 5 months ago

Yeah - to just looks up the XT, advances 3 bytes to hop over the JSR (doesn't check it to see if it's a VALUE or even if there is a JSR there) and modifies the value after the JSR directly. It's a bit like running with scissors, as it will do that for ANY name given after the TO.
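
A rough model of that behaviour, for anyone following along. This is a Python sketch under stated assumptions, not Tali's source: memory is a flat bytearray, the helper names are hypothetical, and the handler address used when calling compile_word is an arbitrary placeholder.

```python
# Hypothetical model of the behaviour described above; not Tali's actual source.
JSR = 0x20  # 65c02 JSR absolute opcode


def compile_word(memory: bytearray, xt: int, handler: int, value: int) -> None:
    """Lay down JSR handler followed by a 16-bit little-endian value at xt."""
    memory[xt:xt + 5] = bytes([JSR, handler & 0xFF, handler >> 8,
                               value & 0xFF, (value >> 8) & 0xFF])


def to(memory: bytearray, xt: int, value: int) -> None:
    """Skip the 3-byte JSR at xt and overwrite the cell that follows.

    Nothing is checked: whatever sits at xt+3 and xt+4 gets overwritten.
    """
    memory[xt + 3] = value & 0xFF
    memory[xt + 4] = (value >> 8) & 0xFF


def fetch(memory: bytearray, xt: int) -> int:
    """Read the 16-bit cell stored after the JSR."""
    return memory[xt + 3] | (memory[xt + 4] << 8)
```

In this model, a word compiled with the value 33 reads back 22 after to(memory, xt, 22), regardless of which handler the JSR points at, which mirrors the constant example earlier in the thread.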

I agree that CTRL-N/CTRL-P is handy. I spent way too much time getting it to work the way I wanted so that you could fix a line in a multi-line word and then keep CTRL-N + ENTERing to enter the remaining lines. If you have a good place to mention the history in the manual, let me know - otherwise I think adding a little tutorial that shows what was typed in and what the result was might be good (complete with an example mistake in a word that can be fixed by redefining the word). That was one of your suggestions.

I do think the table-based approach for the special handlers in the disassembler is probably good to move to - so yes, you should send that as a separate PR. By the time I was writing the third handler, I was starting to think a table-based approach might be a good idea, and a table-based approach makes it much easier for people hacking Tali to add their own handlers for their custom data types (like your single byte literals).
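
To illustrate the table-based idea, here is a minimal Python sketch. It is not the PR's 65c02 code: it uses a dict lookup where the assembly version loops over a table, the names are hypothetical, and the handler addresses are the build-specific ones from the see foo output earlier in this thread.

```python
from typing import Tuple

# Hypothetical sketch of a table-driven check; not the PR's 65c02 code.
# Keys are JSR targets; values are (label, number of inline data bytes).
SPECIAL_HANDLERS = {
    0x93A4: ("LITERAL", 2),
    0x93C3: ("BLITERAL", 1),
}


def disassemble_jsr(code: bytes, offset: int) -> Tuple[str, int]:
    """Decode the JSR at offset; return (text, offset of next instruction)."""
    assert code[offset] == 0x20, "expected a JSR opcode"
    target = code[offset + 1] | (code[offset + 2] << 8)
    offset += 3
    entry = SPECIAL_HANDLERS.get(target)
    if entry is None:
        # Ordinary call: no inline data follows the JSR.
        return f"jsr {target:04X}", offset
    label, size = entry
    data = int.from_bytes(code[offset:offset + size], "little")
    return f"jsr {label} {data:X}", offset + size
```

Adding a handler for a new inline data type then amounts to adding one table entry rather than another hand-written branch.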

patricksurry commented 5 months ago

pls feel free to close this one.