Open ytmytm opened 3 years ago
Does it mean that all SUBLEQ memory cells would be 16-bit?
Yes, all SUBLEQ VMs I saw use the same data and address width.
I would keep this issue on hold for a while. A C compiler just makes it easier to express basic operations (function call/return, stack, looping, integer arithmetic, conditional jumps). Higher-level assembler macros can fulfill this need.
On the other hand, while implementing Hello World I realized that the combination of absolute addresses and 8-bit pointer arithmetic is going to be problematic. Now I wonder about the relative complexity of a VM that multiplies all address references by two versus one that stores memory cell's MSBs and LSBs in separate banks (but at the same address)...
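To make the two options concrete, here is a minimal sketch of how a cell read could look under each layout. The function names and the little-endian byte order are assumptions for illustration only, not anything the VM currently does.

```python
# Hypothetical model: the host has byte-addressed RAM, SUBLEQ cells are 16-bit.

def read_cell_interleaved(ram, cell_addr):
    """Layout A: cell n occupies host bytes 2n (low) and 2n+1 (high)."""
    base = (cell_addr * 2) & 0xFFFF          # every reference is doubled
    return ram[base] | (ram[(base + 1) & 0xFFFF] << 8)

def read_cell_banked(lo_bank, hi_bank, cell_addr):
    """Layout B: low and high bytes share one index in two separate banks."""
    return lo_bank[cell_addr] | (hi_bank[cell_addr] << 8)
```

Layout A pays for an address doubling on every reference; layout B needs no address arithmetic but splits memory into two regions that must be kept in step.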
> one that stores memory cell's MSBs and LSBs in separate banks (but at the same address)
I had that idea too :).
At the moment I doubt whether 16-bit access brings more benefit than trouble.
I have been trying to figure out the Forth for SUBLEQ: https://github.com/howerj/subleq
If I'm reading it correctly (not easy with Forth), the stack pointers count bytes there, not 16-bit words. I'm not sure why it's like that; maybe the standard calls for it.
Because of this, there is quite a lot of code that multiplies and divides addresses by two and shifts data to extract the low or high byte from values. It is a lot of work to store a single byte from the top of the stack into memory:
:t c! swap FF lit and dup 8 lit lshift or swap
tuck dup @ swap lsb 0= FF lit xor
>r over xor r> and xor swap ! ;t
It calls word fetch (@) and then word stash (!), both of which need to divide the address by two before actually calling native code. Then our VM would have to reverse that work twice.
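The cost described above can be sketched as a read-modify-write on word-addressed memory. This is only an illustration of the idea, not a translation of the Forth word: the helper names are made up, and placing odd byte addresses in the high byte is an assumption.

```python
# Hypothetical model: memory is a list of 16-bit cells, but the program
# hands us byte addresses, so every access halves the address first.

def word_fetch(cells, byte_addr):
    return cells[byte_addr >> 1]             # divide the address by two

def word_store(cells, byte_addr, value):
    cells[byte_addr >> 1] = value & 0xFFFF   # divide the address by two again

def byte_store(cells, byte_addr, value):
    """Store one byte: fetch the whole cell, patch one half, write it back."""
    value &= 0xFF
    old = word_fetch(cells, byte_addr)
    if byte_addr & 1:                        # odd address: high byte (assumed)
        new = (old & 0x00FF) | (value << 8)
    else:                                    # even address: low byte
        new = (old & 0xFF00) | value
    word_store(cells, byte_addr, new)
```

A single byte store therefore costs one word fetch, some masking, and one word store, with an address adjustment on each side.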
With 8-bit access this is much easier - byte stash is done in native code, word stash is byte stash done twice and addresses never need to be adjusted.
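For comparison, a minimal sketch of the byte-addressed case, where the word operations are built from the byte operation and the address is used as given. The function names and the little-endian order are assumptions for illustration.

```python
# Hypothetical model: memory is a flat bytearray, addresses count bytes.

def store_byte(ram, addr, value):
    ram[addr & 0xFFFF] = value & 0xFF        # the only "native" store

def store_word(ram, addr, value):
    """Word stash is byte stash done twice; no address adjustment needed."""
    store_byte(ram, addr, value)             # low byte
    store_byte(ram, addr + 1, value >> 8)    # high byte

def fetch_word(ram, addr):
    return ram[addr & 0xFFFF] | (ram[(addr + 1) & 0xFFFF] << 8)
```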
Based on https://forth-standard.org/standard/usage, it looks like the smallest datatype takes "one cell" when on the stack. Then, https://forth-standard.org/standard/usage#subsection.3.1.3 says that "Cells shall be at least one address unit wide and contain at least sixteen bits". Also, looking at page 14 of http://www.exemark.com/FORTH/eForthOverviewv5.pdf, eForth uses 16-bit registers for stack operations.
"Address units" may in turn be 8 bits wide: https://forth-standard.org/standard/port#port:hardware has this to say: "The address space of a Forth system is divided into an array of address units; an address unit is the smallest collection of bits that can be addressed. In other words, an address unit is the number of bits spanned by the addresses addr and addr+1. The most prevalent machines use 8-bit address units, but other address unit sizes exist."
We need 16-bit arithmetic because pointers (addresses, vectors) need to be of the same size as values.
This is necessary to make reference programs work (self-interpreter, 99 bottles) and for simplified C http://mazonka.com/subleq/hsq.html
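A minimal sketch of one SUBLEQ step shows why pointers and values must share a width: the operands a, b and c are themselves cells, and programs modify them like any other data. The 16-bit two's-complement wrap and the function names are assumptions here; I/O and halting conventions differ between VMs and are left out.

```python
MASK = 0xFFFF  # assumed cell width: 16 bits

def to_signed(x):
    """Interpret a 16-bit cell as a two's-complement integer."""
    return x - 0x10000 if x & 0x8000 else x

def subleq_step(mem, pc):
    """Execute one instruction at pc and return the next pc."""
    a, b, c = mem[pc], mem[pc + 1], mem[pc + 2]   # operands are ordinary cells
    result = (mem[b] - mem[a]) & MASK
    mem[b] = result
    return c if to_signed(result) <= 0 else pc + 3
```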
We have 8-bit arithmetic already. Now we only need to do it twice, with a Carry flag implementation, which means more self-modification and more lookup tables.
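As a rough sketch of "doing it twice", a 16-bit subtraction can be chained from two 8-bit subtractions with a borrow passed between them. This uses plain integer arithmetic rather than the lookup tables and self-modifying code mentioned above, and all names are hypothetical.

```python
def sub8(x, y, borrow_in=0):
    """8-bit subtract: returns (result byte, borrow out)."""
    diff = x - y - borrow_in
    return diff & 0xFF, 1 if diff < 0 else 0

def sub16(b_lo, b_hi, a_lo, a_hi):
    """Compute b - a on 16-bit values held as (low, high) byte pairs."""
    lo, borrow = sub8(b_lo, a_lo)            # low bytes first
    hi, _ = sub8(b_hi, a_hi, borrow)         # high bytes consume the borrow
    return lo, hi

def leq_zero(lo, hi):
    """SUBLEQ branch test: sign bit set, or both bytes zero."""
    return (hi & 0x80) != 0 or (lo == 0 and hi == 0)
```

The branch test only needs the high byte's sign bit and a zero check across both bytes, so the second subtraction is the one that decides the jump.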