16-bit subtraction is necessary for C

ytmytm commented 3 years ago

We need 16-bit arithmetic because pointers (addresses, vectors) need to be of the same size as values.

This is necessary to make reference programs work (self-interpreter, 99 bottles) and for simplified C http://mazonka.com/subleq/hsq.html

We already have 8-bit arithmetic. Now we only need to do it twice, with a carry flag implemented on top, which means more self-modification and more lookup tables.
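As a rough illustration of "doing it twice", here is a minimal Python sketch of a 16-bit subtraction built from two 8-bit subtractions chained by a borrow bit. The names `sub8` and `sub16` are hypothetical, and the borrow bit only stands in for the carry flag mentioned above; how that flag would actually be realized in the VM (self-modification, lookup tables) is not shown.

```python
def sub8(x, y, borrow_in=0):
    """Subtract two 8-bit values; return (8-bit result, borrow out)."""
    diff = x - y - borrow_in
    return diff & 0xFF, 1 if diff < 0 else 0


def sub16(x_lo, x_hi, y_lo, y_hi):
    """16-bit subtraction as two chained 8-bit subtractions.

    The low bytes are subtracted first; their borrow is fed into
    the subtraction of the high bytes.
    """
    lo, borrow = sub8(x_lo, y_lo)
    hi, borrow = sub8(x_hi, y_hi, borrow)
    return lo, hi, borrow


# 0x1200 - 0x0001 = 0x11FF: the low byte borrows from the high byte.
lo, hi, borrow = sub16(0x00, 0x12, 0x01, 0x00)
print(hex(hi << 8 | lo), borrow)
```

The final borrow is what a SUBLEQ step would need, together with a zero check, to decide whether the 16-bit result is less than or equal to zero.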

laubzega commented 3 years ago

Does it mean that all SUBLEQ memory cells would be 16-bit?

ytmytm commented 3 years ago

Yes, all SUBLEQ VMs I have seen use the same data and address width.
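To make "same data and address width" concrete, here is a minimal Python sketch of a SUBLEQ interpreter in which every cell is 16 bits and the same cells serve as operands, addresses and jump targets. It is not the VM from this repository: the function names are hypothetical, I/O is left out, and halting on a negative jump target is one common convention that is assumed here.

```python
MASK = 0xFFFF  # every memory cell is 16 bits wide


def signed16(v):
    """Interpret a 16-bit cell as a two's complement number."""
    v &= MASK
    return v - 0x10000 if v & 0x8000 else v


def run(mem, max_steps=10_000):
    """Execute SUBLEQ instructions (a, b, c) starting at address 0.

    Each step computes mem[b] -= mem[a]; if the result is <= 0,
    execution continues at c, otherwise at the next instruction.
    A negative jump target halts (assumed convention).
    """
    pc = 0
    for _ in range(max_steps):
        if pc < 0 or pc + 2 >= len(mem):
            break
        a, b, c = mem[pc], mem[pc + 1], mem[pc + 2]
        result = (mem[b] - mem[a]) & MASK
        mem[b] = result
        pc = signed16(c) if signed16(result) <= 0 else pc + 3
    return mem


# Two instructions followed by data: mem[10] -= mem[9], then halt.
program = [9, 10, 0xFFFF,    # mem[10] -= mem[9]; result is 7, fall through
           11, 11, 0xFFFF,   # mem[11] -= mem[11] gives 0, jump to -1: halt
           0, 0, 0,
           5, 12, 0]
run(program)
print(program[10])
```

Because `a`, `b` and `c` are ordinary cells, a program can only reach as much memory as one cell can address, which is why 8-bit cells are limiting here.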

I would keep this issue on hold for a while. A C compiler just makes it easier to express basic operations (function call/return, stack, looping, integer arithmetic, conditional jumps). Higher-level assembler macros can fulfill this need.

laubzega commented 3 years ago

On the other hand, while implementing Hello World I realized that the combination of absolute addresses and 8-bit pointer arithmetic is going to be problematic. Now I wonder about the relative complexity of a VM that multiplies all address references by two vs. one that stores each memory cell's MSBs and LSBs in separate banks (but at the same address)...
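The two layouts can be compared with a small Python sketch. Everything here is illustrative: the function names, the bank base addresses, and the low-byte-first order in the interleaved layout are assumptions, not something decided in this thread.

```python
def read_interleaved(ram, base, cell):
    """Layout 1: each 16-bit cell occupies two consecutive bytes,
    so the cell number has to be multiplied by two."""
    off = base + 2 * cell
    return ram[off] | (ram[off + 1] << 8)


def read_banked(ram, lo_bank, hi_bank, cell):
    """Layout 2: low and high bytes live in separate banks and
    share the same offset, so no multiplication is needed."""
    return ram[lo_bank + cell] | (ram[hi_bank + cell] << 8)


ram = bytearray(64)

# Cell 3 holds 0x1234 in the interleaved layout (base 0).
ram[6], ram[7] = 0x34, 0x12

# The same cell in the banked layout (low bank at 16, high bank at 32).
ram[16 + 3], ram[32 + 3] = 0x34, 0x12

print(hex(read_interleaved(ram, 0, 3)), hex(read_banked(ram, 16, 32, 3)))
```

In the banked layout a cell number is used as an offset directly, at the cost of keeping two base addresses; in the interleaved layout every address reference needs a shift first.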

ytmytm commented 3 years ago

> one that stores memory cell's MSBs and LSBs in separate banks (but at the same address)

I had that idea too :).

At the moment I'm starting to doubt whether 16-bit access is worth the trouble.

I have been trying to figure out the Forth for SUBLEQ: https://github.com/howerj/subleq

If I'm reading it correctly (not easy with Forth), the stack pointers count bytes there, not 16-bit words. I'm not sure why it's like that; maybe the standard calls for it.

Because of this, there is quite a lot of code that multiplies and divides addresses by two and shifts data to extract the low or high byte from values. It is a lot of work to store a single byte from the top of the stack into memory:

:t c!  swap FF lit and dup 8 lit lshift or swap
   tuck dup @ swap lsb 0= FF lit xor
   >r over xor r> and xor swap ! ;t

It calls word fetch (@) and then word stash (!); both of them need to divide the address by two before actually calling native code. Our VM would then have to reverse that work twice.
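For readers who do not speak Forth, this is my reading of what the c! word above does, written as a Python sketch over word-addressed memory. The function names are hypothetical, and the detail that an even byte address selects the low byte is inferred from the mask logic in the word, not confirmed elsewhere.

```python
def word_fetch(mem, byte_addr):
    """Fetch a 16-bit cell; the byte address is halved first."""
    return mem[byte_addr >> 1]


def word_store(mem, byte_addr, value):
    """Store a 16-bit cell; the byte address is halved first."""
    mem[byte_addr >> 1] = value & 0xFFFF


def byte_store(mem, byte_addr, c):
    """Read-modify-write of one byte inside a 16-bit cell."""
    c &= 0xFF
    cc = c | (c << 8)                  # the byte duplicated in both halves
    w = word_fetch(mem, byte_addr)     # current contents of the cell
    # Mask of the bits to KEEP from the old cell:
    # even address replaces the low byte, odd address the high byte.
    keep = 0x00FF if byte_addr & 1 else 0xFF00
    # Take bits of w where keep is 1 and bits of cc where keep is 0.
    word_store(mem, byte_addr, cc ^ ((w ^ cc) & keep))


mem = [0x1234]
byte_store(mem, 0, 0xAB)   # low byte replaced
byte_store(mem, 1, 0xCD)   # high byte replaced
print(hex(mem[0]))
```

Each byte store costs a full fetch, a masked merge and a full store, and both the fetch and the store halve the address again.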

With 8-bit access this is much easier: byte stash is done in native code, word stash is byte stash done twice, and addresses never need to be adjusted.
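The byte-addressed alternative is short enough to sketch in a few lines of Python. The names `byte_stash` and `word_stash` are hypothetical, and storing the low byte first is an assumption.

```python
def byte_stash(mem, addr, c):
    """Byte store: a single native write, no address adjustment."""
    mem[addr] = c & 0xFF


def word_stash(mem, addr, value):
    """Word store: simply two byte stores at addr and addr + 1."""
    byte_stash(mem, addr, value)        # low byte
    byte_stash(mem, addr + 1, value >> 8)  # high byte


mem = bytearray(4)
word_stash(mem, 1, 0xCDAB)
print(mem.hex())
```

No fetch, masking or address halving is involved, and a word may start at any byte address.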

laubzega commented 3 years ago

Based on https://forth-standard.org/standard/usage, it looks like the smallest datatype takes "one cell" when on the stack. Then, https://forth-standard.org/standard/usage#subsection.3.1.3 says that "Cells shall be at least one address unit wide and contain at least sixteen bits". Also, looking at page 14 of http://www.exemark.com/FORTH/eForthOverviewv5.pdf, eForth uses 16-bit registers for stack operations.

"Address units" may in turn be 8 bits wide: https://forth-standard.org/standard/port#port:hardware has this to say: "The address space of a Forth system is divided into an array of address units; an address unit is the smallest collection of bits that can be addressed. In other words, an address unit is the number of bits spanned by the addresses addr and addr+1. The most prevalent machines use 8-bit address units, but other address unit sizes exist."

ytmytm / c64-beamracer-subleq