uho / preForth

a minimalistic Forth kernel that can bootstrap
GNU General Public License v3.0

65C02 port #13

Open nickd4 opened 2 years ago

nickd4 commented 2 years ago

Adds a 65C02 port of preForth and seedForth, plus a 65C02 emulator based on https://github.com/visrealm/vrEmu6502. The emulator is included as a submodule that refers to my nickd4 GitHub account, because I had to make some minor changes to the emulator backend.

The 65C02 port behaves basically the same as the Z80 port, since both are direct threaded, so the only real change to the Forth code was replacing "call" with "jsr" and changing the relevant opcode. The assembly code changes somewhat more because of 65C02 vs Z80 differences. We don't use the hardware stack on the 65C02, because preForth requires more than 256 bytes for stack strings and the 65C02 hardware stack is confined to a single 256-byte page. This in turn means some changes to dodoes and dovar, which now move return addresses from the hardware stack to our IP or stack.

I've also added some debugging facilities. You can recompile either the Z80 or the 65C02 emulator to produce a trace file on stderr, and there's an annotator that allows a kind of symbolic debugging after the fact, by interpreting the addresses in the trace relative to the symbol table produced when compiling preForth or seedForth. The annotator is written in Python. I would like to rewrite it in Forth at some stage, but I'd have to recreate some Python facilities like bisect, split, join and int, so I haven't done this yet.
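To illustrate why bisect, split, join and int come up, here is a minimal sketch of how such an annotator could map trace addresses to symbols. It is not the annotator from this PR: the symbol table format ("hex address, then name" per line), the trace format (hex PC as the first field) and the function names `load_symbols`, `annotate` and `annotate_trace` are all assumptions made for illustration.

```python
import bisect

def load_symbols(lines):
    """Parse hypothetical 'ADDR NAME' lines (hex address, symbol name)
    into two parallel lists sorted by address."""
    pairs = []
    for line in lines:
        fields = line.split()
        if len(fields) >= 2:
            pairs.append((int(fields[0], 16), fields[1]))
    pairs.sort()
    addrs = [addr for addr, _ in pairs]
    names = [name for _, name in pairs]
    return addrs, names

def annotate(addr, addrs, names):
    """Return 'symbol' or 'symbol+offset' for the nearest symbol at or
    below addr, or the bare hex address if no symbol precedes it."""
    i = bisect.bisect_right(addrs, addr) - 1
    if i < 0:
        return "{:04x}".format(addr)
    offset = addr - addrs[i]
    if offset == 0:
        return names[i]
    return "{}+{:x}".format(names[i], offset)

def annotate_trace(trace_lines, addrs, names):
    """Replace the leading hex PC of each (assumed) trace line with its
    symbolic form, leaving the remaining fields untouched."""
    out = []
    for line in trace_lines:
        fields = line.split()
        if not fields:
            continue
        pc = int(fields[0], 16)
        out.append(" ".join([annotate(pc, addrs, names)] + fields[1:]))
    return out

# Made-up symbol table and trace line, purely for demonstration.
addrs, names = load_symbols(["0200 cold", "0210 next", "0230 dodoes"])
print(annotate_trace(["0215 a=00 x=ff"], addrs, names))
```

The binary search is what keeps a long trace cheap to annotate, which is the part that would need recreating in Forth.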

You can use the -t switch with either emulator for benchmarking (this does not require recompiling the emulator), for example:

nick@jane:~/src/preForth/z80$ ../emu_z80/emu_z80 -t seedForth.bin seedForthDemo.seed
..................................
done
2344885 instructions executed on 18899494 cycles

versus

nick@jane:~/src/preForth/65c02$ ../emu_65c02/emu_65c02 -t seedForth.bin seedForthDemo.seed
..................................
done
4076846 instructions executed on 15631102 cycles

This is an interesting result, since at least for this inner interpreter, it's not true that a 1 MHz 65C02 is equivalent to a 2 to 4 MHz Z80, as is often claimed. Going by the cycle counts (18899494 vs 15631102), it would be closer to a 1.2 MHz Z80. On the other hand, it's clearly true that we are programming the Z80 at a higher level, since the Z80 executes only 58% as many instructions, at 8.1 cycles each versus 3.8 cycles each for the 65C02. This could account for why the Z80 is popular and feels somewhat easier to work with.
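For anyone who wants to check the arithmetic, the figures above follow directly from the two -t outputs. The variable names in this small sketch are mine; the counts are copied from the runs shown.

```python
# Instruction and cycle counts reported by the -t runs above.
z80_instr, z80_cycles = 2344885, 18899494
c02_instr, c02_cycles = 4076846, 15631102

# Average cycles per instruction on each CPU.
z80_cpi = z80_cycles / z80_instr
c02_cpi = c02_cycles / c02_instr

# Z80 instruction count as a fraction of the 65C02 instruction count.
instr_ratio = z80_instr / c02_instr

# Z80 clock (in MHz) needed to match a 1 MHz 65C02 on this workload.
equiv_mhz = z80_cycles / c02_cycles

print("Z80:   {:.1f} cycles/instruction".format(z80_cpi))
print("65C02: {:.1f} cycles/instruction".format(c02_cpi))
print("Z80 instruction count: {:.0f}% of 65C02".format(instr_ratio * 100))
print("1 MHz 65C02 ~ {:.1f} MHz Z80".format(equiv_mhz))
```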

Another interesting metric is the size of the binary images:

nick@jane:~/src/preForth/z80$ ls -l *.bin
-rw-rw-r-- 1 nick nick 2612 Apr 29 13:14 preForth.bin
-rw-rw-r-- 1 nick nick  884 Apr 29 13:14 preForthDemo.bin
-rw-rw-r-- 1 nick nick 1288 Apr 29 13:18 seedForth.bin

versus

nick@jane:~/src/preForth/65c02$ ls -l *.bin
-rw-rw-r-- 1 nick nick 2899 Apr 29 17:23 preForth.bin
-rw-rw-r-- 1 nick nick 1177 Apr 29 17:23 preForthDemo.bin
-rw-rw-r-- 1 nick nick 2129 Apr 29 17:23 seedForth.bin

For preForth, which contains comparatively little assembly code (and mostly high level Forth), the difference isn't that large: the Z80 image is about 90% of the size of the 65C02 one. But for seedForth, which contains more assembly code, including longer routines like um* and um/mod, the Z80 version is only 60% of the size, again suggesting that we are programming the Z80 at a higher level, e.g. by delegating 16-bit operations to the Z80. Of course, if you do not delegate those operations and program them yourself on the 65C02, you can optimize them a bit.
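The size ratios can be reproduced from the two ls -l listings. This is just a quick sketch; the dictionary layout and the `size_ratio` helper are my own naming, and the byte counts are copied from the listings above.

```python
# Image sizes in bytes, taken from the ls -l output above.
z80_sizes = {"preForth.bin": 2612, "preForthDemo.bin": 884, "seedForth.bin": 1288}
c02_sizes = {"preForth.bin": 2899, "preForthDemo.bin": 1177, "seedForth.bin": 2129}

def size_ratio(name):
    """Z80 image size as a fraction of the 65C02 image size."""
    return z80_sizes[name] / c02_sizes[name]

for name in sorted(z80_sizes):
    print("{:18s} Z80 is {:.0f}% of 65C02".format(name, size_ratio(name) * 100))
```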

Having said all that, we should keep in mind that this particular application does not really play to the 65C02's strengths, since the requirement for 16-bit stack pointers is atypical. Ordinary applications that use the 65C02's hardware stack (JSR, RTS, PLA, PHA) and its $address,X addressing mode, which suits smaller arrays, would be a lot faster. In this regard, the decision in preForth to use stack strings is a bit unfortunate, and I guess commercial 6502 Forths would have had limited stack sizes.