SamCoVT / TaliForth2

A Subroutine Threaded Code (STC) ANSI-like Forth for the 65c02

show relative branch target address #48

Closed patricksurry closed 5 months ago

patricksurry commented 5 months ago

here's a little thing which I think improves disasm usability (which is great!)

native branch instructions currently just show the (unsigned) operand byte, which makes it hard to see where they're going. this PR instead displays the target of the branch as an absolute address. Here are a couple of examples including both forward and backward branches, and bra and the 6502 bxx branches:

see <> 
...
965C  20 1F D8 A0 00 B5 00 D5  02 D0 0A B5 01 D5 03 D0   ....... ........
966C  04 A9 FF 80 01 88 98 E8  E8 95 00 95 01  ........ .....

965C   D81F jsr     STACK DEPTH CHECK
965F      0 ldy.#
9661      0 lda.zx
9663      2 cmp.zx
9665   9671 bne
9667      1 lda.zx
9669      3 cmp.zx
966B   9671 bne
966D     FF lda.#
966F   9672 bra
9671        dey
9672        tya
9673        inx
9674        inx
9675      0 sta.zx
9677      1 sta.zx
 ok

see endcase 
...
8E0C  A0 8D A9 3A 20 BD D6 B5  00 15 01 F0 05 20 9F A1  ...: ... ..... ..
8E1C  80 F5 E8 E8  ....

8E0C     8D ldy.#
8E0E     3A lda.#
8E10   D6BD jsr     
8E13      0 lda.zx
8E15      1 ora.zx
8E17   8E1E beq
8E19   A19F jsr     then
8E1C   8E13 bra
8E1E        inx
8E1F        inx
 ok
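
For readers following along, here is a minimal Python sketch (not the PR's Forth code) of the arithmetic behind the addresses shown above. It assumes the standard 65c02 rule that a relative branch's operand is a signed 8-bit displacement from the address of the following instruction. The name `branch_target` is hypothetical.

```python
def branch_target(addr: int, offset: int) -> int:
    """Absolute target of a 65c02 relative branch whose opcode sits at addr."""
    # The operand byte is a two's-complement signed displacement.
    signed = offset - 0x100 if offset & 0x80 else offset
    # The displacement is measured from the next instruction,
    # i.e. the branch address plus its two-byte length.
    return (addr + 2 + signed) & 0xFFFF


# (address, operand byte) pairs read off the hex dumps in the two listings.
for addr, offset in [(0x9665, 0x0A), (0x966B, 0x04), (0x966F, 0x01),
                     (0x8E17, 0x05), (0x8E1C, 0xF5)]:
    print(f"{addr:04X} -> {branch_target(addr, offset):04X}")
```

The last pair is the backward `bra` in `endcase`: operand F5 is -11, which lands on 8E13.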

SamCoVT commented 5 months ago

I definitely like the target address being listed. One issue is that this breaks a feature of the disassembler, in that you can take the output and (after stripping off the addresses and names of known words/routines on the left/right) feed it back into Tali's assembler to generate the exact same code (doesn't work for all scenarios, but does for straight assembly). The BRA type instructions use the LSB of whatever is TOS, which would result in a different branch target if you tried to feed this output back into Tali.

How difficult would it be to leave the argument to the branch the same (8-bit offset) and print the target address on the right instead? It might be good to have a word/phrase there to indicate it's a destination address, perhaps "to" or "dest." followed by the destination. Let me know what you think.
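
To make the round-trip concern concrete, here is a small Python sketch. It models the behaviour described in this comment, that a branch mnemonic keeps only the low byte of the value on top of the stack, and is not Tali's actual assembler code. The helper names are hypothetical.

```python
def rel_target(addr: int, operand: int) -> int:
    """Where a relative branch at addr with the given operand byte lands."""
    operand &= 0xFF
    signed = operand - 0x100 if operand & 0x80 else operand
    return (addr + 2 + signed) & 0xFFFF


def reassembled_operand(tos: int) -> int:
    # Assumption taken from the comment: only the LSB of TOS is used.
    return tos & 0xFF


addr = 0x9665                              # the first bne in `see <>`
original_operand = 0x0A                    # byte actually stored in the word
shown = rel_target(addr, original_operand)  # address the first draft printed
fed_back = reassembled_operand(shown)       # what pasting that address yields

print(f"shown {shown:04X}, re-assembled operand {fed_back:02X}, "
      f"new target {rel_target(addr, fed_back):04X}")
```

Under that assumption the pasted listing would assemble operand 71 instead of 0A, so the branch would go somewhere else entirely. Keeping the raw offset as the operand avoids this.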

patricksurry commented 5 months ago

ya, i wondered about that. it doesn't look like you use a comment character elsewhere to separate the RHS stuff. that might make it even easier to paste back in if the asm supports one?

for now I'll try "to ABCD" and "=> ABCD" on the right and see how it looks.

SamCoVT commented 5 months ago

It might be worth adding a \ before the right-hand stuff... but if we do that, we should probably put the address of the instruction on the left in parens as well? That might look like:

( 965C )   D81F jsr     \ STACK DEPTH CHECK

The stuff that doesn't process properly is the non-assembly - that's mostly literals and string literals. The literals probably could be done easily, but I don't have a good way to make string literals "re-assemble-able", with the string literals being especially tricky because of s\". That would require a reverse encoder for the special characters, including all of the non-printing characters. I don't think I want to increase the disassembler size by that much, but would entertain any solution that only results in a small disassembler size increase.

The disassembler will still gack on words that the user creates that are a mix of assembly and data because it won't know where the data starts/ends, but otherwise I think it has good functionality and adding the target address is certainly an enhancement with very little cost in size.
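
As an illustration of the comment-wrapped layout floated above, here is a hedged Python sketch. It formats one line with the address in a `( ... )` comment and the annotation after `\`, then shows which tokens would be left once those Forth comments are ignored. The function names and column widths are assumptions chosen to resemble the sample line, not anything taken from the disassembler source.

```python
import re


def format_line(addr, mnemonic, operand=None, note=""):
    """Render one disassembly line with Forth comments around the extras."""
    operand_text = f"{operand:X}" if operand is not None else ""
    line = f"( {addr:04X} ) {operand_text:>6} {mnemonic:<7}"
    if note:
        line += f" \\ {note}"
    return line.rstrip()


def tokens_after_comments(line):
    """Rough model of what is left once ( ... ) and \\ comments are skipped."""
    line = re.sub(r"\(\s.*?\)", "", line)   # drop the ( address ) comment
    line = re.sub(r"\\\s.*$", "", line)     # drop everything after a backslash
    return line.split()


line = format_line(0x965C, "jsr", 0xD81F, "STACK DEPTH CHECK")
print(line)
print(tokens_after_comments(line))
```

Only the operand and the mnemonic survive, which is the part the assembler needs. Literals and string literals would still need separate handling, as noted above.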

patricksurry commented 5 months ago

moved the destination to the right so you can still see the operand. also added a trailing forward/back indicator to make it easier to read at a glance. wdyt?

see <> 
...
965C  20 3D D8 A0 00 B5 00 D5  02 D0 0A B5 01 D5 03 D0   =...... ........
966C  04 A9 FF 80 01 88 98 E8  E8 95 00 95 01  ........ .....

965C   D83D jsr     STACK DEPTH CHECK
965F      0 ldy.#
9661      0 lda.zx
9663      2 cmp.zx
9665      A bne     9671 v
9667      1 lda.zx
9669      3 cmp.zx
966B      4 bne     9671 v
966D     FF lda.#
966F      1 bra     9672 v
9671        dey
9672        tya
...
see endcase 
...
8E0C  A0 8D A9 3A 20 DB D6 B5  00 15 01 F0 05 20 9F A1  ...: ... ..... ..
8E1C  80 F5 E8 E8  ....
...
8E0C     8D ldy.#
8E0E     3A lda.#
8E10   D6DB jsr     
8E13      0 lda.zx
8E15      1 ora.zx
8E17      5 beq     8E1E v
8E19   A19F jsr     then
8E1C     F5 bra     8E13 ^
8E1E        inx
8E1F        inx
 ok

SamCoVT commented 5 months ago

I think that looks pretty good.

patricksurry commented 5 months ago

i guess - + might be more obvious than ^ v but i think I still prefer the latter
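
For comparison, a short Python sketch of the final layout with either style of direction marker. It assumes the marker follows the sign of the offset; the PR may decide direction differently (for example by comparing addresses). `describe_branch` and its `use_arrows` flag are hypothetical names.

```python
def describe_branch(addr, offset, mnemonic, use_arrows=True):
    """Format a branch line: raw operand kept, target and direction on the right."""
    signed = offset - 0x100 if offset & 0x80 else offset
    target = (addr + 2 + signed) & 0xFFFF
    if use_arrows:
        marker = "^" if signed < 0 else "v"   # up the listing / down the listing
    else:
        marker = "-" if signed < 0 else "+"   # sign of the offset
    return f"{addr:04X} {offset:>6X} {mnemonic:<7} {target:04X} {marker}"


print(describe_branch(0x9665, 0x0A, "bne"))
print(describe_branch(0x8E1C, 0xF5, "bra"))
print(describe_branch(0x8E1C, 0xF5, "bra", use_arrows=False))
```

The arrows match the reading direction of the listing, while `-` and `+` match the sign of the operand. Either way the operand column stays the raw offset, so it can still be pasted back.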