SamCoVT / TaliForth2

A Subroutine Threaded Code (STC) ANSI-like Forth for the 65c02
Other
29 stars 5 forks source link

underflow error could report source word? #34

Closed patricksurry closed 6 months ago

patricksurry commented 6 months ago

i'm very thankful for the underflow detection; i regularly trigger it :)

however often after i've made several changes it's still painful to debug where the underflow occurred.

it struck me that the typical calling pattern like this jsr landing here leaves something close to the words xt_... address as its return address on the stack.

it would be super helpful if the error routine could either (a) just report that return address, or even better (b) report the name of the word with the first preceding xt address.

does that sound feasible?

SamCoVT commented 6 months ago

It's an interesting thought. You'd need the actual XT if you want the name, but it will always be 3 bytes back from the return address on the stack (technically 2 bytes back, as the 6502 stores a return address on the stack one byte short of the instruction following the JSR because it gets incremented during RTS, IIRC). From there you can use INT>NAME and can then use TYPE to print it out. I do think it's feasible. You might get an error like:

Stack underflow before: dup

It won't tell you which dup in your word it's talking about - I think that would be much more difficult - but it's likely to still be useful.

SamCoVT commented 6 months ago

You'd also need to fix the stack first, as this will need words that check the stack for underflow :)

SamCoVT commented 6 months ago

See disasm_jsr in disassembler.asm for an example of printing the name. It looks like you feed the XT to int>name to get the NT (name token - that's the dictionary entry itself, but the name is not right at the beginning), then feed that to name>string to get the address and length that can then be typed.

patricksurry commented 6 months ago

oh interesting, right it would have the native word that triggered. in this subroutine threaded implementation, would the 6502 stack via sp contain the sequence of the return addresses of the forth words that led to the underflow, ending with the native word that triggered the error?

if so just dumping SP...$ff might give you a primitive stack trace which you could work backwards to word names as you describe. i might fool around with that

SamCoVT commented 6 months ago

It looks like disasm_jsr has an rts right at the end, so I think you can just put your XT in scratch+1 and scratch+2 and do a jsr disasm_jsr (after fixing the stack, of course). The routine comments talk about what is currently on the stack, but it leaves whatever is on the stack alone (and should work with an empty stack - it takes the needed address in scratch+1 and scratch+2). If the XT doesn't have a name, it prints nothing, but I don't think you'll even get down that far because all of the words that have underflow checking are named.

There may also be other junk on the stack (like loop indices and stuff put there with >R), but yes, the stack will have all the return addresses there too. A stack dump might not be silly at all.

patricksurry commented 6 months ago

sweet!

here's a proof of concept: after inserting a few lines in taliforth.asm:underflow_error (below) we make a word that generates an underflow. the underflow error dumps the return stack ($100+sp .. $1ff)

bar ($369a) calls foo ($3673) which calls drop generating the underflow. the stack trace has $368b (the call to drop inside foo), then $369c (the call to foo inside bar) and some other stuff.

with some hacking of words&sizes you could find the nearest xt preceding each entry.

: foo ." oops" CR drop ;  ok
: bar foo ;  ok
hex ' bar u. ' foo u. 369A 3673  ok
bar oops
8B 36 9C 36 3C CE 2C F7 B0 C1 Stack underflow

code to dump return stack:

underflow_error:
                tsx         ; data stack is already borked, display return stack
_underflow_stack_trace:
                inx         ; stack dump to help reconstruct calling words
                beq +
                lda $100,x
                jsr xt_space
                jsr byte_to_ascii
                bra _underflow_stack_trace
+
                jsr xt_space
SamCoVT commented 6 months ago

So the issue is going to be when words are natively compiled. Using your example (addresses will be different):

hex  ok
: foo ." oops" CR drop ;  ok
: bar foo ;  ok
hex ' bar u. ' foo u. 832 80B  ok
see bar
nt: 827  xt: 832
flags (CO AN IM NN UF HC): 0 0 0 1 0 1
size (decimal): 3

0832  20 0B 08   ..

832    80B jsr     foo
 ok
see foo
nt: 800  xt: 80B
flags (CO AN IM NN UF HC): 0 0 0 1 0 1
size (decimal): 27

080B  4C 12 08 6F 6F 70 73 20  A9 A0 0E 08 04 00 20 03  L..oops  ...... .
081B  A5 A9 0A 20 F6 8D 20 2F  D8 E8 E8  ... .. / ...

80B    812 jmp
812   A0A9 jsr     SLITERAL 80E 4
819   A503 jsr     type
81C      A lda.#
81E   8DF6 jsr
821   D82F jsr     STACK DEPTH CHECK
824        inx
825        inx
 ok

The CR is natively compiled to loading A with $A (technically a linefeed) and calling the emit_a routine to print the character in a (this routine does not have a name in the dictionary). Then the drop is natively compiled, but it has the stack depth check. You can use see CR and see drop to see the assembly for those words individually. If a bunch of words are natively compiled, it's impossible to tell where one ends and the next begins. Also, the stack depth check is not at the beginning of the word, so it's more difficult than just backing up a bit.

We could recommend setting nc-limit (native compiling) to zero for troubleshooting, which forces all words to be compiled as JSRs. That makes the words look like this:

hex  ok
0 nc-limit !  ok
: foo ." oops" CR drop ;  ok
: bar foo ;  ok
see foo
nt: 800  xt: 80B
flags (CO AN IM NN UF HC): 0 0 0 1 0 1
size (decimal): 23

080B  4C 12 08 6F 6F 70 73 20  A9 A0 0E 08 04 00 20 03  L..oops  ...... .
081B  A5 20 63 89 20 2E 8D  . c. ..

80B    812 jmp
812   A0A9 jsr     SLITERAL 80E 4
819   A503 jsr     type
81C   8963 jsr     cr
81F   8D2E jsr     drop
 ok
see bar
nt: 823  xt: 82E
flags (CO AN IM NN UF HC): 0 0 0 1 0 1
size (decimal): 3

082E  20 0B 08   ..

82E    80B jsr     foo
 ok

and now your stack trace should show you that it was drop that triggered the underflow.

Doing something like words&sizes is certainly possible, but that was written before Tali supported wordlists so it only searches the CURRENT wordlist. If you look at xt_int_to_name: you will see that its more complicated now because we'd need to run though all of the wordlists to locate the word. That might be too complicated. Your stacktrace and see might be enough to be useful, though.

SamCoVT commented 6 months ago

If we want to try to print the names of the words while dumping the return stack trace, I recommend implementing TRAVERSE-WORDLIST. Much of the code would be taken from INT>NAME and that word should be re-written to use TRAVERSE-WORDLIST in order to keep code size under control. Then, a helper word could be written that takes an address and stops on the word that has that address within it, printing its name. That seems like a useful word to have anyways. If we expose (create a named word for) num_wordlists (perhaps #WORDLISTS in Forth), then this kind of thing could all be written in high level forth.

I think just providing a stack trace it a good start. The messages will need to be adjusted to it's clear what those bytes are.

Is that something you are interested in doing (you already have some working code), or would you like me to tackle that?

Do you know how to run the test suite? It does rely on py65mon, but you can just run make tests on the command line - it takes several minutes and will print the results at the very end. It uses Tali to test itself, and is pretty good at showing when you have inadvertently broken something. There is also a Linux only make ptests that runs all of the test sets in parallel and will happily use all of your cores for a noticeable speedup.

I might be interested in the C-based simulator that you use - if it can be extended to provide custom "pretend hardware" (needed for letting Tali cycle count itself), it might provide a big speedup for these tests.

patricksurry commented 6 months ago

I'm happy to make a PR containing my minimal fix of just dumping the return stack prior to printing the underflow message. I'm terrible at naming things, what would you suggest in terms of message? How about this:

CPU stack:  8B 36 9C 36 3C CE 2C F7 B0 C1
Data stack underflow

I haven't run the tests yet but will try that before I submit.

I'll make a separate PR with the C simulator code. It is much faster, includes a block device hook, and dumps a memory heatmap on exit which is useful for profiling (like when I was understanding why my forth was so much slower than Tali :-)

SamCoVT commented 6 months ago

That sounds good. I'll suggest "Return Stack: " so that it's clear we're not trying to show data from the underflowed Data Stack, but "CPU stack: " is not unreasonable. The strings go into strings.asm - let me know if you need any help figuring out how to do that, but I think you'll be able to just look at the ones that are there and add your own (down in the err_xxxx numbers and es_xxxx error strings). You need to give it a name/number, add it to the error_table, and then put the string data below that.