scotws / TaliForth2

A Subroutine Threaded Code (STC) ANS-like Forth for the 65c02

crashes with multiple ." strings in a word #51

Closed SamCoVT closed 6 years ago

SamCoVT commented 6 years ago

Please note that I'm redefining the word "testing" in a couple of these examples on purpose. If I don't put CR into a word, I can put multiple strings into a word like so:

Type 'bye' to exit
: testing  compiled
." string1:"   compiled
." string2               :"   compiled
." string3                            :"   compiled
;  ok
  ok
: testing2  compiled
." string3:"   compiled
." string4:                           :"   compiled
." string5                             :"   compiled
;  ok
  ok
: testing  compiled
." string1:"   compiled
." string2               :"   compiled
." string3                            :"   compiled
;  ok
           ok
testing string1:string2               :string3                            : ok

If I add some "CR"s in between each string, some weird things happen:

Type 'bye' to exit
: testing  compiled
." string1:" cr   compiled
." string2               :" cr   compiled
." string3                            :" cr   compiled
;  ok
  ok
: testing2  compiled
." string3:" cr   compiled
." string4:                           :" cr   compiled
." string5                             :" cr   compiled
;  ok
  ok
: testing  compiled
." string1:" cr   compiled
." string2               :" cr   compiled
." string3                            :" cr   compiled
;  ok
                                      ok
testing string1:
string2               :
string3:
string4:                           :
string5                             :
 ok

If I fiddle with the lengths of the strings (with CRs), the behavior gets very odd and it sometimes crashes. I think this is the cause of the crashes you were seeing. Exactly how it crashes depends on the length of the string (I'm playing with the second string). Sometimes it does some weird things and then gives a Stack underflow; other times it crashes the simulator.

Type 'bye' to exit
: testing  compiled
." string1:" cr   compiled
." string2                            :" cr   compiled
." string3                            :" cr  compiled
;  ok
testing string1:
string2                            :
0156  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00 Stack underflow

After some playing around, I think I have a rather simple example, and it doesn't even have any CRs in it:

Type 'bye' to exit
: testing  compiled
." string1string1string1string1string1string1string1string1"   compiled
." string2string2string2string2string2string2string2"   compiled
;  ok
testing string1string1string1string1string1string1string1string10000  
Stack underflow

SamCoVT commented 6 years ago

I think it has to do with strings crossing page boundaries. Here is a dump from that final simple example with just the two strings:

\ in hex mode
' testing . 1797  ok
1790  74 65 73 74 69 6E 67 4C__D2_17 73 74 72 69 6E 67 
17A0  31 73 74 72 69 6E 67 31  73 74 72 69 6E 67 31 73 
17B0  74 72 69 6E 67 31 73 74  72 69 6E 67 31 73 74 72 
17C0  69 6E 67 31 73 74 72 69  6E 67 31 73 74 72 69 6E 
17D0  67 31 20 55 94 9A 17 38  00 20 EA 97 4C_11_17 73 
17E0  74 72 69 6E 67 32 73 74  72 69 6E 67 32 73 74 72 
17F0  69 6E 67 32 73 74 72 69  6E 67 32 73 74 72 69 6E 
1800  67 32 73 74 72 69 6E 67  32 73 74 72 69 6E 67 32 
1810  20 55 94 DF 17 31 00 20  EA 97 60 00 00 00 00 00

The word "testing" starts at 1797. It looks like this has a jump to 17D2, where there is a little routine to print the string. This routine lives after the string data and appears to start with 20 (a JSR) and end with 20 EA 97 (a JSR to $97EA, xt_type). The second string should be up next, but the code jumps to 1711, which is somewhere BEFORE this word. It looks to me like it should be jumping to 1810 instead, and I see that's past a page boundary. That's why the length of the strings matters.
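The address arithmetic in that reading of the dump can be checked with a short Python sketch. It is only an illustration and not Tali Forth code: the bytes are copied from the dump above, the underscores are assumed to be highlighting, and the names `word_at`, `jmp_target`, and `add16` are hypothetical helpers. The helper called at $9455 is not identified in the thread, so the sketch only assumes that the two words after it are the string address and length.

```python
# Bytes copied from the dump above, starting at $1797 (the xt of "testing").
# Assumption: the underscores in the dump are only highlighting, not data.
BASE = 0x1797
DUMP = bytes.fromhex(
    "4C D2 17 "                              # $1797: JMP $17D2
    + "73 74 72 69 6E 67 31 " * 8            # $179A: "string1" x 8
    + "20 55 94 9A 17 38 00 20 EA 97 "       # $17D2: print routine 1
    + "4C 11 17 "                            # $17DC: JMP $1711 (suspect)
    + "73 74 72 69 6E 67 32 " * 7            # $17DF: "string2" x 7
    + "20 55 94 DF 17 31 00 20 EA 97 60"     # $1810: print routine 2, RTS
)


def word_at(addr):
    """Read a little-endian 16-bit value at an absolute address."""
    off = addr - BASE
    return DUMP[off] | (DUMP[off + 1] << 8)


def jmp_target(addr):
    """Return the operand of the JMP absolute instruction at addr."""
    assert DUMP[addr - BASE] == 0x4C, "expected a JMP absolute opcode"
    return word_at(addr + 1)


def add16(addr, n, propagate_carry=True):
    """Add n (0..255) to a 16-bit address one byte at a time."""
    lo = (addr & 0xFF) + n
    hi = (addr >> 8) + ((lo >> 8) if propagate_carry else 0)
    return ((hi & 0xFF) << 8) | (lo & 0xFF)


first_jump = jmp_target(0x1797)
second_jump = jmp_target(first_jump + 10)   # print routine 1 is 10 bytes long
string2_addr = first_jump + 10 + 3          # string 2 follows the 3-byte JMP
string2_len = len("string2" * 7)
expected = add16(string2_addr, string2_len)
crosses_page = (string2_addr >> 8) != (expected >> 8)

print(f"first JMP target:     ${first_jump:04X}")
print(f"second JMP target:    ${second_jump:04X}")
print(f"string 2 at ${string2_addr:04X}, length ${string2_len:02X}")
print(f"expected JMP target:  ${expected:04X}")
print(f"crosses a page:       {crosses_page}")
```

With these bytes, string 2 starts at $17DF and is $31 bytes long, so the routine that prints it begins at $1810, one page up from where the string starts. A byte-wise add that simply dropped the carry would give $1710 rather than the $1711 seen in the dump, so the sketch confirms the page crossing but does not reproduce the actual bug.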

scotws commented 6 years ago

Nice work! After coding for the 65816 for a while (Liara Forth), I noticed that when I went back to the 65c02 I tended not to think about page boundaries any more, so that sounds like a very possible bug scenario.

I'm not going to get squat done this weekend, so any PRs you send are probably going to sit around for a few days, just so you know.

scotws commented 6 years ago

It might be worth trying to handle #20 and #21 at the same time as well.

SamCoVT commented 6 years ago

There is a whole lot to wrap one's head around here. I have a few questions:

Am I correct in that the current behavior for S" is to simply save it to the dictionary (without a header, so it can't be found by a name later) and just leave the address and count on the stack?

Does it ever get removed from the dictionary (I'm guessing no, as that would require some kind of garbage collection scheme)?

If S" is used in a loop, will it make multiple copies of the string (I'm guessing no, as the compile time behavior makes the string while the run time behavior just gets the (same) address and length on the stack every time it is run later)?

If S" is used over and over again interactively, will it make multiple copies of the string in the dictionary (I'm guessing yes on this one, as there is no way to search for an existing copy of a constant string)?

SamCoVT commented 6 years ago

FYI, your pace is fine for me. I only get to work on this in fits and starts as well, and the extra time between actions lets me noodle on things for a while before I actually work on them.

scotws commented 6 years ago

@SamCoVT Yes, the string is just saved in the Dictionary area at CP, because the whole standard was a bit too messy at this point ("Store the resulting string in a transient buffer c-addr u" -- which no other word should overwrite, but EVALUATE might not work any more ... argh, see https://forth-standard.org/standard/file/Sq which I'd already folded into the normal core word). I figured the memory leak would be no problem for most of our uses, and easier to handle than another buffer that could be overwritten.

As for the loop/interactive behavior, I think you're correct on both counts, but honestly haven't thought about those cases yet.
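The behavior discussed in these questions can be pictured with a toy Python model. It is not how Tali Forth is implemented: `ToyDictionary`, `store`, and `compile_s_quote` are hypothetical names, the starting address is arbitrary, and the model only encodes the guesses made in this thread (the string is copied to CP and never reclaimed).

```python
class ToyDictionary:
    """Toy stand-in for the Dictionary: a bump allocator with a CP pointer."""

    def __init__(self, cp=0x1000):   # arbitrary start address (assumption)
        self.cp = cp
        self.mem = {}

    def store(self, text):
        """Copy text to CP, advance CP, and return (addr, length)."""
        data = text.encode("ascii")
        addr = self.cp
        for i, byte in enumerate(data):
            self.mem[addr + i] = byte
        self.cp += len(data)
        return addr, len(data)


def compile_s_quote(dictionary, text):
    """Model of S" inside a definition: store once, return a run-time action."""
    pair = dictionary.store(text)    # happens once, at compile time
    return lambda: pair              # run time only pushes the same pair


d = ToyDictionary()

# Compiled use inside a loop: one copy, same address every time.
word = compile_s_quote(d, "hello")
loop_results = [word() for _ in range(3)]
cp_after_loop = d.cp

# Interactive use: every S" stores a fresh copy and CP keeps growing.
interactive_results = [d.store("hello") for _ in range(3)]

print("loop:       ", [hex(addr) for addr, _ in loop_results])
print("interactive:", [hex(addr) for addr, _ in interactive_results])
print("CP grew by  ", d.cp - 0x1000, "bytes")
```

In this model the looped word returns one address three times, while three interactive uses leave three more copies behind, which is the memory leak mentioned above.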

SamCoVT commented 6 years ago

I think I'd like to try this one, if you haven't started on it. It's complicated enough that I think I'll learn a lot about all of the bits and pieces involved with strings as well as the input buffers. I'm assuming that, to do this correctly, I may need to refill the input buffer mid-string.

scotws commented 6 years ago

Gladly! I'm going to be busy with the test suite for the moment, moving it to separate files etc.