SamCoVT / TaliForth2

A Subroutine Threaded Code (STC) ANSI-like Forth for the 65c02

backslash and paren handling #35

Closed patricksurry closed 5 months ago

patricksurry commented 6 months ago

When I started using EVALUATE on multi-line text strings, I was confused for a while that \ appears to eat everything afterward, including newlines (I'm using LF on Mac in my source). I also thought that ( parenthesized ) comments would work across multiple lines when pasting source code, but that doesn't work either. I think the standard agrees that multi-line parens don't work interactively but should work in blocks (see https://forth-standard.org/standard/file/p).

This would be good to document either way.

A workaround for the backslash issue is to pre-process your source, e.g. with forth_code/forth_to_orphisbin.py, if you want to treat it as one long string (with or without newlines).

I'd recommend opening a separate issue for \ in strings with line endings, as that may require some additional digging into details like what a line ending even is (LF vs CR vs CR+LF) and how to handle the fact that they might be different on different systems (I have already run into this on https://github.com/SamCoVT/TaliForth2/issues/27). I expect that parenthesis comments will work if you are EVALUATING a string that contains line ending characters, but it's true that they are not multiline in general, and that's because REFILL only reads one line at a time from the keyboard. The Forth standard doesn't require them to be multiline, but it also doesn't prohibit them from being multiline. I, personally, would like \ to work in blocks, so I'm likely to be editing that word anyway.
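To illustrate why a paren comment is expected to survive line endings inside an EVALUATEd string, here is a minimal Python model, not Tali's actual code. It assumes ( simply parses ahead to the next ) in the input buffer and gives CR and LF no special treatment; the function name `skip_paren_comment` and the variable `to_in` (standing in for >IN) are made up for this sketch.

```python
def skip_paren_comment(source: str, to_in: int) -> int:
    # Hypothetical model of "(": return the new >IN after parsing up to ")".
    # Line endings are ordinary characters here, so the comment may span them.
    close = source.find(")", to_in)
    if close == -1:
        return len(source)   # no ")" in the buffer: the rest is consumed
    return close + 1         # resume parsing just after the ")"


text = "1 ( a comment\nthat spans lines ) 2"
start = text.index("(") + 1              # >IN just after the "(" word
rest = text[skip_paren_comment(text, start):]
print(rest.split())
```

At the keyboard the buffer only ever holds one line, so the same search runs out of input before it finds the closing paren.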

SamCoVT commented 6 months ago

I think it's worth fixing \ . The current behavior is "skip to end of input buffer". The desired behavior is:

if BLK is nonzero, 
    skip >IN ahead to the next multiple of 64 (>IN is the index into the input buffer)
else if BLK is zero, 
    locate the next line ending (looking up to the end of the input (held in ciblen))
    If line ending found, 
        move >IN to just after the line ending (being careful not to go beyond the end of the buffer)

It's a hassle, but I suppose we can look for both CR and LF as line endings, and then skip to the first non-CR and non-LF character (being careful not to go off the end of the input buffer).

SamCoVT commented 6 months ago

( and ) should already work in blocks, as blocks are space filled and have no line endings at all, and they are also EVALUATED using the entire block buffer as the input buffer. Tali supports the BLOCK word set, but has the CORE version of \ - I think it's worth implementing the BLOCK version of \ (it's in the BLOCK EXT (extended) word set, so it's technically optional). https://forth-standard.org/standard/block/bs

You can also see where Tali's current behavior comes from, as it's described here in the CORE version of \ https://forth-standard.org/standard/core/bs

The FILE version of ( is the only one that REFILLs its input buffer - CORE versions of \ and ( are not allowed to modify their input buffer - they can only move >IN along to "use up" characters. This is because code can move >IN backwards in the input buffer (so that it will be run again), so the content that used to be there still needs to be there. See http://forum.6502.org/viewtopic.php?f=9&t=6908 for examples of this in use. Also, Tali doesn't implement the FILE word set at this time.

SamCoVT commented 6 months ago

I did some digging, and it appears Tali's current behavior (skip all the way to the end of an EVALUATED string that contains newlines) is correct, based on the descriptions of EVALUATE and \. There is a discussion on forth-standard.org about this very topic at the bottom of the page for CORE EVALUATE.
https://forth-standard.org/standard/core/EVALUATE

I also verified that gforth has this same behavior:

Gforth 0.7.3, Copyright (C) 1995-2008 Free Software Foundation, Inc.
Gforth comes with ABSOLUTELY NO WARRANTY; for details type `license'
Type `bye' to exit
s\" 5 . \\ A comment \n 12 ."  ok
2dup evaluate 5  ok

The 5 shows up, but the 12 does not, so it is skipping all the way to the end when it hits the \.
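The same result can be reproduced with a toy Python model of the CORE behavior. This is a sketch only: the `evaluate` function below is hypothetical, understands just numbers, `.` and `\`, and treats any whitespace as a word delimiter. The one thing it is meant to show is that \ moves >IN (here `to_in`) to the end of the whole evaluated string rather than to the next newline.

```python
def evaluate(source: str) -> str:
    # Toy interpreter: supports integers, "." and the CORE version of "\".
    out = []
    stack = []
    to_in = 0
    while to_in < len(source):
        # Skip leading whitespace, then collect one word.
        while to_in < len(source) and source[to_in].isspace():
            to_in += 1
        start = to_in
        while to_in < len(source) and not source[to_in].isspace():
            to_in += 1
        word = source[start:to_in]
        if not word:
            break
        if word == "\\":
            to_in = len(source)          # CORE "\": discard the rest of the buffer
        elif word == ".":
            out.append(f"{stack.pop()} ")
        else:
            stack.append(int(word))
    return "".join(out)


print(repr(evaluate("5 . \\ A comment \n 12 .")))
```

Without the comment, `evaluate("5 . 12 .")` produces both numbers, so the missing 12 comes from the \ word and not from the newline.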

I still want to upgrade \ to the BLOCK version so that it works as expected in blocks. I agree that the other behavior should be documented (where?), but I'm now inclined to leave the behavior in evaluated strings as it is.

Handling newlines in EVALUATE was discussed in 2022 by the Forth standard committee, and may appear in a future version of the Forth standard, so I could probably still be convinced to handle newlines in strings if you feel strongly about it.

SamCoVT commented 6 months ago

I've just upgraded \ from the CORE version to the BLOCK version so that it works in blocks. If you still feel strongly about handling newlines in strings, let me know, but I'm happy with the current behavior, which is Forth-2012 compliant.

bjchapm commented 5 months ago

SamCoVT, thanks very much for the \ upgrade in blocks! Much appreciated.

patricksurry commented 5 months ago

Sounds good to me!

SamCoVT commented 5 months ago

I think that's as far as I'm going to go on that one. \ should now work in blocks, and all other behavior is Forth-2012 standard behavior, as verified in gforth.