stadelmanma / tree-sitter-fortran

Fortran grammar for tree-sitter
MIT License

Support line continuations inside string literals #77

Closed ZedThree closed 1 year ago

ZedThree commented 1 year ago

Here's another edge-case for line continuations from my Fortran teaching course:

    print*, "You picked a number between twenty and fifty,&
            & excluding forty-two"

An & can appear inside a string literal to continue it onto the next line, in which case I'm reasonably certain the next line must begin with a matching & as its first non-blank character (not necessarily in the first column).

I've read that tree-sitter doesn't aim for "type-II correctness", which I interpret to mean: "tree-sitter should be able to parse all valid programs, but not necessarily reject all invalid programs". So that might give us some latitude to not worry so much about a missing & on the second line.

Originally posted by @ZedThree in https://github.com/stadelmanma/tree-sitter-fortran/issues/73#issuecomment-1427798501

ZedThree commented 1 year ago

Is this maybe as simple as:

@@ -1308,13 +1308,13 @@ module.exports = grammar({

     _double_quoted_string: $ => token(seq(
       '"',
-      repeat(choice(/[^"\n]/, /""./)),
+      repeat(choice(/[^"\n]/, /""./, /& *\n *&/)),
       '"')
     ),

     _single_quoted_string: $ => token(seq(
       "'",
-      repeat(choice(/[^'\n]/, /''./)),
+      repeat(choice(/[^'\n]/, /''./, /& *\n *&/)),
       "'")
     ),

This leaves the line continuations in the string literal's node.text, but maybe that's correct for parsing at this stage, and they'd have to be removed later?
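For a quick sanity check outside tree-sitter, the proposed double-quoted pattern can be approximated as a plain JavaScript regular expression. This is only a rough sketch: the function name matchesProposedToken is made up for illustration, and it assumes that token(seq(...)) behaves like an anchored regex for these inputs, which is not how the generated lexer actually works internally.

```javascript
// Rough stand-in for the proposed _double_quoted_string token.
// Assumption: the three alternatives from the diff ([^"\n], "". and the
// continuation "& *\n *&") can be modelled as one anchored JS regex.
const proposedDoubleQuoted = /^"(?:[^"\n]|"".|& *\n *&)*"$/;

// Hypothetical helper, not part of tree-sitter or tree-sitter-fortran.
function matchesProposedToken(text) {
  return proposedDoubleQuoted.test(text);
}

// The example from the issue: '&' ends the first line and starts the next.
const continued =
  '"You picked a number between twenty and fifty,&\n' +
  '            & excluding forty-two"';

// Same string without the leading '&' on the continuation line.
const missingAmpersand =
  '"You picked a number between twenty and fifty,&\n' +
  '              excluding forty-two"';

console.log(matchesProposedToken(continued));        // true
console.log(matchesProposedToken(missingAmpersand)); // false
```

With this pattern, a string whose second line lacks the leading & is not matched as a single literal, which seems fine given the latitude discussed above.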

stadelmanma commented 1 year ago

I think it would be acceptable for parsing at this stage, since for cases like syntax highlighting I’m fairly certain it’d still work correctly. You would probably run into issues if you used it “as is” for something like automatic source code reformatting, but that’s a very advanced use case where you’re going to have to inspect the text content anyway.
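As a rough illustration of that kind of post-processing, a downstream tool could recover the logical string value from the raw node text along these lines. This is a minimal sketch, not something from the grammar: fortranStringValue is a hypothetical helper, it assumes the node text uses \n line endings, and it assumes continuations have exactly the "& newline &" shape with only spaces around the newline.

```javascript
// Hypothetical helper (not part of tree-sitter-fortran): given the raw text
// of a string literal node, return the string value with line continuations
// removed and doubled quotes collapsed.
function fortranStringValue(nodeText) {
  const quote = nodeText[0];
  const isQuoted =
    nodeText.length >= 2 &&
    (quote === '"' || quote === "'") &&
    nodeText[nodeText.length - 1] === quote;
  if (!isQuoted) {
    throw new Error("not a quoted string literal");
  }

  // Strip the surrounding quote characters.
  const body = nodeText.slice(1, -1);

  // Remove each continuation: trailing '&', optional spaces, a newline,
  // optional spaces, then the leading '&' on the next line.
  // Assumption: only spaces (no tabs, no '\r') surround the newline.
  const joined = body.replace(/& *\n *&/g, "");

  // A doubled quote character inside the literal stands for one quote.
  return joined.split(quote + quote).join(quote);
}

const raw =
  '"You picked a number between twenty and fifty,&\n' +
  '            & excluding forty-two"';

console.log(fortranStringValue(raw));
// You picked a number between twenty and fifty, excluding forty-two
```

Keeping the continuation in node.text and stripping it in a helper like this means the parse tree still maps byte-for-byte onto the source, which is what highlighting needs.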

ZedThree commented 1 year ago

Fixed in #76