stadelmanma / tree-sitter-fortran

Fortran grammar for tree-sitter
MIT License

Fix most issues with line continuations #81

Closed ZedThree closed 1 year ago

ZedThree commented 1 year ago

End-of-statement handling now lives in the external scanner, which allows a bit more logic than the DSL rules do.

The external scanner's logic now goes like this:

  1. skip all leading whitespace
  2. work out if we should end the current statement
  3. work out if we should end the current line continuation
  4. handle number literals
  5. possibly start a new line continuation

A semicolon or end-of-file always ends a statement. A newline also ends a statement, unless a line continuation has started.

Because comments always go to the end of the line, and may appear inside line continuations, it turns out to be easier to treat comments as also ending statements -- unless they appear inside a line continuation.
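The statement-ending rules above can be written as a small decision function. This is only a Python sketch of the rules as described here, not the scanner code itself; the name `ends_statement` and the string labels for the token kinds are made up for illustration.

```python
def ends_statement(token: str, in_continuation: bool) -> bool:
    """Decide whether `token` ends the current statement.

    `token` is one of the hypothetical labels ';', 'eof', 'newline'
    or 'comment'. `in_continuation` is True once a line continuation
    has started and not yet ended.
    """
    # Semicolons and end-of-file always end the statement.
    if token in (";", "eof"):
        return True
    # Newlines and comments end the statement only when we are
    # not in the middle of a line continuation.
    if token in ("newline", "comment"):
        return not in_continuation
    # Anything else is ordinary statement content.
    return False
```

Treating `comment` the same way as `newline` is what lets a comment sit between the two halves of a continued statement without splitting it.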

Line continuations now essentially always come in pairs, marking the start and end of the continuation, possibly with comments in between.
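To illustrate the pairing idea, here is a rough Python model that joins continued free-form lines into single statements. It is a sketch under simplifying assumptions, not what the scanner does internally: `strip_comment` and `join_continuations` are hypothetical helpers, every `!` is treated as a comment start (string literals are ignored entirely), and semicolons are not handled.

```python
def strip_comment(line: str) -> str:
    # Simplification: every '!' starts a comment; strings are ignored.
    return line.split("!", 1)[0].rstrip()


def join_continuations(lines):
    """Join free-form continuation lines into whole statements."""
    statements = []
    pending = None  # text collected since a continuation started
    for raw in lines:
        code = strip_comment(raw)
        if pending is not None:
            if not code.strip():
                # Blank or comment-only line inside a continuation.
                continue
            body = code.lstrip()
            if body.startswith("&"):
                # Optional marker that closes the continuation.
                body = body[1:]
            code = pending + body
            pending = None
        if code.endswith("&"):
            # Trailing '&' opens (or re-opens) a continuation.
            pending = code[:-1]
            continue
        if code.strip():
            statements.append(code.strip())
    if pending is not None and pending.strip():
        # Continuation still open at end of input.
        statements.append(pending.strip())
    return statements
```

For example, `join_continuations(["x = 1 + &", "  ! a comment in between", "  &2", "y = 3"])` yields `["x = 1 + 2", "y = 3"]`: the trailing `&` opens the continuation, the comment line is skipped, and the leading `&` closes it.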

Deals with most of #80

ZedThree commented 1 year ago

This does have a regression: string literals containing an & with no whitespace before it, e.g. "1234567&89" or even '&'. But "1234567 &89" and ' &' are fine. I have no idea why this would happen!

stadelmanma commented 1 year ago

That is interesting. Given the big boost in successfully parsed files, the regression might be worth it in the near term, and we can just document it as a known issue to be fixed.

ZedThree commented 1 year ago

Ok, I fixed it by moving the string literal parsing to the external scanner. I'll tidy it and push.
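For anyone following along, the general idea of consuming a string literal as one unit, so that an `&` inside it is never considered as a continuation marker, looks roughly like this in Python. This is an illustrative sketch only, not the code in this PR: `scan_string_literal` and `trailing_continuation` are hypothetical names, and strings that themselves continue across lines are out of scope.

```python
def scan_string_literal(text: str, start: int):
    """Return the index just past the literal starting at `start`,
    or None if there is no terminated literal there."""
    quote = text[start]
    if quote not in ("'", '"'):
        return None
    i = start + 1
    while i < len(text):
        if text[i] == quote:
            # A doubled delimiter is an escaped quote in Fortran.
            if i + 1 < len(text) and text[i + 1] == quote:
                i += 2
                continue
            return i + 1
        i += 1
    return None  # unterminated on this line (not handled here)


def trailing_continuation(line: str) -> bool:
    """True if the last significant character outside strings and
    comments is '&'."""
    i = 0
    last = None
    while i < len(line):
        ch = line[i]
        if ch in ("'", '"'):
            end = scan_string_literal(line, i)
            if end is None:
                return False  # assumption: give up on unterminated strings
            last = "string"
            i = end
            continue
        if ch == "!":
            break  # rest of the line is a comment
        if not ch.isspace():
            last = ch
        i += 1
    return last == "&"
```

With this model, `x = "1234567&89"` and `c = '&'` are not seen as continuations, while `x = "1234567 &89" &` is.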