siraben / tree-sitter-promela

Promela grammar for tree-sitter
MIT License

Incorrect parsing of if blocks without a semicolon #3

Closed (siraben closed 2 years ago)

siraben commented 2 years ago

The following parses correctly:

init {
    if 
    :: y = 3;
    fi;
    x = 3;
}

but this does not:

init {
    if 
    :: y = 3;
    fi
    x = 3;
}

It appears that the if node fails to parse, and the last statement is then parsed as a variable declaration instead of an assignment:

(program [0, 0] - [6, 0]
  (ERROR [0, 0] - [5, 1]
    (step [2, 7] - [2, 12]
      (Stmnt [2, 7] - [2, 12]
        (assignment [2, 7] - [2, 12]
          (varref [2, 7] - [2, 8]
            (cmpnd [2, 7] - [2, 8]
              (pfld [2, 7] - [2, 8])))
          (full_expr [2, 11] - [2, 12]
            (expr [2, 11] - [2, 12]
              (const [2, 11] - [2, 12]
                (number [2, 11] - [2, 12])))))))
    (step [3, 4] - [4, 9]
      (one_decl [3, 4] - [4, 9]
        (var_list [4, 4] - [4, 9]
          (ivar [4, 4] - [4, 9]
            (vardcl [4, 4] - [4, 5]
              (uname [4, 4] - [4, 5]))
            (expr [4, 8] - [4, 9]
              (const [4, 8] - [4, 9]
                (number [4, 8] - [4, 9])))))))))

siraben commented 2 years ago

This could be because the one_decl rule has too high a precedence. The partial parse tree reads the last statement as something like <type> x = 3;

siraben commented 2 years ago

@nimble-code Are semicolons optional at the end of if blocks? From the grammar, it seems that two or more statements must be separated by semicolons (the one after the last statement being optional).

nimble-code commented 2 years ago

interesting case -- it does look like the parser expects a semi-colon after the fi, but it shouldn't

siraben commented 2 years ago

As far as I can tell from the Spin examples, there isn't an instance where the semicolon after an if block is omitted, so I think this code is invalid:

init {
    if 
    :: y = 3;
    fi
    x = 3;
}

nimble-code commented 2 years ago

I'll see if I can fix it, to make semi-colons optional there, but for the time being this will have to be the work-around

siraben commented 2 years ago

@nimble-code OK, thanks. Please note that this repository is a re-implementation of the Promela grammar in tree-sitter, and I'm taking the YACC grammar as the ground truth to replicate. Is there a way to run only the parser in Spin?

FWIW it seems fine if the semicolon after the if block is intentional, as written in the sequence rule

nimble-code commented 2 years ago

I just tried this example, but it parses correctly on my system (Spin version 6.5.1 from 3 June 2021):

int x, y
init {
    if
    :: y = 3;
    fi
    x = 3;
}

same result if the remaining two semi-colons are also deleted

nimble-code commented 2 years ago

by the way, in the spin parser, "missing" semi-colons are reinserted into the token stream in the lexer, so that the parser (following the yacc grammar) doesn't need to worry about it

siraben commented 2 years ago

@nimble-code I see, under what conditions is the semicolon inserted?

nimble-code commented 2 years ago

at the end of a line, if the preceding statement didn't have a semi-colon (tricky in conditions that are split across multiple lines...)
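
For readers re-implementing the grammar, the lexer-level insertion described above can be sketched roughly as follows. This is a minimal illustration in Python, not Spin's actual lexer: the token pattern, the helper names (`can_end_statement`, `insert_semicolons`), and the sets of tokens treated as statement-ending are all assumptions made for the example.

```python
import re

# Hypothetical token pattern for this sketch: "::", "->", words, numbers,
# or any other single non-space symbol.
TOKEN = re.compile(r"::|->|[A-Za-z_]\w*|\d+|\S")

# Assumption: keywords that open a construct, so a line ending in one
# is not a finished statement.
OPENING_KEYWORDS = {"if", "do"}
# Assumption: closing symbols after which a statement may end.
CLOSING_SYMBOLS = {")", "]", "}"}


def can_end_statement(token):
    """Return True if a statement could plausibly end right after this token."""
    if token in CLOSING_SYMBOLS:
        return True
    # Words and numbers (including "fi" and "od") can end a statement;
    # operators, "{", "::" and an explicit ";" cannot.
    return re.fullmatch(r"\w+", token) is not None and token not in OPENING_KEYWORDS


def insert_semicolons(source):
    """Tokenize line by line, appending ';' where a line ends a statement without one."""
    tokens = []
    for line in source.splitlines():
        line_tokens = TOKEN.findall(line)
        tokens.extend(line_tokens)
        if line_tokens and can_end_statement(line_tokens[-1]):
            tokens.append(";")
    return tokens
```

Under these assumptions, the two versions of the init block from the top of this issue (with and without the semicolon after fi) produce the same token stream, so a parser that requires separators would accept both. A line that ends in an operator gets no semicolon, which is one simple way to cope with conditions split across lines; whether Spin handles that case the same way is not established here.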