kach / nearley

📜🔜🌲 Simple, fast, powerful parser toolkit for JavaScript.
https://nearley.js.org
MIT License

No parsing error on non-terminated expression #598

Closed imnotteixeira closed 2 years ago

imnotteixeira commented 2 years ago

Hi all, I'm seeking some guidance on something that should be simple, as it is common in many grammars, or so I think...

I'm defining a grammar that lets us write functions in prefix notation, wrapped in parentheses, which can take arguments that can themselves be other functions, and so on.

I started with something simple, but I soon realized that if I did not close all the parentheses at the end of the string (so that there are more opening parentheses than closing ones), the parser returns [] as a result, instead of an error complaining about the missing parentheses, which are "mandatory" in this grammar.

As I was trying to debug the issue, I found this example in the repository about checking for mismatched parentheses (a grammar that ensures that the number of opened and closed parentheses is the same). Great, just what I was looking for... Except it has the same problem as my initial grammar... :/

Below I'll put the parentheses grammar, as well as the railroad diagram and the input strings I used.

Can someone tell me if this is by design, or is it a bug in the grammar that can be fixed? Thanks!

Grammar:

@{% function TRUE (d) { return true; } %}

P ->
      "(" E ")" {% TRUE %}
    | "{" E "}" {% TRUE %}
    | "[" E "]" {% TRUE %}
    | "<" E ">" {% TRUE %}

E ->
      null
    | "(" E ")" E
    | "{" E "}" E
    | "[" E "]" E
    | "<" E ">" E

My grammar (still in early development) is the following (just for reference): https://pastebin.com/a3Y64k8K

SpyGuy0215 commented 2 years ago

I have the same issue. It's kinda annoying that I can make a PEMDAS parser, feed it 1+, and it works... like what???? That makes no sense! I think the best thing to do here is to add some sort of check after parsing to see if the expression is complete; if not, then throw some sort of error. However, I still believe that this should be part of nearley in the first place.

imnotteixeira commented 2 years ago

Hi @SpyGuy0215, after some digging, I think this is by design. Look at https://nearley.js.org/docs/parser#catching-errors, specifically this:

After feed() finishes, the results array will contain all possible parsings.

If there are no possible parsings given the current input, but in the future there might be results if you feed it more strings, then nearley will temporarily set the results array to the empty array, [].

Maybe I need a different, non-streaming parser, or maybe I need to handle these edge cases myself, but I don't think it is an error in nearley itself.
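
For anyone who ends up handling this themselves, here is a minimal sketch of the check, based only on the documented behaviour quoted above: once all input has been fed, an empty results array means the input stopped before any complete parse was reached. The helper name parseComplete and its error message are hypothetical and not part of nearley; the only things assumed from nearley are the parser's feed() method and results array.

```javascript
// Hypothetical helper, NOT part of nearley's API.
// `parser` is assumed to expose nearley's documented `feed(chunk)`
// method and `results` array.
function parseComplete(parser, input) {
  // Per the docs, feed() itself throws when a token cannot continue
  // any parse; that error is simply allowed to propagate.
  parser.feed(input);

  // No more input is coming, so an empty results array means the
  // input ended too early (for example an unclosed parenthesis).
  if (parser.results.length === 0) {
    throw new Error("Unexpected end of input: expression is incomplete");
  }

  // May contain more than one entry if the grammar is ambiguous.
  return parser.results;
}

// Intended usage with a real parser (assumes a grammar compiled with
// nearleyc to ./grammar.js, which is a made-up path):
//
//   const nearley = require("nearley");
//   const grammar = require("./grammar.js");
//   const parser = new nearley.Parser(nearley.Grammar.fromCompiled(grammar));
//   parseComplete(parser, "(()"); // should throw: results stays []
```

This treats each call as one complete input, so it gives up the streaming behaviour. Use a fresh parser for each string rather than reusing one that has already been fed.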