kach / nearley

📜🔜🌲 Simple, fast, powerful parser toolkit for JavaScript.
https://nearley.js.org
MIT License

Difference in parsing when using Moo.js lexer compared to native Nearley #563

Open obscuredvision opened 3 years ago

obscuredvision commented 3 years ago

Nearley version: 2.20.1
Moo.js version: 0.5.1

I am noticing a difference in parsing when using a lexer (Moo.js). With native Nearley, the parser fails on input that does not match, whereas with the lexer it simply returns an empty result set.

Here is the native grammar:

input -> minute {% id %}
minute
  -> "0" digit {% d => d.join("") %}
  | [12345] digit {% d => d.join("") %}
digit -> [0-9] {% id %}

Here are the tests:

"0" => Fail
"1" => Fail
"01" => "01"
"59" => "59"
"60" => Fail

Here is the grammar using the Moo.js lexer:

@{%
  const moo = require("moo");
  const lexer = moo.compile({
    digit: /[0-9]/
  })
%}
@lexer lexer
input -> minute {% id %}
minute
  -> "0" digit {% d => d.join("") %}
  | [12345] digit {% d => d.join("") %}
digit -> %digit {% id %}

And here are the tests:

"0" => []
"1" => []
"01" => "01"
"59" => "59"
"60" => Fail

Is this intended? I expected results similar to the native Nearley version. If someone could shed some light on this, I would much appreciate it. Thank you.

obscuredvision commented 3 years ago

After reading the documentation more thoroughly, I think part of the reason may be this: https://nearley.js.org/docs/parser#catching-errors

However, I am not sure why the behavior differs between using a lexer and not using one.
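
To compare the two grammars on equal footing, it may help to separate "feed threw an error" from "feed succeeded but results is empty", since the linked docs section describes these as different outcomes. Below is a minimal sketch of such a check. `parseOutcome` and `makeStubParser` are hypothetical names, not part of nearley or Moo.js. The stub only imitates the minute grammar so the snippet runs without dependencies, and it is an assumption about how the grammar should behave, not a reproduction of either parser.

```javascript
// Hypothetical helper (not part of nearley): classify one parse attempt.
// Assumes makeParser() returns an object exposing nearley's documented
// feed(string) method and results array.
function parseOutcome(makeParser, input) {
  const parser = makeParser();
  try {
    parser.feed(input);
  } catch (err) {
    // feed() throws when the input cannot be continued.
    return { status: "error", message: err.message };
  }
  if (parser.results.length === 0) {
    // No error, but no complete parse either (input may be incomplete).
    return { status: "incomplete", results: [] };
  }
  return { status: "ok", results: parser.results };
}

// Stand-in for a real parser so the sketch is runnable on its own.
// It accepts exactly two characters: [0-5] followed by [0-9].
function makeStubParser() {
  return {
    results: [],
    buffer: "",
    feed(chunk) {
      for (const ch of chunk) {
        const pos = this.buffer.length;
        const ok =
          pos === 0 ? /[0-5]/.test(ch) : pos === 1 ? /[0-9]/.test(ch) : false;
        if (!ok) throw new Error(`Unexpected "${ch}" at offset ${pos}`);
        this.buffer += ch;
      }
      this.results = this.buffer.length === 2 ? [this.buffer] : [];
    },
  };
}

for (const input of ["0", "1", "01", "59", "60"]) {
  console.log(JSON.stringify(input), parseOutcome(makeStubParser, input).status);
}
```

With real grammars, the factory would presumably be something like `() => new nearley.Parser(nearley.Grammar.fromCompiled(grammar))`, where `grammar` is the compiled module for each version. Running both versions through the same helper should show whether the difference comes from the parsers themselves or from how each test reports an empty result.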