lark-parser / Lark.js

Live port of Lark's standalone parser to Javascript
MIT License

SyntaxError: Invalid regular expression returned when parsing #21

Open jillyj opened 2 years ago

jillyj commented 2 years ago

Grammar file: https://github.com/opencybersecurityalliance/kestrel-lang/blob/develop/src/kestrel/syntax/kestrel.lark
Generated parser: kestrelParser.js.zip

When parsing the statement procs2 = GET process abc, the parser throws the exception shown in the screenshot below, and the parser itself does not catch it. (image)

Code

// Error callback passed to parse() as its third argument.
function handle_errors(e) { return true; }

try {
  treeData = parser.parse(text, null, handle_errors).children[0];
} catch (e) {
  console.debug("uncaught error:", e);
}

Expected: The parser should handle this kind of error itself, so that we still get the parse tree along with error info such as Unexpected character or Unexpected Token.
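
Until the parser handles this case, one option is to separate errors raised by the JavaScript engine from errors the parser reports. The sketch below is only an illustration. `safeParse` and `stubParser` are hypothetical names that are not part of Lark.js. The only thing taken from the snippet above is the `parse(text, null, handler)` call shape. It also assumes the parser's own error classes do not extend `SyntaxError`.

```javascript
// Hypothetical helper (not part of Lark.js): wraps parse() and labels the failure.
function safeParse(parser, text, onError) {
  try {
    return { tree: parser.parse(text, null, onError), error: null, kind: null };
  } catch (e) {
    // Assumption: a SyntaxError here comes from the JavaScript engine
    // (for example, `new RegExp` rejecting a pattern), not from the parser.
    const kind = e instanceof SyntaxError ? "engine" : "parser";
    return { tree: null, error: e, kind };
  }
}

// Stand-in parser for illustration only; it fails by compiling a pattern
// that JavaScript rejects, without ever calling the error callback.
const stubParser = {
  parse(text, start, onError) {
    new RegExp("(?i)min"); // throws SyntaxError in JavaScript
    return { children: [] };
  },
};

const result = safeParse(stubParser, "procs2 = GET process abc", () => true);
console.log(result.kind, result.error.name);
```

With a wrapper like this, the caller can at least report an engine-level regex failure differently from an ordinary parse error.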

jillyj commented 2 years ago

@erezsh could you please take a look at this issue?

erezsh commented 2 years ago

Sorry, I had a few busy weeks.

I'll give it a look.

erezsh commented 2 years ago

This happens because JavaScript's regex implementation doesn't support all the features that Python's does.

This might take a bit longer to fix.
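
As an illustration of the kind of mismatch involved (not necessarily the exact construct this grammar produces), Python's `re` module accepts a leading inline flag such as `(?i)`, while JavaScript's `RegExp` constructor rejects it and expects flags as a separate argument. The `tryCompile` helper below is a hypothetical name used only for this sketch.

```javascript
// Try to compile a pattern and report whether the JavaScript engine accepts it.
function tryCompile(pattern, flags = "") {
  try {
    return { ok: true, regex: new RegExp(pattern, flags), error: null };
  } catch (e) {
    return { ok: false, regex: null, error: e };
  }
}

// Python-style pattern: the case-insensitive flag is written inside the pattern.
const pythonStyle = tryCompile("(?i)min|max");

// JavaScript-style equivalent: the flag is passed separately.
const jsStyle = tryCompile("min|max", "i");

console.log(pythonStyle.ok, pythonStyle.error && pythonStyle.error.name);
console.log(jsStyle.ok, jsStyle.ok && jsStyle.regex.test("MAX"));
```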

Meanwhile, a possible workaround is to change

FUNCNAME: (MIN|MAX|SUM|AVG|COUNT|NUNIQUE)

to

funcname: (MIN|MAX|SUM|AVG|COUNT|NUNIQUE)

jillyj commented 2 years ago

Thank you for the response. Looking forward to your fix. :)

jillyj commented 2 years ago

> This happens because Javascript's regex implementation doesn't support all the features that Python has.
>
> This might take a bit longer to fix.
>
> Meanwhile, a possible workaround is to change
>
> FUNCNAME: (MIN|MAX|SUM|AVG|COUNT|NUNIQUE)
>
> to
>
> funcname: (MIN|MAX|SUM|AVG|COUNT|NUNIQUE)

Do you mean I should edit the generated parser.js? I could not find this string in it:

FUNCNAME: (MIN|MAX|SUM|AVG|COUNT|NUNIQUE)

erezsh commented 2 years ago

No... edit the grammar!

jillyj commented 2 years ago

Got it. Thanks! Let me try.

jillyj commented 2 years ago

Yeah, updating the grammar from FUNCNAME to funcname works. Looking forward to your fix. :)

erezsh commented 2 years ago

The "fix" is most likely going to be preventing users from doing what you were trying to do and throwing an error instead.

I don't know if there is a way to make it work in JavaScript, at least not without implementing part of the regex mechanism myself.

jillyj commented 2 years ago

I got it. Thanks. Would you mind releasing a version that contains the other fixes first?

erezsh commented 2 years ago

Released https://github.com/lark-parser/Lark.js/releases/tag/0.1.3