lark-parser / Lark.js

Live port of Lark's standalone parser to Javascript
MIT License
71 stars 12 forks

Possible bug with `parse_interactive` and `unknown_param_0`? #42

Closed. thekevinscott closed this 6 months ago

thekevinscott commented 6 months ago

When calling .exhaust_lexer() I'm seeing differing behavior between Python and Javascript.

I think I've narrowed it down to this bit of code:

https://github.com/lark-parser/Lark.js/blob/master/larkjs/lark.js#L3761

Which sets LexerThread.state to:

{
  unknown_param_0: text,
  start: start,
}

Later on, next_token is called, which runs this loop:

while (line_ctr.char_pos < lex_state.text.length) {

This seems to imply that lex_state.text is expected to be a string, but it is in fact the object { unknown_param_0: text, start: start }.
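
To illustrate why this mismatch can go unnoticed, here is a minimal standalone sketch. It is not Lark.js code: `countChars` is a hypothetical stand-in for the lexer loop. In JavaScript a plain object has no `length` property, so the loop condition compares a number against `undefined`, which is always false, and the loop body is skipped without any error being thrown.

```javascript
// Hypothetical stand-in for the lexer loop; not the actual Lark.js code.
function countChars(lexState) {
  let charPos = 0;
  // If lexState.text is an object rather than a string, .length is undefined
  // and `charPos < undefined` is false, so the loop is skipped silently.
  while (charPos < lexState.text.length) {
    charPos++;
  }
  return charPos;
}

const text = "a b c";
const start = "start";

// Expected shape: text is a string.
const good = countChars({ text: text });

// Shape described in this issue: text is the wrapper object.
const bad = countChars({ text: { unknown_param_0: text, start: start } });

console.log(good); // 5
console.log(bad);  // 0
```

If this is what happens in the real lexer, the symptom would be that no tokens are produced rather than an exception.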

thekevinscott commented 6 months ago

By contrast, what appears to be the corresponding Python code expects a string.

thekevinscott commented 6 months ago

I tried modifying the relevant Javascript bit to be:

  parse_interactive(text = null, start = null) {
    return this.parser.parse_interactive(text, start);
  }

And it appears to work now. But I'm not familiar enough with the code to know whether doing so introduces other bugs.
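
The difference between the two call styles can be sketched in isolation. The classes below are hypothetical and exist only for illustration (`InnerParser`, `BrokenFrontend`, and `FixedFrontend` are not Lark.js names), and the "broken" variant is an assumption about the shape of the generated code based on the object shown earlier in this issue.

```javascript
// Hypothetical classes for illustration only; names are not from Lark.js.
class InnerParser {
  parse_interactive(text = null, start = null) {
    // Report what the inner parser actually received.
    return { text: text, start: start };
  }
}

class BrokenFrontend {
  constructor() {
    this.parser = new InnerParser();
  }
  parse_interactive(text = null, start = null) {
    // Assumed shape of the bug: both arguments bundled into a single object,
    // which then lands in the inner parser's `text` parameter.
    return this.parser.parse_interactive({ unknown_param_0: text, start: start });
  }
}

class FixedFrontend {
  constructor() {
    this.parser = new InnerParser();
  }
  parse_interactive(text = null, start = null) {
    // Forward the arguments positionally, as in the change above.
    return this.parser.parse_interactive(text, start);
  }
}

const broken = new BrokenFrontend().parse_interactive("1 + 2", "expr");
const fixed = new FixedFrontend().parse_interactive("1 + 2", "expr");

console.log(typeof broken.text, broken.start); // object null
console.log(typeof fixed.text, fixed.start);   // string expr
```

In the bundled form the inner `start` also falls back to its default, so the requested start rule would be lost along with the text.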

erezsh commented 6 months ago

It's just a bug in the automatic conversion. I think this fix should be okay, though I'll have to check to make sure.

You're welcome to open a PR for it.

thekevinscott commented 6 months ago

PR opened here: https://github.com/lark-parser/Lark.js/pull/44