kach / nearley

📜🔜🌲 Simple, fast, powerful parser toolkit for JavaScript.
https://nearley.js.org
MIT License

Wrong input lines on error #634

Open agladysh opened 1 year ago

agladysh commented 1 year ago

long-text.txt

1
2
3
4

Error message:

Error: Syntax error at line 1 col 1:

1  3
   ^
2  4
3  
Unexpected numbers token: "1". Instead, I was expecting to see one of the following:

Note that the error message mentions the token value "1", but the caret points at the line containing 3. The line numbers are also off.

How to reproduce:

long-text.ne

@preprocessor typescript

@{%
import moo from 'moo';

const lexer = moo.compile({
  newlines: { match: /\n+/, lineBreaks: true },
  numbers: /[0-9 ]+/
});

%}

@lexer lexer

grammar -> chunk:+

chunk -> %newlines | %spaces # no %numbers

long-text.ts

import nearley from 'nearley';

import compiled from './long-text-compiled';

async function main() {
  const grammar = nearley.Grammar.fromCompiled(compiled);
  const parser = new nearley.Parser(grammar);

  for await (const chunk of process.stdin) {
    parser.feed(chunk.toString('utf8'));
  }

  parser.finish();
}

main()
  .catch(async (e) => {
    console.error(e);
    process.exit(1);
  });

Command:

npx nearleyc <long-text.ne >long-text-compiled.ts && npx tsx long-text.ts < long-text.txt

Version:

% npx nearleyc --version
2.20.1

jeremysf commented 1 year ago

I'm seeing something similar using nearley and moo.

In my case, the line number and offset in the error are correct, but the displayed source context in the error is from a completely different part of the input.

In the example below, the error really is at line 66 col 18, but the displayed "}" is from the end of the input, which is actually line 151.

Syntax error at line 66 col 18:

64      vec2 position = 2;
65      optional string pinReference = 3;    
66  }
                     ^
Unexpected symbol token: "extends". Instead, I was expecting to see one of the following:

A lbrace token based on:
    element → %element %ws %symbol element$ebnf$1 ● %lbrace element$ebnf$2 element$ebnf$3 %rbrace
    declaration →  ● element
    unit$ebnf$1 → unit$ebnf$1 ● declaration
    unit →  ● unit$ebnf$1

In my case, I'm feeding the parser like this:

const parser = new nearley.Parser(nearley.Grammar.fromCompiled(grammar))
parser.feed(sourceText)
const results = parser.finish()
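
For anyone who only needs a readable context snippet in the meantime, here is a rough workaround sketch, not a fix. It rebuilds the context from the original source text using the reported line and column, which several reports here say are correct. `formatContext` and `contextFromError` are hypothetical helpers, not part of nearley or moo. The sketch assumes the thrown error exposes the offending token as `err.token` with moo's 1-based `line` and `col` fields; if that assumption does not hold in your setup, `contextFromError` returns `null`.

```typescript
// Hypothetical helper; not part of nearley or moo.
// Renders up to `before` preceding lines plus the error line, then a caret
// under the 1-based column.
function formatContext(source: string, line: number, col: number, before = 2): string {
  const lines = source.split("\n");
  const first = Math.max(1, line - before);
  const last = Math.min(line, lines.length);
  const width = String(last).length; // gutter width for right-aligned line numbers
  const out: string[] = [];
  for (let n = first; n <= last; n++) {
    const label = String(n);
    out.push(" ".repeat(width - label.length) + label + "  " + lines[n - 1]);
  }
  // The caret is shifted by the gutter (width + 2 spaces) plus the column offset.
  out.push(" ".repeat(width + 2 + Math.max(0, col - 1)) + "^");
  return out.join("\n");
}

// Assumption: the thrown error carries the offending token with numeric
// `line` and `col` fields. Returns null when it does not.
function contextFromError(source: string, err: unknown): string | null {
  const token = (err as { token?: { line?: unknown; col?: unknown } } | null)?.token;
  if (!token || typeof token.line !== "number" || typeof token.col !== "number") {
    return null;
  }
  return formatContext(source, token.line, token.col);
}
```

In a `catch` around `parser.feed(sourceText)`, this could be called as `contextFromError(sourceText, e)` and printed alongside the rest of the message. It needs the whole input as a single string, so it does not directly fit the chunked stdin reproduction at the top of this issue.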

TheGrandmother commented 8 months ago

I am experiencing this as well.

timneedham commented 7 months ago

Hi,

Yeah, it's the same for me as well. Line numbers are spot on but, as said elsewhere, the displayed source context in the error is from a completely different part of the input.

snosenzo commented 3 weeks ago

Also seeing this with our omgidl parsing library, which uses nearley and moo: https://github.com/foxglove/omgidl/tree/main/packages/omgidl-parser. Line numbers are correct, but the displayed source context is not.