kach / nearley

📜🔜🌲 Simple, fast, powerful parser toolkit for JavaScript.
https://nearley.js.org
MIT License
3.59k stars 232 forks source link

#632 Buffer chunks to avoid partial matches in nearleyc #635

Open naraen opened 1 year ago

naraen commented 1 year ago

Issue : https://github.com/kach/nearley/issues/632

Cause : The StreamWriter receives tokens in chunks and passes it to the parser which in turn immediately passes it to a lexer. The default chunk size is 64KB. Token for a production rule could span chunks leading the lexer to match a partial token based on on the incomplete chunk.

Fix : Production rules in the nearley grammar are new line \n delimited. Passing only passing characters upto the newline should always result in a full match. StreamWriter buffers any characters in the chunk past the last newline prepends it to the next chunk.

Implements the Writable._finish to flush the buffer. This addresses grammar files that may not have a newline at the end of the file.