Open bd82 opened 7 years ago
This should possibly be limited to "virtual" Tokens such as Indent/Outdent, so that the Lexer continues to handle all the position information.
*poke*
This would be lovely to have. Doing some semantic indentation parsing and haven't a clue as to where to start. Looked up if Python had been implemented as a parser and found this.
Hello @Qix- .
There is an example for dealing with Python-like indentation here. It is not the prettiest, but it works...
This issue is more about cosmetic API changes that would make that example prettier and potentially support more scenarios, so it is unfortunately not a high priority.
I'm currently more focused on POCing a scannerless ECMAScript parser and then investigating its performance versus leading hand-built ECMAScript parsers such as Acorn/Esprima, and in the longer term on supporting Parser Generator/Combinator type APIs.
Another alternative is to use a non-Chevrotain lexer (hand-built or generated), either creating Chevrotain Tokens directly or converting its output to Chevrotain Tokens.
I'm also considering that this issue may be completely redundant if the Chevrotain Lexer were refactored to output one token at a time (calling .next() many times) instead of tokenizing the whole input in one go. The user logic handling the indentation would then not require any special treatment/APIs from Chevrotain, as it could be implemented completely separately from the regular lexing.
let nextToken
let tokens = []
while (nextToken = myLexer.next()) {
  if (nextToken.tokenType === Whitespace.tokenType && ...) {
    // indentation handling
    // tokens.push(Indent/OutDent)
  }
  tokens.push(nextToken)
}
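To make the idea above more concrete, here is a minimal, self-contained sketch of indentation handling layered on top of a token-at-a-time lexer. Nothing in it is Chevrotain API: `simpleLexer`, `withIndentation`, `virtualToken` and the token shape `{ type, image, startOffset }` are all hypothetical names chosen for illustration. It assumes `\n` line endings and space-only indentation, drops the whitespace and newline tokens, and does not validate inconsistent dedents.

```javascript
// Hypothetical token-at-a-time lexer. This is NOT Chevrotain's API; it only
// stands in for the `myLexer.next()` idea discussed above.
function* simpleLexer(text) {
  // Sticky regex: newline plus leading spaces | inline whitespace | identifier | any other char.
  const pattern = /(\n *)|([ \t]+)|([A-Za-z_]\w*)|(\S)/y;
  const typeNames = ["Newline", "Whitespace", "Identifier", "Punctuation"];
  let match;
  while ((match = pattern.exec(text)) !== null) {
    // Index of the capture group that matched (groups start at 1).
    const group = match.findIndex((g, i) => i > 0 && g !== undefined);
    yield { type: typeNames[group - 1], image: match[0], startOffset: match.index };
  }
}

// Virtual tokens carry no text; they borrow the offset of the next real token.
function virtualToken(type, startOffset) {
  return { type, image: "", startOffset };
}

// Wraps any token iterator and injects Indent/Outdent tokens, entirely
// outside of the lexer itself.
function* withIndentation(tokens) {
  const indentStack = [0]; // widths of the currently open indentation levels
  let pendingWidth = null; // indentation of the line being started, if any
  let endOffset = 0;       // end of the last real token, used at end of input
  for (const token of tokens) {
    if (token.type === "Newline") {
      // Blank lines simply overwrite the pending width.
      pendingWidth = token.image.length - 1; // spaces following "\n"
      continue;
    }
    if (token.type === "Whitespace") continue;
    if (pendingWidth !== null) {
      if (pendingWidth > indentStack[indentStack.length - 1]) {
        indentStack.push(pendingWidth);
        yield virtualToken("Indent", token.startOffset);
      }
      while (pendingWidth < indentStack[indentStack.length - 1]) {
        indentStack.pop();
        yield virtualToken("Outdent", token.startOffset);
      }
      pendingWidth = null;
    }
    endOffset = token.startOffset + token.image.length;
    yield token;
  }
  // Close any levels still open at the end of the input.
  while (indentStack.length > 1) {
    indentStack.pop();
    yield virtualToken("Outdent", endOffset);
  }
}

const kinds = [...withIndentation(simpleLexer("if a:\n    b\n    c\nd"))].map(t => t.type);
console.log(kinds.join(" "));
// Identifier Identifier Punctuation Indent Identifier Identifier Outdent Identifier
```

Because the wrapper only consumes an iterator, the real tokens keep whatever position information the lexer produced, and only the virtual Indent/Outdent tokens need positions assigned by user code.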
Originated from https://github.com/SAP/chevrotain/issues/373 and https://github.com/SAP/chevrotain/issues/414