no-context / moo

Optimised tokenizer/lexer generator! 🐄 Uses /y for performance. Moo.
BSD 3-Clause "New" or "Revised" License

How can I filter out whitespace and newlines? #149

Closed mspoulsen closed 3 years ago

mspoulsen commented 3 years ago

Hi,

I use moo with nearley quite a bit, and I find that parsing whitespace and newlines very easily leads to subtle ambiguities that can be time-consuming to track down. My solution has been to filter out all WS and NL tokens, which has made life easier. However, I feel that I do it in quite a hacky way inside my nearley grammar:

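// Token types to drop before they reach the parser
const ignore = [ "WS", "NL" ]

// Wrap lexer.next so it keeps reading until it finds a token that is not ignored
lexer.next = (next => () => {
  let token;
  while ((token = next.call(lexer)) && (
    ignore.includes(token.type)
  )) {}
  return token;
})(lexer.next);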

Is there a better way? It would be nice if the lexer had a filter function.
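
For illustration, one possible shape for such a filter is a small wrapper object that delegates to the real lexer instead of overwriting lexer.next. This is only a sketch: withoutTokens and stubLexer are hypothetical names, not part of moo or nearley, and the stub only mimics next() so the snippet runs without moo installed. The delegated methods (next, save, reset, formatError, has) are the ones nearley calls on a custom lexer.

```javascript
// Wrap a moo-style lexer so that tokens of the given types are skipped.
// `withoutTokens` is a hypothetical helper name, not part of moo or nearley.
function withoutTokens(lexer, ignoredTypes) {
  const ignored = new Set(ignoredTypes);
  return {
    next() {
      let token;
      do {
        token = lexer.next();
      } while (token && ignored.has(token.type));
      return token; // undefined at end of input, as with the wrapped lexer
    },
    // Delegate the rest of the interface that nearley calls on a lexer.
    save() { return lexer.save(); },
    reset(chunk, info) {
      lexer.reset(chunk, info);
      return this; // return the wrapper so chained calls keep filtering
    },
    formatError(token, message) { return lexer.formatError(token, message); },
    has(name) { return lexer.has(name); },
  };
}

// Stand-in lexer (an assumption for this sketch); it only implements next().
function stubLexer(tokens) {
  let i = 0;
  return { next: () => tokens[i++] };
}

const filtered = withoutTokens(
  stubLexer([
    { type: "word", value: "a" },
    { type: "WS", value: " " },
    { type: "NL", value: "\n" },
    { type: "word", value: "b" },
  ]),
  ["WS", "NL"]
);

const seen = [];
for (let t; (t = filtered.next()); ) seen.push(t.value);
console.log(seen); // [ 'a', 'b' ]
```

With a real moo lexer, the wrapper would be passed wherever the grammar currently assigns the lexer, so the original lexer object stays unmodified.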

I found this issue: https://github.com/no-context/moo/issues/24 and it looks similar. Can I achieve what I am trying to do with itt? If so, could anyone provide a small example?

Thank you very much in advance!

nathan commented 3 years ago

for (const tok of itt(lexer).reject(t => ['WS', 'NL'].includes(t.type))) { ... }
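
For readers who would rather not pull in itt, the one-liner above can be approximated with a plain generator, since a moo lexer is iterable. This is a rough sketch: rejectTypes is a hypothetical helper, not part of moo or itt, and an array of token-like objects stands in for the lexer. Note that this form yields an iterator, so it suits a hand-written loop; nearley expects a lexer object with next() and reset(), so a wrapper around the lexer is likely still needed there.

```javascript
// Yield only the tokens whose type is not in ignoredTypes.
// `rejectTypes` is a hypothetical helper, not part of moo or itt.
function* rejectTypes(tokens, ignoredTypes) {
  const ignored = new Set(ignoredTypes);
  for (const token of tokens) {
    if (!ignored.has(token.type)) yield token;
  }
}

// A plain array stands in for the lexer here; any iterable of tokens works.
const sampleTokens = [
  { type: "number", value: "1" },
  { type: "WS", value: " " },
  { type: "plus", value: "+" },
  { type: "NL", value: "\n" },
  { type: "number", value: "2" },
];

const kept = [...rejectTypes(sampleTokens, ["WS", "NL"])].map(t => t.value);
console.log(kept); // [ '1', '+', '2' ]
```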