no-context / moo

Optimised tokenizer/lexer generator! 🐄 Uses /y for performance. Moo.
BSD 3-Clause "New" or "Revised" License

Keywords do not support spaces/newlines #79

dselman closed this issue 6 years ago

dselman commented 6 years ago

From some preliminary testing, it looks like keywords must be simple string literals without spaces or newlines: a keyword such as 'NOTICE TO' never matches. Is this by design? Is there another (better) way to match a multi-word keyword?

Sample Nearley grammar using a Moo lexer:

@{%
const moo = require("moo");

const lexer = moo.compile({
  ws:     /[ \t]+/,
  number: /[0-9]+/,
  word: /[a-z]+/,
  times:  /\*|x/,
  SPACE: {match: /\s+/, lineBreaks: true},
  IDEN: {match: /[a-zA-Z]+/, keywords: {
    notice: ['NOTICE TO']
  }},
});
%}

# Pass your lexer object using the @lexer option:
@lexer lexer

# Use %token to match any token of that type instead of "token":
root -> %notice %ws %IDEN
{% (data) => {console.log(data);return data;} %}

Sample input:

NOTICE TO Foo

Output:

invalid syntax at line 1 col 1:

  NOTICE
  ^
Unexpected IDEN token: "NOTICE"
{"offset":0,"token":{"type":"IDEN","value":"NOTICE","text":"NOTICE","offset":0,"lineBreaks":0,"line":1,"col":1}}

nathan commented 6 years ago

Duplicate of https://github.com/no-context/moo/issues/80
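
For later readers, a rough sketch of why the sample above fails and one possible way around it. This is a dependency-free toy lexer, not moo's actual implementation: `tokenize` and `identRules` are hypothetical names invented for illustration. It assumes, as the error output suggests, that the keyword table is only consulted after a rule's regex has matched, so it only ever sees text like "NOTICE" and never "NOTICE TO", because /[a-zA-Z]+/ cannot match a space. The workaround shown (a dedicated rule listed before IDEN) is an assumption about what would help here, not a confirmed recommendation from the maintainers.

```javascript
// Toy lexer for illustration only; NOT moo's real code.
// rules: ordered [type, RegExp] pairs; keywords: map from exact text to type.
function tokenize(input, rules, keywords = {}) {
  const tokens = [];
  let pos = 0;
  while (pos < input.length) {
    let matched = false;
    for (const [type, pattern] of rules) {
      // Sticky (/y) regex: only matches starting exactly at lastIndex.
      const re = new RegExp(pattern.source, "y");
      re.lastIndex = pos;
      const m = re.exec(input);
      if (m && m[0].length > 0) {
        const text = m[0];
        // The keyword lookup only sees what the rule already matched.
        const isKeyword = Object.prototype.hasOwnProperty.call(keywords, text);
        tokens.push({ type: isKeyword ? keywords[text] : type, value: text });
        pos += text.length;
        matched = true;
        break;
      }
    }
    if (!matched) throw new Error(`invalid syntax at offset ${pos}`);
  }
  return tokens;
}

const identRules = [
  ["ws", /[ \t]+/],
  ["IDEN", /[a-zA-Z]+/],
];

// A keyword containing a space is never looked up, because IDEN stops at the space.
const before = tokenize("NOTICE TO Foo", identRules, { "NOTICE TO": "notice" });
console.log(before.map((t) => t.type).join(" "));

// Possible workaround: give the multi-word keyword its own rule, ahead of IDEN.
// The trailing \b keeps it from swallowing the start of a longer word like "TOFoo".
const after = tokenize("NOTICE TO Foo", [
  ["notice", /NOTICE[ \t]+TO\b/],
  ...identRules,
]);
console.log(after.map((t) => t.type).join(" "));
```

With the dedicated rule in place, the token types line up with the grammar's `root -> %notice %ws %IDEN`. In moo itself the analogous change would presumably be a `notice` rule placed before `IDEN` in `moo.compile`, but check that against the linked issue and the version you are using.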