newtang closed this issue 6 years ago
If you really need to do this in a lexer, you can use lookahead:
    const moo = require('moo')

    const lexer = moo.compile({
      delim: /[\/-]+/,
      // The lookahead only accepts a month when a delimiter follows it
      month: /(?:[01]\d|[1-9])(?=[\/-])/,
      year: /\d{4,}|\d{2}/, // Years can be more than 4 digits long, by the way
    })
But you should probably just lex `(?:[01]\d|[1-9])[\/-](?:\d{4,}|\d{2})` as a single token and parse that further when you need to, or lex `\d+` and `[-/]` and handle the higher-level syntax and validation in your parser.
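As a rough illustration of the second option, here is a minimal sketch that lexes only numbers and delimiters and leaves validation to a later step. To keep it dependency-free it uses a hand-rolled sticky-regex tokenizer instead of Moo. The names `lex` and `parseMonthYear` are hypothetical, and the 1-12 month range check is an assumption on my part (it is stricter than the regex above).

```javascript
// Hand-rolled stand-in for a two-rule lexer: numbers and delimiters only.
// This is not Moo; it only illustrates the "simple tokens" idea.
function lex(input) {
  const rule = /(\d+)|([-\/])/y; // sticky: each match must start at lastIndex
  const tokens = [];
  let pos = 0;
  while (pos < input.length) {
    rule.lastIndex = pos;
    const m = rule.exec(input);
    if (m === null) {
      throw new Error(`Unexpected character ${JSON.stringify(input[pos])} at offset ${pos}`);
    }
    tokens.push({ type: m[1] !== undefined ? "number" : "delim", value: m[0], offset: pos });
    pos = rule.lastIndex;
  }
  return tokens;
}

// Hypothetical parser-side validation: check the token shape, then the ranges.
function parseMonthYear(input) {
  const tokens = lex(input);
  const shape = tokens.map((t) => t.type).join(" ");
  if (shape !== "number delim number") {
    throw new Error(`Expected month, delimiter, year but got: ${shape || "nothing"}`);
  }
  const [month, , year] = tokens;
  const monthValue = Number(month.value);
  // Assumed rule: months are 1-12, written with at most two digits.
  if (month.value.length > 2 || monthValue < 1 || monthValue > 12) {
    throw new Error(`Invalid month ${JSON.stringify(month.value)}`);
  }
  // Years are yy or yyyy (or longer), matching \d{4,}|\d{2}.
  if (year.value.length !== 2 && year.value.length < 4) {
    throw new Error(`Invalid year ${JSON.stringify(year.value)}: expected yy or yyyy`);
  }
  return { month: monthValue, year: year.value };
}

console.log(parseMonthYear("10/2018")); // month 10, year "2018"
console.log(parseMonthYear("3-18"));    // month 3, year "18"
```

Because the lexer no longer has to decide what is a month and what is a year, the error messages can say exactly which part was wrong, which is the kind of error catching the distinct tokens were meant to provide.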
(The example you gave, `10/2`, doesn't fit your original description of `mm/yyyy` or `mm/yy`; I assume you mean `10/02`.)
Edit: Note that the above lexer will do bizarre things if you give it input like `10/891/` (= `10`, `/`, `89`, `1`, `/`). All the more reason to simplify your lexer to `\d+` and `[-/]`.
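To see where that token split comes from, the sketch below replays the three rules from the lookahead lexer with plain sticky regexes, trying them in declaration order at each offset. It assumes that first-match-wins ordering is a fair model of what the lexer does; it is not Moo itself, and the `rules` table and `lex` function are illustrative names.

```javascript
// Same patterns and order as the lookahead lexer, with the sticky flag added
// so each rule can only match at the current offset.
const rules = [
  ["delim", /[\/-]+/y],
  ["month", /(?:[01]\d|[1-9])(?=[\/-])/y],
  ["year", /\d{4,}|\d{2}/y],
];

function lex(input) {
  const tokens = [];
  let pos = 0;
  while (pos < input.length) {
    let matched = false;
    for (const [type, re] of rules) {
      re.lastIndex = pos;
      const m = re.exec(input);
      if (m !== null) {
        tokens.push(`${type}:${m[0]}`);
        pos = re.lastIndex; // continue right after the matched text
        matched = true;
        break; // first matching rule wins
      }
    }
    if (!matched) throw new Error(`No rule matches at offset ${pos}`);
  }
  return tokens;
}

console.log(lex("10/891/"));
// month:10, delim:/, year:89, month:1, delim:/
```

At `891/`, the month rule fails because `8` is not followed by a delimiter, so the year rule takes `89`; the leftover `1` is followed by `/` and therefore qualifies as a month.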
Got it, thanks @nathan!
If I want to parse something like a month and year (e.g., 10/2018 or 10/18), how can I make distinct tokens for month and year without the parser getting confused? (I'm using Moo with Nearley.)
I'll get an error:
If I reverse the order of month and year in the object I pass to moo.compile, I get this error:
What is the best strategy for handling something like this? I prefer keeping the distinct tokens so I can catch errors.