no-context / moo

Optimised tokenizer/lexer generator! 🐄 Uses /y for performance. Moo.
BSD 3-Clause "New" or "Revised" License

Keyword in the middle of string #69

jannone closed 7 years ago

jannone commented 7 years ago

Hello,

I have a use case where the language lets the user type without spaces. So an expression like this: X=Y AND Z might be typed like this: X=YANDZ

The problem is that I have an "ident" rule matching variable names, and I use the keywords feature to detect operators such as the "AND" in the example above.

However, I can't seem to find a way for the lexer to detect the AND without spaces. It can only read it as an ident of value "YANDZ".

How can I fix that?

Thanks

deltaidea commented 7 years ago

If there aren't many such keywords, the simplest solution I can think of is a negative lookahead:

ident: /[a-zA-Z]+(?!(AND|OR|...))/,
operator: ['AND', 'OR', ...],

Otherwise, I think you'll have to make a custom tokenizer that does two passes or something. I'm not sure.

nathan commented 7 years ago

/[a-zA-Z]+(?!(AND|OR|...))/

That won't work. Try /(?:(?!AND|OR)[a-zA-Z])+/.
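
Editor's sketch, not part of the original comments: the two patterns differ in where the lookahead runs. In the first, the greedy [a-zA-Z]+ consumes all of "YANDZ" before the lookahead is checked once at the end, so it never sees "AND". In the second, the lookahead is checked before every single letter, so the match stops just before a keyword. The tokenize helper, the rule names, and the "=" and whitespace rules below are hypothetical stand-ins written with plain sticky RegExps. They do not use moo's API, and the "..." in the patterns above is narrowed to just AND and OR for illustration.

```javascript
// The first suggestion: lookahead runs once, after the greedy letter run.
const naiveIdent = /[a-zA-Z]+(?!(AND|OR))/;
// The second suggestion: lookahead runs before each letter.
const guardedIdent = /(?:(?!AND|OR)[a-zA-Z])+/;

// Hypothetical helper: tries each rule in order at the current offset.
function tokenize(input, identPattern) {
  const rules = [
    // Rebuild the ident pattern with the sticky flag so it only matches at lastIndex.
    { type: "ident", re: new RegExp(identPattern.source, "y") },
    { type: "operator", re: /AND|OR/y },
    { type: "eq", re: /=/y },
    { type: "ws", re: /[ \t]+/y },
  ];
  const tokens = [];
  let pos = 0;
  while (pos < input.length) {
    let matched = false;
    for (const { type, re } of rules) {
      re.lastIndex = pos; // sticky match must start exactly here
      const m = re.exec(input);
      if (m && m[0].length > 0) {
        tokens.push(`${type}:${m[0]}`);
        pos += m[0].length;
        matched = true;
        break;
      }
    }
    if (!matched) throw new Error(`Unexpected character at offset ${pos}`);
  }
  return tokens;
}

console.log(tokenize("X=YANDZ", naiveIdent));
// [ 'ident:X', 'eq:=', 'ident:YANDZ' ]
console.log(tokenize("X=YANDZ", guardedIdent));
// [ 'ident:X', 'eq:=', 'ident:Y', 'operator:AND', 'ident:Z' ]
```

One consequence of the per-letter lookahead is that an identifier can no longer contain a keyword anywhere: with this sketch, "ORDER" comes out as the operator OR followed by the ident DER. That is inherent to a language where keywords may appear without surrounding spaces.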

tjvr commented 7 years ago

That looks right to me! :-)

jannone commented 7 years ago

@deltaidea and @nathan It worked! Thanks!