Closed — Daniel63656 closed this issue 3 months ago
Yes. See this example: https://github.com/lark-parser/lark/blob/master/examples/advanced/custom_lexer.py
You need to attach token types to your list of strings by constructing `lark.Token` instances; otherwise lark has no idea what to do with your strings. This is the primary job of the lexer. In your case, the corresponding token type for each string is just its uppercase form, so `Token(c.upper(), c)` constructs the correct token. For your actual use case, you will probably need to do something more complex.
I already have my tokens as a list of terminals, like this in a toy grammar:
```
start: A B C
A: "a"
B: "b"
C: "c"
```

```python
tokens = ["a", "b", "c"]
```
Can I use a parser that accepts this list directly, to avoid the unnecessary scanning step? All lexers throw `TypeError: expected string or bytes-like object, got 'list'`.