Consider this example. When string literals are used directly in a rule, as in grammar1, the children passed to the transformer callback do not contain them, even though the parser has the same TerminalDef('SLASH', '/') in both grammars:
from lark import Lark, Transformer

# grammar1: the slash is written inline as an anonymous string literal
grammar1 = """
start: "bos" date "eos"
date: DIGIT+ "/" DIGIT+ "/" DIGIT+
DIGIT: /[0-9]/
"""

# grammar2: the slash is declared as a named terminal
grammar2 = """
start: "bos" date "eos"
date: DIGIT+ SLASH DIGIT+ SLASH DIGIT+
DIGIT: /[0-9]/
SLASH: "/"
"""

class MyTransformer(Transformer):
    def date(self, children):
        print("Callback for date:", children)
        return children

# Swap grammar2 for grammar1 to compare the two outputs
parser = Lark(grammar2, parser='lalr', transformer=MyTransformer())
tree = parser.parse("bos18/11/2023eos")
With grammar1, the slashes are missing:
Callback for date: [Token('DIGIT', '1'), Token('DIGIT', '8'), Token('DIGIT', '1'), Token('DIGIT', '1'), Token('DIGIT', '2'), Token('DIGIT', '0'), Token('DIGIT', '2'), Token('DIGIT', '3')]
With grammar2 (SLASH explicitly defined as a terminal), they appear:
Callback for date: [Token('DIGIT', '1'), Token('DIGIT', '8'), Token('SLASH', '/'), Token('DIGIT', '1'), Token('DIGIT', '1'), Token('SLASH', '/'), Token('DIGIT', '2'), Token('DIGIT', '0'), Token('DIGIT', '2'), Token('DIGIT', '3')]
Is this behavior intended?