alex / rply

An attempt to port David Beazley's PLY to RPython, and give it a cooler API.
BSD 3-Clause "New" or "Revised" License

What is the equivalent of PLY's "def t_FOO(t): ..."? #15

Open RichardBarrell opened 10 years ago

RichardBarrell commented 10 years ago

In a PLY lexer, I can implement certain weird things such as case-insensitive keywords by defining a function with the same name as I'd normally give the string variable containing the regexp for that token.

For example:

from ply import lex

tokens = ("GREET", "FIGHT", "WORD")
reserved = ("GREET", "FIGHT")

t_ignore = ' '  # a set of characters to skip, not a regexp

def t_error(t):
    raise ValueError("oh noooo")

def t_WORD(t):
    "[a-zA-Z]+"
    upper = t.value.upper()
    if upper in reserved:
        t.value = upper
        t.type = upper
    return t

lexer = lex.lex()
lexer.input("grEEt samuel FIGHT tomato greet potato FIght pOEtRY")
for token in lexer:
    print(token)

#LexToken(GREET,'GREET',1,0)
#LexToken(WORD,'samuel',1,6)
#LexToken(FIGHT,'FIGHT',1,13)
#LexToken(WORD,'tomato',1,19)
#LexToken(GREET,'GREET',1,26)
#LexToken(WORD,'potato',1,32)
#LexToken(FIGHT,'FIGHT',1,39)
#LexToken(WORD,'pOEtRY',1,45)

I can't find anything in rply's documentation that explains how to do the equivalent of defining t_WORD as a function in the above program. Nor can I find anything that indicates that it can't be done.

tdsmith commented 8 years ago

I ran into this and decided that the way to do it was to wrap the lexer output with a function that would intercept and modify the tokens I wanted additional logic for.
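
A minimal sketch of that wrapping approach, using the keyword example from the original question. The name `promote_keywords` and the `StandInToken` class are hypothetical, introduced only so the snippet runs without rply installed. The sketch assumes tokens expose mutable `name` and `value` attributes, which is my understanding of rply's `Token` but is worth checking against the version you use.

```python
RESERVED = frozenset(["GREET", "FIGHT"])


class StandInToken(object):
    """Hypothetical stand-in for a lexer token with `name` and `value`."""

    def __init__(self, name, value):
        self.name = name
        self.value = value


def promote_keywords(tokens, reserved=RESERVED):
    """Yield every token, retagging WORD tokens whose upper-cased text is reserved."""
    for token in tokens:
        if token.name == "WORD":
            upper = token.value.upper()
            if upper in reserved:
                # Same effect as setting t.type and t.value inside PLY's t_WORD.
                token.name = upper
                token.value = upper
        yield token


# Pretend the lexer matched everything as WORD, as a single [a-zA-Z]+ rule would.
raw = [StandInToken("WORD", w) for w in "grEEt samuel FIGHT tomato".split()]
result = [(t.name, t.value) for t in promote_keywords(raw)]
print(result)
```

With rply itself, the idea would be to pass the stream returned by the built lexer's `lex(source)` through `promote_keywords` and hand the resulting generator to the parser in place of the raw stream. That assumes the parser only needs an iterator of tokens.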