Open manux81 opened 3 years ago
I'm running into this issue too. Any update on getting this merged in?
Hi @jpsnyder, no @dabeaz still has not been reply to me.
Apologies on the delay. I do NOT see this change making it into SLY because of its special purpose nature and potential impact on performance. However, it is possible to achieve the desired effect using a function:
keywords = { 'if', 'else', 'while', }
@(_r'[a-zA-Z_][a-zA-Z0-9_]*')
def ID(self, t):
if t.value.lower() in keywords:
t.type = t.value.upper()
return t
I would just note, that I'm going to keep this open for now. Even though I'm leaning towards not doing this, I might reconsider. I just don't have any timeline for it. In the meantime, use the function as a workaround.
Fair enough. That workaround works for me. (And actually is better because I can customize the generated token to my liking.)
My only argument for accepting the MR, would be that it would seem to be expected behavior for token remapping to be case-insensitive if the user set that re flag to IGNORECASE. Principle of least surprises.
I agree with @jpsnyder and also using reflags way is sound like flex approch.
While I'm not sure on the performance hit for something like this, but would it make sense for the string value within the brackets for token remapping actually be a regex pattern like everything else? In which case, the regex pattern would be applied to the matches of the main token being remapped. (Make use of the .fullmatch() function)
This would solve the case sensitivity issue and provide more flexibility while still technically supporting the old way. It could also allow us to apply different re flags to the token mapping than what is set globally.
ID["if"] = IF
ID["else"] = ELSE
ID["(?i)while"] = WHILE # case-insenstive mapping
ID["def[0-9]"] = DEF # def followed by a number
On the other hand, this is probably feature creep...
With reference to: https://github.com/dabeaz/sly/issues/62 I've found a problem with tokens remapping in the eventuality of: reflags = re.IGNORECASE This change allows creating special cases ignoring capitalization in the eventuality of reflags = re.IGNORECASE is set in Lexer class.
Example: class MyLexer(sly.Lexer): reflags = re.IGNORECASE
Base ID rule