Closed RevanthRameshkumar closed 1 year ago
Likewise, if I run this code
from lark import Lark
parser = Lark(r"""
%ignore /[\t \f]+/ // WS
start: name|classdef
classdef: "class" name ":"
name: NAME | "match" | "case"
NAME: /[^\W\d]\w*/
""", parser="lalr")
input_str = r"""class"""
interactive = parser.parse_interactive(input_str)
print(interactive.exhaust_lexer())
print(interactive.accepts())
I get
[Token('CLASS', 'class')]
{'NAME', 'CASE', 'MATCH'}
which isn't exactly right, because 'class' could also be the start of a longer NAME, which means that any letter, digit, or underscore is also acceptable as a continuation.
The interactive parser works on the level of tokens, not on the level of characters. You will have to work a bit harder, for example by remembering the last token and checking whether there are other characters you can append to it and still match a terminal's regex.
This won't necessarily help with the class example. That isn't really fixable; you will have to special-case identifiers if you want to try to create a general solution.
That helps. I realized I probably need an FSM-based approach, and then I ran into your interegular module, @MegaIng!
I want to use the interactive parser to see if the next letter in a stream is acceptable. If you run this code, the next accepted token is ":". But actually, the next accepted characters are a colon or a continuation of NAME, which is letters and numbers.
Is there a way to determine the next possible character here in an efficient way? The only way I can think of offhand is to append each char possibility to the string and re-run the parser which seems horrible.