Closed osa1 closed 2 years ago
I think we will need a special symbol, maybe `eof`, to match end-of-input.
The question is whether to make it a regex or an LHS.
If we make it a regex then we allow nonsensical regexes like `eof+ 'a'` (match one or more "end of input", then the character 'a'), so I don't like this too much.
If we make it an LHS then it will be similar to `_` in how we use it and handle it in the implementation. The example above will look like:
lexer! {
    Lexer -> &'input str;

    rule Init {
        "//" => |lexer| {
            lexer.switch(LexerRule::SingleLineComment)
        },
    }

    rule SingleLineComment {
        '\n' => |lexer| {
            let comment = lexer.match_();
            lexer.switch_and_return(LexerRule::Init, comment)
        },

        eof => |lexer| {
            let comment = lexer.match_();
            lexer.switch_and_return(LexerRule::Init, comment)
        },

        _,
    }
}
Since we cannot write `'\n' | eof` (because `eof` is not a regex), this has a little bit of duplication, but I think it's not too bad.
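To illustrate why treating `eof` as an LHS rules out both `eof+ 'a'` and `'\n' | eof`, here is a minimal sketch of how the rule syntax could be modelled. The `Regex` and `Lhs` types and the `single_line_comment_rules` function are hypothetical names made up for this example, not the library's actual internals.

```rust
/// Hypothetical regex AST. `eof` is deliberately NOT a variant here, so it
/// cannot appear under `+` or `|`.
#[derive(Debug, Clone, PartialEq)]
enum Regex {
    Char(char),
    OneOrMore(Box<Regex>),
    Or(Box<Regex>, Box<Regex>),
}

/// Hypothetical left-hand side of a rule: either a regex or one of the
/// special symbols that only make sense at the top level.
#[derive(Debug, Clone, PartialEq)]
enum Lhs {
    Regex(Regex),
    /// `eof`: matches end-of-input.
    Eof,
    /// `_`: the wildcard.
    Wildcard,
}

/// The three left-hand sides of the `SingleLineComment` rule set.
/// `'\n'` and `eof` have to be separate entries because `Regex::Or`
/// only accepts regexes.
fn single_line_comment_rules() -> Vec<Lhs> {
    vec![Lhs::Regex(Regex::Char('\n')), Lhs::Eof, Lhs::Wildcard]
}
```

With this shape, `eof+ 'a'` is unrepresentable by construction, and the cost is exactly the duplicated right-hand side mentioned above.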
Note that we don't need to match `eof` in the `Init` rule: `Init` is special-cased, and end-of-input there is handled in the `next` method, which returns `None`.
(See also #12 for another difference in behavior between initial and non-initial states.)
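The intended behavior can be sketched with a small hand-written lexer. This is only an illustration of the semantics described above, not the generated code: `CommentLexer` and `State` are made-up names, skipping non-comment input in `Init` is a simplification, and including the leading `//` and trailing newline in the returned match is an assumption.

```rust
/// The two lexer states from the example.
#[derive(Debug, Clone, Copy, PartialEq)]
enum State {
    Init,
    SingleLineComment,
}

/// Hand-written stand-in for the generated lexer; yields comment text.
struct CommentLexer<'input> {
    input: &'input str,
    pos: usize,         // byte offset of the next unread character
    match_start: usize, // byte offset where the current match began
    state: State,
}

impl<'input> CommentLexer<'input> {
    fn new(input: &'input str) -> Self {
        CommentLexer { input, pos: 0, match_start: 0, state: State::Init }
    }
}

impl<'input> Iterator for CommentLexer<'input> {
    type Item = &'input str;

    fn next(&mut self) -> Option<&'input str> {
        let input: &'input str = self.input;
        loop {
            let rest = &input[self.pos..];
            match self.state {
                State::Init => {
                    if rest.is_empty() {
                        // Special case: end-of-input in `Init` ends the stream.
                        return None;
                    }
                    if rest.starts_with("//") {
                        self.match_start = self.pos;
                        self.pos += 2;
                        self.state = State::SingleLineComment;
                    } else if let Some(c) = rest.chars().next() {
                        // Simplification for this sketch: skip other input.
                        self.pos += c.len_utf8();
                    }
                }
                State::SingleLineComment => match rest.chars().next() {
                    // The `'\n'` rule: finish the comment, go back to `Init`.
                    Some('\n') => {
                        self.pos += 1;
                        self.state = State::Init;
                        return Some(&input[self.match_start..self.pos]);
                    }
                    // The `_` rule: consume the character and keep matching.
                    Some(c) => self.pos += c.len_utf8(),
                    // The `eof` rule: same action as `'\n'`.
                    None => {
                        self.state = State::Init;
                        return Some(&input[self.match_start..self.pos]);
                    }
                },
            }
        }
    }
}
```

Collecting `CommentLexer::new("// a\n// b")` yields both comments, the second one terminated by end-of-input rather than a newline, and an empty input yields nothing because `Init` returns `None` immediately.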
Suppose I want to lex C-style single line comments:
// ...
This won't lex EOF-terminated comments because `_` does not match EOF.
I don't know if we should make `_` match EOF, or have another symbol for matching EOF explicitly.