Open Philogy opened 2 months ago
Hello @Philogy, I currently haven't much time to invest in this issue, but I would recommend to you the same thing as I do for all comment
-style lexing: just create a token that matches the start of a comment, and then process the comment with a callback. This is usually much better, as comments can contain almost any characters, like escaped /
, which makes it super hard to write a regex that handles all specific cases.
See https://github.com/maciejhirsz/logos/issues/421#issuecomment-2348285243 for an example on XML comments, which is very similar to multiline strings.
Looks like this is a duplicate of #400, so closing this anyway :-)
Looks like this is a duplicate of #400, so closing this anyway :-)
Looks different to me. #400 seems to error in the derive, whereas this errors at runtime.
For what it's worth, I also encountered a stack overflow/infinite loop at runtime with a small test case:
#[derive(Logos, Debug, PartialEq)]
enum TestToken {
#[regex("c(a*b?)*c")]
Token
}
#[cfg(test)]
mod logos_test {
use logos::Logos;
use crate::TestToken;
#[test]
fn overflow() {
let _ = TestToken::lexer("c").next();
}
}
I toyed a bit with adding a Mermaid output from the lexer.
Looking at the graph from the test above, the issue seems to be that the graph does not handle a loop between nodes when it misses.
flowchart LR
1("::Token")
3("rope#3")
3 -- "a" --> 5
3 --x 6
4("rope#4")
4 -- "b" --> 3
4 --x 3
5("rope#5")
5 -- "a" --> 5
5 --x 4
6("fork#6")
6 -- "b" --> 3
6 -- "c" --> 1
6 --x 3
8("Start")
8 -- "c" --> 3
I was trying to get C-style multiline comment working in logos, the state machine for it is quite simple but I can't seem to get it working in logos. This regex seemed to work but it causes a panic when parsing certain things.
The code for report:
The error I'm getting: