Open afreeland opened 4 months ago
Hello, thanks for sharing your issue!
First, let me suggest this simpler MWE, so it is easier to debug:
use logos::Logos;

#[derive(Debug, Logos)]
enum Token {
    #[token("alert")]
    Action,

    #[token("tls")]
    Protocol,

    #[regex(r"([^\s]+) ([^\s]+) (->|<-) ([^\s]+) ([^\s]+)")]
    NetworkInfo,
}

fn main() {
    let input = "alert tls $HOME_NET any -> $EXTERNAL_NET any (msg:\"some bs\")";
    let mut lexer = Token::lexer(input);

    while let Some(token) = lexer.next() {
        println!("{:?}", token);
    }
}
Second, I think this is a duplicate of #358, and maybe #265. Hopefully, the bug fix mentioned in #265 by @jameshurt might solve this, but I am waiting for a reply :-)
I'm new to Rust and new to Logos, so this could just be me... but when using a regex it seems like it always stomps on the other tokens. The snippet below essentially calls out three tokens: one represents an action (alert), another is a protocol (tls), and the third is network information.
Here is the code with a regex:
This outputs:
However, if I comment out the NetworkInfo section, my Action and Protocol will work just fine. Output:
This part of the input, $HOME_NET any -> $EXTERNAL_NET, represents a source host, source port, direction, destination host and destination port. These things are pretty fluid, so outside of regex I'm not really sure how I would go about targeting them.
Is there a way to have a regex not overpower everything around it... or am I doing something incorrectly? I read the token-disambiguation docs but couldn't seem to find a way to lower regex priority.
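One possible way to target the network fields without a single large regex is to lex small tokens (action, protocol, direction arrow, generic word) and assemble the five-field group in a later step. The sketch below uses only the standard library and is not Logos code: the names Token, classify, lex, NetworkInfo and network_info are hypothetical, and it assumes fields are separated by whitespace. It does not handle the parenthesised options section, which simply ends up as trailing Word tokens.

```rust
// Hypothetical token kinds for illustration; this is not the Logos API.
#[derive(Debug, PartialEq)]
enum Token<'a> {
    Action,
    Protocol,
    Direction(&'a str),
    Word(&'a str),
}

// Classify a single whitespace-delimited word.
fn classify(word: &str) -> Token<'_> {
    match word {
        "alert" => Token::Action,
        "tls" => Token::Protocol,
        "->" | "<-" => Token::Direction(word),
        other => Token::Word(other),
    }
}

// Split the input on whitespace and classify each piece.
fn lex(input: &str) -> Vec<Token<'_>> {
    input.split_whitespace().map(classify).collect()
}

// The five fields described in the question.
#[derive(Debug, PartialEq)]
struct NetworkInfo<'a> {
    src_host: &'a str,
    src_port: &'a str,
    direction: &'a str,
    dst_host: &'a str,
    dst_port: &'a str,
}

// Find the first direction arrow and take the two words on each side of it.
// Returns None if there is no arrow or the neighbours are not plain words.
fn network_info<'a>(tokens: &[Token<'a>]) -> Option<NetworkInfo<'a>> {
    let arrow = tokens
        .iter()
        .position(|t| matches!(t, Token::Direction(_)))?;
    let start = arrow.checked_sub(2)?;
    match tokens.get(start..start + 5)? {
        [Token::Word(sh), Token::Word(sp), Token::Direction(d), Token::Word(dh), Token::Word(dp)] => {
            Some(NetworkInfo {
                src_host: *sh,
                src_port: *sp,
                direction: *d,
                dst_host: *dh,
                dst_port: *dp,
            })
        }
        _ => None,
    }
}
```

Calling lex on the input from this issue and passing the result to network_info should yield the two host/port pairs and the arrow, while Action and Protocol stay separate tokens. The same split (a small direction token plus a generic word token, grouped afterwards) could probably be expressed with Logos attributes as well, but that is untested here.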