jasonish / evebox

Web Based Event Viewer (GUI) for Suricata EVE Events in Elastic Search
https://evebox.org/
MIT License
418 stars 67 forks source link

Misparsed exceptions from options #176

Closed vorner closed 3 years ago

vorner commented 3 years ago

Hello

I'm using the evebox-suricata-rule-parser (without the rest of evebox) to parse some suricata rules. It seems the parsing doesn't account for exceptions/negations in the rule options, for example described here:

https://suricata.readthedocs.io/en/suricata-6.0.0/rules/payload-keywords.html#content

(The content:!"Firefox/3.6.13";).

The parser puts the exception mark inside the val, which is IMO wrong (though the documentation of suricata is not very formal in that regard). Putting it inside makes it impossible to make distinction between negated content check and content that shall start with a literal exclamation mark (eg. content: "!something" vs `content: !"something").

I wonder what the right way to fix this would be ‒ can we add another field inside the RuleOption that describes this?

fn main() {
    const RULE: &str = r#"alert http $HOME_NET any -> $EXTERNAL_NET any (msg:"Outdated Firefox on Windows"; content:"User-Agent|3A| Mozilla/5.0 |28|Windows|3B| "; content:"Firefox/3."; distance:0; content:!"Firefox/3.6.13"; distance:-10; sid:9000000; rev:1;)"#;
    let rule = evebox_suricata_rule_parser::parse_rule(RULE).unwrap();
    dbg!(rule);
}

This prints:

...
        RuleOption {
            key: "content",
            val: Some(
                "!Firefox/3.6.13",
            ),
        },
...
jasonish commented 3 years ago

As this little library is more of a tokenizer than anything else right now, I think the best thing to do is to retain the quotes so you'd end up with

...
        RuleOption {
            key: "content",
            val: Some(
                "!\"Firefox/3.6.13\"",
            ),
        },
...

I think this fits better with a larger parser that attempts to give meaning to each option. In my head, and a future version of the parser this might be converted to:

enum RuleOption {
    // msg:
    Message(String),
    // Content: unquoted value and whether to negate or not.
    Content { value: String, negate: bool },
    // Unknown option with raw value
    Other(String),
}

But until we get there I think leaving the quotes as-is might be the best fix. Thoughts?

vorner commented 3 years ago

As this little library is more of a tokenizer than anything else right now, I think the best thing to do is to retain the quotes so you'd end up with

I take the point of not assigning a meaning to things and it fits into what the library generally does. But having to deal with the quotes and such would be a bit of a step back. As a possible idea, how about extracting whatever is between the : and the actual value (or even with the :) into a separate field, like this? That would also be backwards compatible change, unlike the addition of quotes into the value.

RuleOption {
    key: "content",
    val: Some("Firefox/3.6.13"),
    separator: ":!",
}

But I wonder what the right solution for dsize:>268; should be, is the > part of the value, or of the separator? :thinking:

jasonish commented 3 years ago

I could see treating the stuff before the opening quote as a prefix (not really a separator). This would help with negation. However, something like dsize: >258, the >258 is what I'd call an expression and should not be subject to that, but instead would be evaluated by an expression parser somewhere else. So you'd get

    key: "content",
    val: Some("Firefox/3.6.13"),
    // In case of quoted fields, this is the data before the quotes.
    prefix: Some("!"),

Still unsure about this as what is here is intended to parse the structure of the rule, not the validity of the rule. I consider the : of the key/value separator structural, so it will be stripped from the output structs being a token separator.

I'm wondering if the RuleOption should become an enum sooner than later where each option has its own variant with fields that make sense for it. This comes with its own issues that I'm not sure I want to tackle quite yet.

vorner commented 3 years ago

I consider the : of the key/value separator structural, so it will be stripped from the output structs being a token separator.

That would work fine for my use case.

I'm wondering if the RuleOption should become an enum sooner than later where each option has its own variant with fields that make sense for it.

Like all the content and pcre and such would each be its own separate enum variant? I'd be afraid of the fact that there's a lot of them and they'll probably keep adding more over time.

jasonish commented 3 years ago

Yeah, so something like:

enum RuleOption {
    Generic(SomeStructForUnknown),
    Message(String),
    Content(ContentStruct),
    Pcre(...),
    Sid(...),

the main problem I see is its a lot of work to be exhaustive. And then when you take some field and migrate it from Generic to its own variant you might break users of the library unless you are exhaustive from the start. If we ever move the parsing code in Suricata to Rust we'll likely have an exhaustive enum there as its a source of truth type place. But no plans for that anytime soon.

I'll add this prefix type stuff real soon. Seems important to deal with.

vorner commented 3 years ago

If I'd like to move this forward, would it help if I send a pull request?

jasonish commented 3 years ago

If its the prefix, yes a PR would be useful. Thought I might get to it in the next few days as well.

jasonish commented 3 years ago

See https://github.com/jasonish/evebox/pull/182.

vorner commented 3 years ago

Thank you, yes, this looks great :-).

I'm looking forward for the next release.

jasonish commented 3 years ago

Merged into master. I'll likely do a publish to crates.io today as well.

Commit 0fcb38527b87200e5844ae7c59c42bbcbcce1665.