SigmaHQ / sigma

Main Sigma Rule Repository

Support for additional operators #218

Closed mbrancato closed 4 years ago

mbrancato commented 5 years ago

Currently, I believe the only supported operation is a condition where values are equal. I have use cases that could benefit from things like greater than / less than operators, or bitwise operators. This is likely a large change to Sigma since the YAML format doesn't have these operators.

Neo23x0 commented 5 years ago

Could you give us examples that would use these operators? Thanks it's unlikely that we're going to support these operators as

mbrancato commented 5 years ago

I have a number of use cases I'd like to use it for. Most deal with enrichment or other custom fields. To list some ideas:

  - A custom risk score from enrichment
  - DNS or flow size length comparison
  - Windowed analysis that performs any moving mathematical calculation (avg, sum, std, etc.). To work with most Sigma backends, something would need to generate an event, but I've written backends for purely stream-based tools that could benefit from this too.

AFAIK most standard backends support standard operators like greater than or less than: ArcSight, Splunk, Elasticsearch, QRadar. I can't think of one that doesn't, but I'm sure they exist.

Bitwise is probably an odd request, but so many fields seem to be hex values of masked settings. The goal was to gain insights into these fields instead of making an exhaustive list.
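To illustrate the bitwise case, here is a minimal Python sketch of testing one bit in a hex-encoded mask field instead of enumerating every value that contains it. The helper `has_flag` and the sample values are assumptions made for illustration; `0x0010` is the Windows `PROCESS_VM_READ` access right.

```python
# Illustrative sketch: helper name and sample data are hypothetical.
PROCESS_VM_READ = 0x0010  # Windows process access right


def has_flag(hex_value: str, flag: int) -> bool:
    """Return True if every bit of `flag` is set in the hex-encoded field."""
    return (int(hex_value, 16) & flag) == flag


# Sample field values as they might appear in a log (made up for this example).
granted = ["0x1410", "0x1000", "0x1fffff", "0x0400"]

# One bitwise test replaces an exhaustive list of matching hex strings.
matches = [g for g in granted if has_flag(g, PROCESS_VM_READ)]
print(matches)
```

Without a bitwise operator, a rule would have to list every mask value that happens to include the bit of interest.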

thomaspatzke commented 5 years ago

@Neo23x0 and I discussed a syntax some time ago (the fieldname@operator thing) that could be used to implement further operators. We will look at what makes sense to implement, as many target systems should support the operators offered by Sigma.

mbrancato commented 5 years ago

Could they be implemented on the condition line? I think it might be rather easy to support these in a similar manner to the and/or operators today. The biggest problem might be the generation of poorly performing permutations when the condition provides selections with lots of fields. That is more of an implementation issue when writing the rule.

jthack commented 5 years ago

Any news here @thomaspatzke? We'd like to do rules such as type: dns and packet_size > 512

thomaspatzke commented 5 years ago

No plans to implement such operators in the near future, as I'm concentrating my efforts on bug fixing, code quality and documentation. I would support it if someone plans to implement this.

thomaspatzke commented 5 years ago

I'm picking this issue up again to discuss the future development of further operators. Our contributor @christophetd started a discussion here and we're interested in further ideas and opinions:

In summary, there are currently the following alternatives:

  1. Extend the current value modifier syntax by further modifiers that express extended operators like field|len|gt: 100 to express len(field) > 100.
  2. Same as 1, but with a distinct syntax for comparison operators: field|len@gt: 100.
  3. Move the whole operation logic into the condition.
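As a rough illustration of alternative 1, the following Python sketch splits a key such as `CommandLine|len|gt` into a field name and a chain of modifiers, then evaluates it against a single event. The modifier tables and the function names `match_key` and `match_selection` are hypothetical and are not taken from the Sigma code base; alternative 2's `@` syntax is not handled.

```python
import operator

# Hypothetical modifier tables for illustration only.
TRANSFORMS = {"len": len}
COMPARISONS = {
    "gt": operator.gt,
    "lt": operator.lt,
    "gte": operator.ge,
    "lte": operator.le,
}


def match_key(event: dict, key: str, expected) -> bool:
    """Evaluate a key such as 'CommandLine|len|gt' against one event."""
    field, *modifiers = key.split("|")
    if field not in event:
        return False
    value = event[field]
    compare = operator.eq  # default when no comparison modifier is given
    for mod in modifiers:
        if mod in TRANSFORMS:
            value = TRANSFORMS[mod](value)   # e.g. len(field)
        elif mod in COMPARISONS:
            compare = COMPARISONS[mod]       # e.g. ">" instead of "=="
        else:
            raise ValueError(f"unknown modifier: {mod}")
    return compare(value, expected)


def match_selection(event: dict, selection: dict) -> bool:
    """All keys of a selection must match (implicit AND)."""
    return all(match_key(event, k, v) for k, v in selection.items())


event = {"EventID": 1, "CommandLine": "powershell " + "A" * 120}
selection = {"EventID": 1, "CommandLine|len|gt": 100}
print(match_selection(event, selection))
```

The condition stays a plain reference to the selection name in this approach; all operator logic lives in the field keys.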

Personally I prefer alternative 1 because it:

christophetd commented 5 years ago

In any case I'm afraid in the current state it would collide with the syntax of the newly released value modifiers; if I write field|base64: xy, how do you know if I want to express base64(field) == xy or field == base64(xy)?
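The two readings can be spelled out in a few lines of Python. This is only a sketch of the ambiguity, not of how Sigma evaluates modifiers; the function names `reading_a` and `reading_b` are made up for the example.

```python
import base64


def encode(text: str) -> str:
    """Base64-encode a string and return the result as text."""
    return base64.b64encode(text.encode()).decode()


def reading_a(field_value: str, rule_value: str) -> bool:
    """field == base64(xy): the modifier transforms the value written in the rule."""
    return field_value == encode(rule_value)


def reading_b(field_value: str, rule_value: str) -> bool:
    """base64(field) == xy: the modifier transforms the logged field."""
    return encode(field_value) == rule_value


# The same rule line gives different results depending on the reading.
print(reading_a("eHk=", "xy"), reading_b("eHk=", "xy"))
```

Under reading A the rule author writes the plain value and the tool encodes it; under reading B the author would have to write the encoded value.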

Regarding the best way to express it; I agree the first solution seems to be the easiest to implement, but I would argue that moving this to the condition looks more readable.

detection:
  execution:
    EventID: 1
  long_command_line:
    CommandLine|len|gt: 100
  condition: execution and long_command_line

# VS

detection:
  execution: 
    EventID: 1
  condition: execution and len(CommandLine) > 100 

After a bit of investigation, the code changes needed for that look pretty colossal though. If the project goes this way, it might make sense to try and rewrite the condition parser using a standard parsing library like Lark in order to be able to easily add support for more complex conditions - but again this represents quite a bit of work. I have the feeling that the current parser was initially meant to support a very simple syntax and was progressively augmented to support a more complex grammar, now making it hard to extend. What do you think?
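To give a sense of what a grammar-driven condition parser involves, here is a small recursive-descent sketch in plain Python. It does not use Lark and is not the existing Sigma parser; the grammar, the node shapes and the names `tokenize`, `Parser` and `evaluate` are all assumptions chosen for illustration.

```python
import operator
import re

TOKEN = re.compile(r"\s*(?:(\d+)|(\w+)|(>=|<=|>|<|\(|\)))")
OPS = {">": operator.gt, "<": operator.lt, ">=": operator.ge, "<=": operator.le}
FUNCS = {"len": len}  # hypothetical set of supported functions


def tokenize(text):
    tokens, pos, text = [], 0, text.rstrip()
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if not m:
            raise SyntaxError(f"unexpected input: {text[pos:]!r}")
        number, name, symbol = m.groups()
        if number is not None:
            tokens.append(("NUM", int(number)))
        elif name is not None:
            tokens.append(("NAME", name))
        else:
            tokens.append(("SYM", symbol))
        pos = m.end()
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens, self.pos = tokens, 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else (None, None)

    def take(self, kind=None, value=None):
        tok = self.peek()
        if (kind and tok[0] != kind) or (value is not None and tok[1] != value):
            raise SyntaxError(f"expected {value or kind}, got {tok[1]!r}")
        self.pos += 1
        return tok[1]

    def parse(self):
        node = self.expr()
        if self.peek()[0] is not None:
            raise SyntaxError(f"trailing input: {self.peek()[1]!r}")
        return node

    def binary(self, keyword, operand):  # left-associative "and" / "or" chains
        node = operand()
        while self.peek() == ("NAME", keyword):
            self.take()
            node = (keyword, node, operand())
        return node

    def expr(self):  # expr := term ("or" term)*
        return self.binary("or", self.term)

    def term(self):  # term := factor ("and" factor)*
        return self.binary("and", self.factor)

    def factor(self):  # "not" factor | "(" expr ")" | comparison | selection name
        if self.peek() == ("NAME", "not"):
            self.take()
            return ("not", self.factor())
        if self.peek() == ("SYM", "("):
            self.take()
            node = self.expr()
            self.take("SYM", ")")
            return node
        name = self.take("NAME")
        operand = ("field", name)
        if self.peek() == ("SYM", "("):  # function call such as len(CommandLine)
            self.take()
            operand = ("call", name, self.take("NAME"))
            self.take("SYM", ")")
        if self.peek()[0] == "SYM" and self.peek()[1] in OPS:
            return ("cmp", self.take(), operand, self.take("NUM"))
        if operand[0] == "call":
            raise SyntaxError("function call must be followed by a comparison")
        return ("selection", name)


def evaluate(node, event, selections):
    kind = node[0]
    if kind == "and":
        return evaluate(node[1], event, selections) and evaluate(node[2], event, selections)
    if kind == "or":
        return evaluate(node[1], event, selections) or evaluate(node[2], event, selections)
    if kind == "not":
        return not evaluate(node[1], event, selections)
    if kind == "selection":
        return all(event.get(k) == v for k, v in selections[node[1]].items())
    _, op, operand, number = node
    if operand[-1] not in event:
        return False
    value = event[operand[-1]]
    if operand[0] == "call":
        value = FUNCS[operand[1]](value)
    return OPS[op](value, number)


selections = {"execution": {"EventID": 1}}
tree = Parser(tokenize("execution and len(CommandLine) > 100")).parse()
event = {"EventID": 1, "CommandLine": "cmd.exe /c " + "A" * 200}
print(tree, evaluate(tree, event, selections))
```

Even this toy version needs a tokenizer, precedence rules and an evaluator, which is the kind of work a parsing library would take over from hand-written code.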

thomaspatzke commented 5 years ago

> In any case I'm afraid in the current state it would collide with the syntax of the newly released value modifiers; if I write field|base64: xy, how do you know if I want to express base64(field) == xy or field == base64(xy)?

I don't understand the intention of base64(field) == xy. Is xy Base64 encoded and you want to compare against the encoded value like in this rule? Then you use no modifier at all. But this was one reason for value modifiers: the possibility to express the plain value for better readability and let Sigma do the encoding stuff for you.

> Regarding the best way to express it; I agree the first solution seems to be the easiest to implement, but I would argue that moving this to the condition looks more readable.

There's even a shorter way to express the first alternative:

detection:
  execution:
    EventID: 1
    CommandLine|len|gt: 100
  condition: execution

This shows one of the strengths of Sigma: in most cases the implicit logical linking of the detection definitions does what's usually intended in detection writing and you end up with a very simple or even trivial condition. I also prefer the readability of a declarative definition over using the condition as a query. Further, I fear that this would end up as some kind of Sigma query language wrapped in a YAML container. What sense does it make to write the execution detection definition in the declarative form when you could also put everything into the condition? Something like this:

detection:
  condition: EventID = 1 and len(CommandLine) > 100 

In my opinion there are already languages like STIX Patterning or Endgame's EQL that go in this direction and offer rich detection capabilities. But remember that the convertibility of Sigma rules into all the supported target query languages results from its simplicity, and this is what makes it usable as a sharing language for log signatures. I don't believe that we can keep this property if we put too much stuff into Sigma. At least not with the development resources that are available to the Sigma project.

> I have the feeling that the current parser was initially meant to support a very simple syntax and was progressively augmented to support a more complex grammar, now making it hard to extend. What do you think?

You describe exactly how the condition parser evolved into what it is today :wink: Even if we finally decide to keep the condition language simple, this is something I would like to replace in the future. So thanks for the Lark pointer, it looks good for this purpose.