Improved Comparator and Modifier Handling

Hello! First off, this project looks great. Thanks for the work so far.

I'm curious if there is any appetite for:

External Contributions
Improvement on the comparator/modifier implementation.

I was evaluating this library for potential use but noticed that the comparison logic doesn't really comply with the Sigma specification. Specifically, it does not support the globbing patterns in standard field comparisons. I was also looking to see if we could customize this comparison logic because we have some internal requirements that may change the way rules are evaluated.

I'm currently working on some changes that would add a Comparator interface type which allows users of the library to implement their own comparison logic for a RuleEvaluator (and incidentally allow users to extend the library to support other modifiers or disallow certain modifiers). Right now, this looks something like this:

// Compare an expected value to a value found within an event
type CompareFunc func(value interface{}, expected interface{}) bool

type Comparator interface {
    // The base comparator with no modifiers
    Base(value interface{}, expected interface{}) bool
    // Retrieve the comparator for a given set of (possibly empty) modifiers.
    Get(modifiers []string) (CompareFunc, error)
}

With a default implementation which supports the same modifiers as the current implementation, but also adds support for globbing as defined by the spec. It is also a little more concrete, as it uses type-switching to verify the value and expected value types rather than farming out to fmt.Sprintf("%v") (which, imho, could produce unexpected/unwanted results for users who don't understand why/how their types are converting/matching/not matching).

In summary: Is there any appetite for potentially merging an interface like this? Or do you have any notes on the idea which may make the potential future PR more mergable from your perspective? I'm not super far into implementation or anything (basic structure is done, but I haven't done any serious testing yet). I just didn't see any contribution guidelines, and wanted to get a gut check before going to deep into the effort. Thanks! :)

Hey @calebstewart! In short, yes, external collaborators would definitely be welcomed :)

but noticed that the comparison logic doesn't really comply with the Sigma specification

Yeah the spec is so large (and sometimes vaguely defined) that I just tried to implement the smallest subset and expand it as we ran into missing features we needed.

The comparator implementation specifically is quite sketchy: I've currently written it as this chaining of functions system but it's not exactly clear to me if that's always valid (e.g. currently you can try to stack contains and re which won't make any sense)

There's also the new gt and lt operators which would be very useful to implement but will require some sort of type coercion

Are you able to share your use cases for custom comparators?

Sorry for the delay on getting back to you here!

I think the biggest custom comparator use-case I had was the case where an event field contains an array of data. I don't believe this is covered in the Sigma spec at all, but the logical solution that we use is that a field comparison with an array matches if any of the field elements match the given value (e.g. it's treated as an OR against all field elements). As an example, if we had an event that looked like:

{
  "field": [
    "first",
    "second"
    "third"
  ]
}

And a selector that looked like:

selector1:
  field: "second"

Then it would still match. I'm not sure if that's the same behavior everyone would expect, but that is how we have treated the situation. I imagine there are other corner-cases to the comparison that may be different depending on who is using the library, so that's why I suggested a custom comparison implementation.

What we have ended up doing is implementing a pseudo "compilation" layer on top of this package, which pre-compiles the expression(s) and search(es) to golang coroutines, and then simply executes the coroutines when evaluation is requested for an event. This has a couple benefits:

The compiler is flexible. It allows the user to override the core comparison logic prior to compilation as well as add or remove specific modifiers.
Patterns in field comparisons can be pre-compiled (e.g. regular expressions) to simplify evaluation.
It's faster since each search or expression results in a relatively small and simple coroutine with the static arguments/values/patterns already bound to the coroutine.
Allowed us to implement the other comparison features we needed (e.g. keyword matching, faster glob matching, and caseless default comparison).

In the end, we ended up not using the evaluator that is built-in to this library. I would love to eventually discuss merging the compiler upstream, but as it's part of an internal project currently, I'm not sure about the possibility or process there internally. I'll get back to you on that if it's still relevant once our project matures a bit and we can revisit that possibility.

In summary, I don't think it makes sense for me to attempt to implement what I discussed above due to the direction we ended up heading with our project, but I think giving a library user the flexibility to customize the evaluation of events against rules is still valuable. The Sigma specification is geared to the format and intent of the rule itself, and largely ignores direct evaluation since it focuses on converting the rules to a backend which has it's own concrete process for evaluation that differ between backends. With that in mind, the sigma-go evaluator has essentially become a custom "backend" from Sigma's point of view and can define it's own evaluation process separate from the specification (although in this case ideally close to it, obviously).

Ah that array use case is absolutely something we're using this library for at @monzo

To support it, sigma-go takes some liberties with Sigma's fieldmapping support and let's you use jsonpath expressions. So for your example you could have a config like so:

fieldmappings:
    field: $.field[*]

Which would then make your example rule work as expected. Internally this is essentially as if you flattened the array and had a config:

fieldmappings:
    field:
        - field_0
        - field_1
        - field_2
        - ....

In the end, we ended up not using the evaluator that is built-in to this library

Ah so you've got something built on top of the AST that this library outputs? Neat! I did wonder about an approach like that: evaluators optimised for speed in specific cases vs the current generic evaluator

Appreciate the challenges of open sourcing what you've implemented but, if you're able, I'd love to chat privately about what you've built! Sounds very cool. I'm FirstnameLastname@monzo.com if you're up for it 🙂

bradleyjkemp / sigma-go

Improved Comparator and Modifier Handling #21