markuskont / go-sigma-rule-engine

Golang library that implements a sigma log rule parser and match engine.
Apache License 2.0
92 stars 17 forks source link

Investigate go 1.18 generics to clean up type switches #12

Open markuskont opened 2 years ago

markuskont commented 2 years ago

With go 1.18 introducing generics, we should investigate if they could be used to clean up some type switches that were needed to deal with arbitrary types defined in Sigma rules and potential type mismatches when dealing with actual logs.

See: https://github.com/markuskont/go-sigma-rule-engine/blob/master/pkg/sigma/v2/ident.go#L150

Also, Selection object currently defines separate matchers for numeric and string types, which was also a hack around the Go type system colliding with arbitrary rule content.

https://github.com/markuskont/go-sigma-rule-engine/blob/master/pkg/sigma/v2/ident.go#L142

Perhaps the latter could also be cleaned up with new generics for a cleaner implementation. While the latter could also be handled with regular interfaces, I'd rather maintain compile time type safety there.

Marking issue as enhancement / question as I have not personally used the new generics yet and am thus unsure if using them would actually be an improvement. Some research is needed.

newodahs commented 2 years ago

I may take a look at this on Sunday as I've been wanting to dig into the new generics stuff anyway.

I will say the general performance of type assertions in golang is really good, so there is something to be weighed with any performance hit we take vs code readability (though I think the performance hit would have to be pretty egregious as it looks like the generics may clean this all up a bit)

markuskont commented 2 years ago

Yea, nothing against type assertions. The idea was to just clean up some nasty code. The first reference was a particular sore point as it's a lot of repeat code. Dealing with mismatched types from sigma rules and rule messages was particularly painful when first working on this project.

Each case basically does the same thing, yet types need to be checked separately. I think a single generic function with number constraint could basically do the same thing. I think that's exactly the use-case Go generics were made for. Regarding performance, I believe I saw a perf test somewhere with two approaches basically on par. A benchmark or few to validate it would not hurt though.

Second reference is more tricky. I'm not sure if the generics implementation would actually provide a meaningful improvement, or make sense for the use-case at all. I guess a generic Matcher (bottom level, one that actually checks a pattern against input) could be doable, but not sure if it's actually worth it. Or if they are even properly applicable.

newodahs commented 2 years ago

Early results follow, some raw thoughts:

So, looking at this a bit deeper, one of the issues we run into right away is that our values coming in from Select are all of type interface{}, which means we'll still need a type assertion before the call to a generic function, so I don't see getting a reduction/simplification unless we did some major refactoring I think.

markuskont commented 2 years ago

Ah, yes. Indeed. I don't mind a bigger refactor to be honest if it meant bringing in more type safety. But since the engine has to deal with arbitrary input from log messages then interface{} is not really avoidable. We could use any now which does result in much cleaner code, imo. But I don't want to bump the Go version requirement only for that.

Since some importing projects might not yet be bumped to go 1.18, then I don't want to update the required version too soon anyway. No rush.

Let's keep the discussion going here for any fundamental refactor ideas. I mean, it was always a research project anyway, so I'm curious on what can be done. Haven't had any time to test out my proposed ideas myself yet though. Might be able to tinker on them soon.

markuskont commented 2 years ago

I actually just realized that moving the lib to go 1.18 has potentially a big benefit - native fuzz testing.