Closed itaysk closed 2 years ago
This is also a good opportunity to move main.go
of tracee-ebpf to ./cmd/tracee-ebpf/main.go
as described in #1217
@mtcherni95 perhaps a good start is to review the existing code and discuss here which new packages you would like to create?
@mtcherni95 - I briefly looked into this, I think we'll need to add some tests in place before we can safely consider refactoring this part of the code. As of today, there is very little coverage for tracee.go
and main.go
.
My proposed plan of action is to break some of the hard to test pieces of the above mentioned files into logical units (functions) that are easier to test on their own. Furthermore, there is a sanity level of tests that exists today to verify some pieces https://github.com/aquasecurity/tracee/blob/main/tests/tracee_test.go - we definitely need more of such cases to verify all the business logic that will get refactored will be functionally correct.
Let us know what are your thoughts on this. Happy to discuss and chart a plan forward.
I identified these main semantic components concerning the tracee-ebpf engine, which from now on I will just be calling it "Tracer":
My plan was to create under pkg/tracer
a package for each of the previously stated items. The actual implementation of the engine would then be under pkg/tracer/tracer.go
Moreover, I believe would be convenient to create under pkg
the following packages:
The idea would be that eventually also tracee-rules could use the flags
and (if needed) the stats
packages.
Every single package will be then easily "unit-testable".
Of course, things might slightly change with the development, however I believe this is a pretty solid base line.
Let me know what you guys think.
where do you think makeBPFObject
and all of the accompanying code should reside (some parts of the makefile are re-implemented in Go)? I think a another package like makebpf
what are we doing with package external? and also I think the issue #1098 should also be included in this (moving the eventId tables to package external)
also, can you please elaborate on what is within each package from the list you shared (except Buckets Cache) which is self-describing?
also, can you please elaborate on what is within each package from the list you shared (except Buckets Cache) which is self-describing?
filters
package l though store the structs/interface associated - something like https://github.com/mtcherni95/tracee/blob/tracee-library/tracee-ebpf/engine/filters/filters.go (note that this is just an old branch, but gives the idea).config
package will implement the methods associated to the Config
struct, such as func (tc Config) Validate() error
- this reference should give an idea as well https://github.com/mtcherni95/tracee/blob/tracee-library/tracee-ebpf/engine/config/config.go
events
I thought of storing the Event
struct that is stored today under external
. However, after some more thinking, this could actually be under external/event
. flags
please see https://github.com/aquasecurity/tracee/pull/1236stats
- something like https://github.com/mtcherni95/tracee/blob/tracee-library/tracee-ebpf/engine/stats/counter.gomakebpf
package or bpfbuilder
which will implement all the helper functions as well such as checkclang()
and such.@mtcherni95 and I have spent some time to review the current code and make notes on packages. We didn't have time to review the entire code, so I'm pasting our incomplete notes. The goal is to create a list of actionable tasks that can be completed individually (Work on this cannot start before #1217 closes and we have the basic project structure).
main package: main.go contains these parts:
main
for now. but split in separate files: flags-filter.go
, flags-output.go
, flags-capture.go
- #1248 bpfbuilder
- #1239 capabilities
- printer.go - create new package eventprinter
. Factory function moves to main.go
tracee package:
bucketscache
- #1249stats
. external.Stats
will move to this package - #1250config
that includes filter, captureConfig ,outputConfig and related functions. (in the future maybe create an interface for config and implement validators)eventprinter
package (see above). remove dependency on tracee (can be a static function). replace parse
misnomer with print
(as the file name suggests).
- ebpf compilations - create new package
bpfbuilder
@yanivagman had some comments about removing the builder logic and letting users to use "builder container images" to have non-CORE built.
Here it is: https://github.com/aquasecurity/tracee/issues/1239
@mtcherni95 and I have spent some time to review the current code and make notes on packages. We didn't have time to review the entire code, so I'm pasting our incomplete notes. The goal is to create a list of actionable tasks that can be completed individually (Work on this cannot start before #1217 closes and we have the basic project structure).
main package: main.go contains these parts:
- cli interface - obviously stays untouched.
- flags handling - stay in package
main
for now. but split in separate files:flags-filter.go
,flags-output.go
,flags-capture.go
- ebpf compilations - create new package
bpfbuilder
- check capabilities utility function - create package
capabilities
The split up into different go source files seems reasonable to me but splitting up into a plethora of packages is something I'm trying to wrap my head around. I believe there is a use case to decouple the above mentioned point 3 and point 4 into their own respective packages because the logic they contain is re-used somewhere else? If so is the case, where are those call sites?
Not necessarily saying I disagree with this but would like to better understand how we can leverage splitting up into new packages the best.
- ebpf compilations - create new package
bpfbuilder
- check capabilities utility function - create package
capabilities
These two are the ones that can certainly benefit from the accept interfaces, return structs paradigm. This would help us test those cases were a real elevated system call is necessary to exercise certain behaviours of the function.
printer.go - create new package
eventprinter
. Factory function moves tomain.go
tracee package:
- need to eventually go away, broken down into packages
- external package will eventually go away, the structs will be moved to respective packages
+1. I've always seen external
as another utils
, which IMO is an anti-pattern as anything that gets put in there doesn't -really belong anywhere else.
I'll start to create issues for the obvious packages, and we can discuss the other ones in a call
The mindset we had in creating this proposal is that not only the project itself will use the previously stated new packages, but also a third party which is adopting Tracee as an engine and use it as a library. For instance, the capabilities
package could be a classic example for it: Tracee will provide the relevant exported function to check if the hypothetical application of a user has enough capabilities to load and run the Tracee BPF program.
Same thing could be said about the bpfbuilder
and such.
Continuing from https://github.com/aquasecurity/tracee/issues/960#issuecomment-992654757 we continued to review the tracee package.
2 high level recommendations:
tracing
instead of the current tracee
which will contain the core tracee-ebpf enginetracee
to tracer
@yanivagman already did most of the work in https://github.com/aquasecurity/tracee/pull/934, where he created files for the different pieces of functionality. there are just minor improvements to make:
containers
. The code is already independent of traceeevents_decoder.go
- rename to events_decoding.go
events_processor.go
- rename to event_processing.go
. CopyFileByPath
function should move to some utlis package or duplicated.events.go
- processEvents
should be minimal as possible, there is some code here that should be in processEvent
.filters.go
- create new package tracingfilters
(consider the usage of pointers vs values in Filter
)We still haven't looked into tarcee.go
and consts.go
which is probably the heavier work
2 high level recommendations:
create new package tracing instead of the current tracee which will contain the core tracee-ebpf engine rename the struct tracee to tracer
Keeping in mind that we will want to add enforcement capabilities in the future, I think we should wait with using a name such as tracing for the package and tracee for the struct. Having an additional "enforcing" package will have a lot in common with the suggested "tracing" package, and I'm not sure it will be the best choice to (try to) decouple the logic of these two. We can reconsider this after extracting the "obvious" packages out of tracee.go.
Rethinking about some of the packages suggested here, I agree with @simar7 that
The split up into different go source files seems reasonable to me but splitting up into a plethora of packages is something I'm trying to wrap my head around.
If not for code reuse, why do we want to create packages if we already have the logic split into different files?
Before creating any new package, I think we should ask ourselves - can this package be used in different places? Is the logic in the package self contained? Will it be enough to just keep it in a separate file? Only after we have a good reason for creating a new package, we should start with its implementation.
So please, anyone that takes one of the new opened issues for extracting a package - please explain first why you think we should create this package
Rethinking about some of the packages suggested here, I agree with @simar7 that
The split up into different go source files seems reasonable to me but splitting up into a plethora of packages is something I'm trying to wrap my head around.
If not for code reuse, why do we want to create packages if we already have the logic split into different files?
I don't see a good reason to have "filters" package, as it has no meaning outside of the "tracee" package - The filters are tightly coupled with the bpf maps defined and initialized in the tracee packge. Having the filters.go file is already a good separation of the logic, and can probably stay in the same package with other files.
Same claim for "containers"- how will we benefit from having it in a different package?
Before creating any new package, I think we should ask ourselves - can this package be used in different places? Is the logic in the package self contained? Will it be enough to just keep it in a separate file?
Only after we have a good reason for creating a new package, we should start with its implementation.
So please, anyone that takes one of the new opened issues for extracting a package - please explain first why you think we should create this package
Just to add from a call I had with @itaysk: We need to first decide if the "packages" we're talking about here will be "real packages" that is, placed under something like pkg
directory and importable externally.
Or if we just want them to be logical units of code within the Tracee project that is, available under something like internal
and not available to anyone outside. The Go compiler ensures that this is the case for all internal
packages.
IMO, once something is declared externally available once, a public contract is established and we must be careful in honouring that contract going forwards. So unless there's a strong need for such packages to be externally available and used (hence my question about callsites) it might be worth thinking about it again.
tracing for the package and tracee for the struct
it's tracing and tracer (not tracee)
why do we want to create packages if we already have the logic split into different files?
IMO - encapsulation (ignore the OOP-specific details, the concept is still relevant here). having all the code and all the functions and all the variables in one "namespace" is messy, and can lead to spaghetti code. If we can achieve the same goal with files, or with structs, or something else, that's fine too I think.
Another reason is to reduce the complexity surface exposed to external consumers, or to allow to consume specific components when needed instead of having to import one huge namespace of objects.
tracing for the package and tracee for the struct
it's tracing and tracer (not tracee)
Yes, just a typo of mine
why do we want to create packages if we already have the logic split into different files?
IMO - encapsulation (ignore the OOP-specific details, the concept is still relevant here). having all the code and all the functions and all the variables in one "namespace" is messy, and can lead to spaghetti code. If we can achieve the same goal with files, or with structs, or something else, that's fine too I think.
Another reason is to reduce the complexity surface exposed to external consumers, or to allow to consume specific components when needed instead of having to import one huge namespace of objects.
Do we want "filters" and "containers" packages to be exposed to external consumers? Or is it just the configuration of those (for which I agree we should have a config package)
I don't know and I agree with what's been said that it should be the decided/discussed when actually working on the issue. it's ok to close the issue with just moving the code into a file in the same package. but just to clarify, package doesn't necessarily mean external. it might make sense to package some code in a package (and as simar suggests make it internal)
I was having issues compiling tracee using the containers with the Makefile.
While running the make -f builder/Makefile.tracee-make alpine-make ARG="clean all"
I receive an error:
the input device is not a TTY
I believe that the issue is related to the Makefile refactor.
today, most of tracee-ebpf is two huge packages: main and tracee. split tracee-ebpf's Go code into multiple packages. each package is self-contained and testable. think hard about the names and APIs of those packages.
Some ideas:
tracing
instead of the currenttracee
which will contain the core tracee-ebpf enginetracee
totracer
main package:
main
for now. but extract frommain.go
- #1248capabilities
-eventprinter
. Factory function moves tomain.go
- #1266tracee package:
bucketscache
- #1249stats
.external.Stats
will move to this package - #1250config
that includes filter, captureConfig ,outputConfig and related functions. (also create an interface for config and implement validators) - #1264argprinters.go
- move toeventprinter
package (see above). remove dependency on tracee (can be a static function). replaceparse
misnomer withprint
(as the file name suggests). - #1265containers
. The code is already independent of tracee -events
- #1268events_decoder.go
- create new packageevents
. rename toevents_decoding.go
events_processor.go
- create new packageevents
. rename toevent_processing.go
.CopyFileByPath
function should move to some utlis package or duplicated.events.go
- create new packageevents
processEventsshould be minimal as possible, there is some code here that should be in
processEvent`.filters.go
- create new packagetracingfilters
(consider the usage of pointers vs values inFilter
) - #1269TBD -
tarcee.go
andconsts.go
related: #934, #789, #1131, #1098