falcosecurity / falco

Cloud Native Runtime Security
https://falco.org
Apache License 2.0
7.16k stars 884 forks source link

[New Feature]: User flag for `base_syscalls.repair` as extension to `base_syscalls` config overriding default sinsp state enforcement #2433

Closed incertum closed 1 year ago

incertum commented 1 year ago

Context / Bigger Picture

Thanks to @Happy-Dude feedback realized to perhaps first set the stage, please consider reading https://github.com/falcosecurity/falco/issues/2433#issuecomment-1447620920 before continuing reading below as we are diving right into the technical details of what we currently call "Option 3".

Motivation

At the time of writing Falco's default libsinsp state enforcement activates a static set of system calls [see additional note 1] that are associated with the EF_MODIFIES_STATE flag in the libs event_table. The entirety of these system calls are activated regardless of the system calls used in Falco rules.

Compute overhead budgets typically are not unlimited plus Falco's rules event evaluation runs single threaded and eventually drops events kernel side once one entire CPU is consumed. Therefore, the default libsinsp state enforcement can have severe production degradation impacts, especially for configurations that do not activate most system calls and/or when running Falco on very active servers w/ 96 CPUs (see https://github.com/falcosecurity/libs/issues/855).

We introduced base_syscalls (see https://github.com/falcosecurity/falco/issues/2373) (scheduled for 0.35 release, see https://github.com/falcosecurity/falco/commit/d6421d4e67fd4a058ccb3b448291767f6b5a6992) which provides the end user with a config option to pass their own set of syscalls that will be activated in addition to the system calls from each Falco rule. The recommendations given for this new option are based on the concept of adaptive_base_syscalls introduced here https://github.com/falcosecurity/libs/pull/826.

The question of what the system calls are that are truly needed for Falco's state engine given the rules sets remains open. Plus ideally there would also be such an automated adaptive_base_syscalls in addition to base_syscalls so that end users do not have to keep a manual set of base syscalls correct and can rely on automatic resourceful syscalls activation that ensures that both (1) data is correct and (2) state engine remains healthy.

The intended outcome of this issue is to derive a clear understanding of what the system call dependencies of libsinsp are given x input filter system calls.

Notes:

incertum commented 1 year ago

Describing current approach of https://github.com/falcosecurity/libs/pull/826

Regardless of the input filter system calls a fair approach seems to activate all system calls that are explicitly used to build the core sinsp state engine, that is, the process or thread cache table. Code review of parsers surfaced that all below system calls can add or alter the table plus these are typically more low volume system calls comparatively.

libsinsp::events::set<ppm_sc_code> adaptive_sinsp_state_sc_set = {
        PPM_SC_CLONE,
        PPM_SC_CLONE3,
        PPM_SC_FORK,
        PPM_SC_VFORK,
        PPM_SC_EXECVE,
        PPM_SC_EXECVEAT,
        PPM_SC_FCHDIR,
        PPM_SC_CHDIR,
        PPM_SC_CHROOT,
        PPM_SC_CAPSET,
        PPM_SC_SETGID,
        PPM_SC_SETGID32,
        PPM_SC_SETPGID,
        PPM_SC_SETRESGID,
        PPM_SC_SETRESGID32,
        PPM_SC_SETRESUID,
        PPM_SC_SETRESUID32,
        PPM_SC_SETSID,
        PPM_SC_SETUID,
    };

Now the conditioning by input filter syscalls starts:

Lastly, it is worth mentioning that Falco also hooks into the scheduler process exit and the associated procexit event takes care of removing threads / processes from the state engine again -> Thereforce, so this is not dependent on any system call.

incertum commented 1 year ago

In aboves approach https://github.com/falcosecurity/libs/pull/826 what else are we missing?

Below is a list of the static set of system calls that are associated with the EF_MODIFIES_STATE flag in the libs event_table that the new adaptive approach does NOT consider a libsinsp state dependency.

We could all collectively review the parser? Would appreciate help from the community. Thanks a bunch in advance!

PPM_SC_ACCEPT,
PPM_SC_ACCEPT4,
PPM_SC_CONNECT,
PPM_SC_CREAT,
PPM_SC_DUP,
PPM_SC_DUP2,
PPM_SC_DUP3,
PPM_SC_EVENTFD,
PPM_SC_EVENTFD2,
PPM_SC_FCNTL,
PPM_SC_FCNTL64,
PPM_SC_FORK,
PPM_SC_INOTIFY_INIT,
PPM_SC_INOTIFY_INIT1,
PPM_SC_IO_URING_SETUP,
PPM_SC_MOUNT,
PPM_SC_OPEN,
PPM_SC_OPEN_BY_HANDLE_AT,
PPM_SC_OPENAT,
PPM_SC_OPENAT2,
PPM_SC_PIPE,
PPM_SC_PIPE2,
PPM_SC_PRLIMIT64,
PPM_SC_RECVFROM,
PPM_SC_RECVMSG,
PPM_SC_SENDMSG,
PPM_SC_SENDTO,
PPM_SC_SETRLIMIT,
PPM_SC_SHUTDOWN,
PPM_SC_SIGNALFD,
PPM_SC_SIGNALFD4,
PPM_SC_SOCKETPAIR,
PPM_SC_TIMERFD_CREATE,
PPM_SC_UMOUNT2,
PPM_SC_USERFAULTFD,
PPM_SC_VFORK,
PPM_SC_EPOLL_CREATE,
PPM_SC_EPOLL_CREATE1,
happy-dude commented 1 year ago

Hey @incertum -- awesome work and excited to see the progress!

There's some new terminology being used here that I'm not exactly caught up with.

From reading the related links in the description,

Can I get a bit of clarification regarding:

incertum commented 1 year ago

Hey @incertum -- awesome work and excited to see the progress!

There's some new terminology being used here that I'm not exactly caught up with.

Absolutely, trying my best to help clarify and please feel free to ask more follow up questions. This is exactly the type of feedback we need to improve this feature! Thanks a bunch!

Re the wording open to change adaptive to something else ...

From reading the related links in the description,

  • base_syscalls allows the user to pass on their own set of syscalls to activate

It allows the end user to tell Falco to also activate system calls that are not defined in any Falco rule. The end result will be all system calls from each Falco rules plus the ones passed over base_syscalls. This puts you 100% in control to tell Falco what to trace in the kernel.

  • new default (per [new(libsinsp): feature adaptive base syscalls option in ppm sc API libs#826 (comment)]

adaptive base syscalls option would be opt-in, the current default is adding a hard-coded static set of system calls to the system calls from each Falco rule. So no matter how you change Falco rules, this static set will always be activated in the kernel.

  • -A flag will enable heavy syscalls (still adaptive)

-A flag is unrelated and was introduced as a safeguard to remove I/O system calls unless you explicitly allow them, perhaps the latest related Falco PR helps clarifying this feature more? See @jasondellaluce write up here

Can I get a bit of clarification regarding:

  • what adaptive means in this context?
  • why being "sinsp state compliant" is important?
  • how this yields performance improvements? (a before/after comparison of how Falco is tracing syscalls would probably help me here.)

Please let me try a fresh explanation in a follow up comment for all these.

incertum commented 1 year ago

What does Falco <= 0.34.x do?

Falco <= 0.34.x does not offer much configurability wrt to system calls. The work Falco does in the kernel is either a static set of hard-coded system calls or all system calls (old -A flag behavior) regardless of Falco rules.

What is new in Falco >= 0.35?

Falco now gives end users control over how much work Falco does in the kernel. This can directly influence CPU utilization and possible kernel side event drops on servers with a lot of activity. In summary, you can control Falco's resource utilization in ways you could not do before.


With Falco >= 0.35 there are several options and the end user can choose the option that best fits the use case at hand.

Option 1: The default option -> use if you do not need Option 2 or Option 3

Option 2: base_syscalls user config (set) -> use with caution

Option 3: adaptive_base_syscalls user config (bool) -> use if you want Falco to be more resourceful and smarter in an automated manner aka the Falco "only takes what you can eat" option (what this ticket is about)

What is Falco's state engine:

Falco creates a struct in memory to cache all processes so that you can for example look up parent processes. Internally Falco calls this the thread table and there are more details to it. Some system calls can modify the information of a process in that struct, e.g. setuid. Then there are other cases more related to file descriptors. For instance, network system calls like connect do not know about ips on their own, the place to retrieve ip tuples is the socket system calls. Furthermore, the process table in memory also keeps track of fds, so you would want to monitor the close system call as well so that the state engine knows to remove the fd from memory ...

Scenarios:

happy-dude commented 1 year ago

Regarding a name suggestions for adaptive_base_syscalls, I was wondering if there was something that can convey that it is related to building the Falco sinsp state engine 🤔 ?

Maybe sinsp_state_syscalls? sinspful_syscalls ?

incertum commented 1 year ago

Plus we would need to convey that this is a boolean option that reasons automatically and as mentioned above follows the principle of "only take what you can eat", which also aligns with https://github.com/cncf/tag-env-sustainability (tag environmental sustainability). So it could be regarded as the best of Option 1 and Option 2 sort of ... also still thinking about a name that conveys all this better. Will post again once I have ideas.

Since I mention tag-env-sustainability, perhaps it would be of interest to tag up with them on the tough challenges we also face with the single thread issues and the security monitoring degradation implications Ali pointed out? But if so let's quickly pivot to a new issue for that.

jasondellaluce commented 1 year ago

Since I mention tag-env-sustainability, perhaps it would be of interest to tag up with them on the tough challenges we also face with the single thread issues and the security monitoring degradation implications Ali pointed out? But if so let's quickly pivot to a new issue for that.

I think this would be very cool! I love their initiative and I think our case would be a good candidate for tagging up with them. @incertum, would you like to drive this?

jasondellaluce commented 1 year ago

Regarding a name suggestions for adaptive_base_syscalls, I was wondering if there was something that can convey that it is related to building the Falco sinsp state engine 🤔 ?

My proposal would be to call the flag base_syscalls_repair, because effectively we interact with libsinsp to “fix-up” the custom set making sure it contains some necessary syscalls and prevent mistakes. Alternatively (if you like), we could rework the Falco config field (since we haven’t released it yet), to be:

base_syscalls:
  repair: true
  custom_set: [...]
incertum commented 1 year ago

My proposal would be to call the flag base_syscalls_repair, because effectively we interact with libsinsp to “fix-up” the custom set making sure it contains some necessary syscalls and prevent mistakes. Alternatively (if you like), we could rework the Falco config field (since we haven’t released it yet), to be:

base_syscalls:
  repair: true
  custom_set: [...]

I like this a lot, because it also adds the option to actually fix up the custom_set which I hadn't thought of. Furthermore, if we do something when repair is true, but custom_set is empty we would effectively get the what I just keep calling "Option 3" here in this ticket.

Under the hood in libs I can then rename things accordingly as well.

@Happy-Dude what are your thoughts?

happy-dude commented 1 year ago

@Happy-Dude what are your thoughts?

I'm a fan of this suggestion! Makes the configuration cleaner/clearer.

By doing this, I think it also affords itself to less misconfiguration by users using the single setting (as opposed to the original 2 config options/variables).

incertum commented 1 year ago

v1 of this feature has been finalized.