falcosecurity / event-generator

Generate a variety of suspect actions that are detected by Falco rulesets
Apache License 2.0

fix: Enhance Falco syscall events triggering and reliability #220

Closed prezha closed 1 week ago

prezha commented 2 weeks ago

foreword/background: i wanted to use falco event-generator for stability/regression testing, but hit some issues that this pr aims to fix. i appreciate this is a bigger pr with lots of changes, as i did not have time to break it down into individual commits/PRs. the intention here is to offer the improvements i made, and i'm of course happy to provide any additional clarification needed.

changelog:

bonus:

testing performed

/kind bug
/kind cleanup
/kind tests

/area events

poiana commented 2 weeks ago

Welcome @prezha! It looks like this is your first PR to falcosecurity/event-generator 🎉

leogr commented 2 weeks ago

Hey @prezha

Thank you for this contribution! :star_struck: The review will take a bit of time due to the size of this PR, but I wanted to let you know that we will take a look.

:pray:

alacuku commented 2 weeks ago

Hey @prezha, thanks for the contribution! I will take a look in the coming days!

prezha commented 2 weeks ago

hey @leogr and @alacuku , you are most welcome and thank you both for considering the pr, i appreciate it!

prezha commented 2 weeks ago

hey @leogr and @alacuku, just a heads up: after some additional testing i noticed that the event-generator creates an increasing number of zombie processes (which can eventually create issues in the loop mode), so i amended the pr to prevent that

this is due to how we handle killing the sleep process in PtraceAntiDebugAttempt() and PtraceAttachedToProcess() (where we also need to wait for the killed process to exit before moving on), as well as how we limit command execution time using timeout (which i have now replaced with context.WithTimeout() and exec.CommandContext(), so the commands are handled properly)

leogr commented 2 weeks ago

Tried with Falco 0.38.3

$ sudo ./event-generator test syscall
INFO sleep for 100ms                               action=syscall.PacketSocketCreatedInContainer
WARN action skipped                                action=syscall.PacketSocketCreatedInContainer reason="only applicable to containers"
INFO sleep for 100ms                               action=syscall.CreateHardlinkOverSensitiveFiles
INFO action executed                               action=syscall.CreateHardlinkOverSensitiveFiles
INFO test passed                                   action=syscall.CreateHardlinkOverSensitiveFiles rule="Create Hardlink Over Sensitive Files" source=syscall
INFO sleep for 100ms                               action=syscall.NetcatRemoteCodeExecutionInContainer
WARN action skipped                                action=syscall.NetcatRemoteCodeExecutionInContainer reason="only applicable to containers"
INFO sleep for 100ms                               action=syscall.FindAwsCredentials
INFO action executed                               action=syscall.FindAwsCredentials
INFO test passed                                   action=syscall.FindAwsCredentials rule="Find AWS Credentials" source=syscall
INFO sleep for 100ms                               action=syscall.MountLaunchedInPrivilegedContainer
WARN action skipped                                action=syscall.MountLaunchedInPrivilegedContainer reason="only applicable to containers"
INFO sleep for 100ms                               action=syscall.CreateSymlinkOverSensitiveFiles
INFO action executed                               action=syscall.CreateSymlinkOverSensitiveFiles
INFO test passed                                   action=syscall.CreateSymlinkOverSensitiveFiles rule="Create Symlink Over Sensitive Files" source=syscall
INFO sleep for 100ms                               action=syscall.RemoveBulkDataFromDisk
INFO action executed                               action=syscall.RemoveBulkDataFromDisk
INFO test passed                                   action=syscall.RemoveBulkDataFromDisk rule="Remove Bulk Data from Disk" source=syscall
INFO sleep for 100ms                               action=syscall.DisallowedSSHConnectionNonStandardPort
ERRO action error                                  action=syscall.DisallowedSSHConnectionNonStandardPort error="context deadline exceeded"
INFO sleep for 100ms                               action=syscall.DropAndExecuteNewBinaryInContainer
WARN action skipped                                action=syscall.DropAndExecuteNewBinaryInContainer reason="only applicable to containers"
INFO sleep for 100ms                               action=syscall.RunShellUntrusted
INFO spawn as "httpd"                              action=syscall.RunShellUntrusted args="^helper.RunShell$"
INFO sleep for 100ms                               action=helper.RunShell as=httpd
INFO action executed                               action=helper.RunShell as=httpd
INFO test passed                                   action=syscall.RunShellUntrusted rule="Run shell untrusted" source=syscall
INFO sleep for 100ms                               action=syscall.LaunchSuspiciousNetworkToolOnHost
WARN action skipped                                action=syscall.LaunchSuspiciousNetworkToolOnHost reason="nmap executable file not found in $PATH"
INFO sleep for 100ms                               action=syscall.ClearLogActivities
INFO action executed                               action=syscall.ClearLogActivities
INFO test passed                                   action=syscall.ClearLogActivities rule="Clear Log Activities" source=syscall
INFO sleep for 100ms                               action=syscall.DebugfsLaunchedInPrivilegedContainer
WARN action skipped                                action=syscall.DebugfsLaunchedInPrivilegedContainer reason="only applicable to containers"
INFO sleep for 100ms                               action=syscall.ExecutionFromDevShm
INFO action executed                               action=syscall.ExecutionFromDevShm
INFO test passed                                   action=syscall.ExecutionFromDevShm rule="Execution from /dev/shm" source=syscall
INFO sleep for 100ms                               action=syscall.FilelessExecutionViaMemfdCreate
INFO action executed                               action=syscall.FilelessExecutionViaMemfdCreate
INFO test passed                                   action=syscall.FilelessExecutionViaMemfdCreate rule="Fileless execution via memfd_create" source=syscall
INFO sleep for 100ms                               action=syscall.ReadSensitiveFileUntrusted
INFO action executed                               action=syscall.ReadSensitiveFileUntrusted
INFO test passed                                   action=syscall.ReadSensitiveFileUntrusted rule="Read sensitive file untrusted" source=syscall
INFO sleep for 100ms                               action=syscall.JavaProcessClassFileDownload
INFO spawn as "java"                               action=syscall.JavaProcessClassFileDownload args="^helper.CombinedServerClient$"
INFO sleep for 100ms                               action=helper.CombinedServerClient
ERRO action error                                  action=syscall.JavaProcessClassFileDownload error="context deadline exceeded"
INFO sleep for 100ms                               action=syscall.DirectoryTraversalMonitoredFileRead
INFO action executed                               action=syscall.DirectoryTraversalMonitoredFileRead
INFO test passed                                   action=syscall.DirectoryTraversalMonitoredFileRead rule="Directory traversal monitored file read" source=syscall
INFO sleep for 100ms                               action=syscall.DetectReleaseAgentFileContainerEscapes
WARN action skipped                                action=syscall.DetectReleaseAgentFileContainerEscapes reason="only applicable to containers"
INFO sleep for 100ms                               action=syscall.PotentialLocalPrivilegeEscalationViaEnvironmentVariablesMisuse
INFO action executed                               action=syscall.PotentialLocalPrivilegeEscalationViaEnvironmentVariablesMisuse
ERRO action error                                  action=syscall.PotentialLocalPrivilegeEscalationViaEnvironmentVariablesMisuse error="context deadline exceeded"
INFO sleep for 100ms                               action=syscall.ReadSensitiveFileTrustedAfterStartup
INFO spawn as "httpd"                              action=syscall.ReadSensitiveFileTrustedAfterStartup args="^syscall.ReadSensitiveFileUntrusted$ --sleep 6s"
INFO sleep for 6s                                  action=syscall.ReadSensitiveFileUntrusted as=httpd
INFO action executed                               action=syscall.ReadSensitiveFileUntrusted as=httpd
INFO test passed                                   action=syscall.ReadSensitiveFileTrustedAfterStartup rule="Read sensitive file trusted after startup" source=syscall
INFO sleep for 100ms                               action=syscall.SystemUserInteractive
INFO run as "daemon"                               action=syscall.SystemUserInteractive cmdArgs="[]" cmdName=/bin/login user=daemon
INFO test passed                                   action=syscall.SystemUserInteractive rule="System user interactive" source=syscall
INFO sleep for 100ms                               action=syscall.PtraceAntiDebugAttempt
INFO action executed                               action=syscall.PtraceAntiDebugAttempt
INFO test passed                                   action=syscall.PtraceAntiDebugAttempt rule="PTRACE anti-debug attempt" source=syscall
INFO sleep for 100ms                               action=syscall.PtraceAttachedToProcess
INFO test passed                                   action=syscall.PtraceAttachedToProcess rule="PTRACE attached to process" source=syscall
INFO sleep for 100ms                               action=syscall.SearchPrivateKeysOrPasswords
INFO action executed                               action=syscall.SearchPrivateKeysOrPasswords
INFO test passed                                   action=syscall.SearchPrivateKeysOrPasswords rule="Search Private Keys or Passwords" source=syscall

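I got just 3 failures:

  • action=syscall.DisallowedSSHConnectionNonStandardPort error="context deadline exceeded"
  • action=syscall.JavaProcessClassFileDownload error="context deadline exceeded"
  • action=syscall.PotentialLocalPrivilegeEscalationViaEnvironmentVariablesMisuse error="context deadline exceeded"

Any clue for these?

Everything else worked as expected 🚀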

prezha commented 2 weeks ago

thanks @leogr for the review! i'll fix the file names

as for the failures you got:

Tried with Falco 0.38.3

$ sudo ./event-generator test syscall
INFO sleep for 100ms                               action=syscall.PacketSocketCreatedInContainer
...

I got just 3 failures:

  • action=syscall.DisallowedSSHConnectionNonStandardPort error="context deadline exceeded"
  • action=syscall.JavaProcessClassFileDownload error="context deadline exceeded"
  • action=syscall.PotentialLocalPrivilegeEscalationViaEnvironmentVariablesMisuse error="context deadline exceeded"

Any clue for these?

yep, i think it's because the default 100ms "sleep" is sometimes not sufficient to capture the event falco emits (and match it to the expected one in the test). adding --sleep=1s worked reliably for me in my local tests (in minikube), can you please try with that?

prezha commented 2 weeks ago

i've renamed all suggested files and a few others for consistency

prezha commented 1 week ago

just realized that the gh workflows also need an updated go version, which i added in the last commit (501ed2b)

leogr commented 1 week ago

yep, i think it's because the default 100ms "sleep" is sometimes not sufficient to capture the event falco emits (and match it to the expected one in the test). adding --sleep=1s worked reliably for me in my local tests (in minikube), can you please try with that?

Unfortunately, a longer --sleep did not work for me (I'm using lima on Apple Silicon). I guess the sleep value can't be the root issue, since it only affects the sleeping time between different actions. I've also tried using --test-timeout=2m, but those three tests fail anyway.

That said, I believe we can dig into this issue in a follow-up PR. This is already great and big enough :sweat_smile:

poiana commented 1 week ago

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: leogr, prezha

Needs approval from an approver in each of these files:
- ~~[OWNERS](https://github.com/falcosecurity/event-generator/blob/main/OWNERS)~~ [leogr]

Approvers can indicate their approval by writing `/approve` in a comment
Approvers can cancel approval by writing `/approve cancel` in a comment

poiana commented 1 week ago

LGTM label has been added.

Git tree hash: 7df09df584861eb1b1c2442fb8f0874d6f958de3