falcosecurity / libs

libsinsp, libscap, the kernel module driver, and the eBPF driver sources
https://falcosecurity.github.io/libs/
Apache License 2.0

sinsp-example inconsistent results #794

Closed rcohencyberarmor closed 1 year ago

rcohencyberarmor commented 1 year ago

Hi All,

I'm using sinsp-example to monitor all syscalls of a simple redis container running on k8s (on minikube), and I get inconsistent results. Sometimes some syscalls are missing and sometimes all of them are found. I compared the sinsp-example results with strace results and saw that the missing syscalls are ones the redis processes call infrequently (uname, lseek, etc.).

  1. get the list of all redis syscalls using strace (strace -f -c docker run -it redis &> strace-redis-result-file)
  2. compile sinsp-example and the bpf driver
  3. run minikube
  4. run sinsp-example with the following command: sudo ./build/libsinsp/examples/sinsp-example -a -b ./build/driver/bpf/probe.o &> to-some-file
  5. create a simple redis deployment on k8s: kubectl create deploy --image=redis redis, then wait for 60 seconds
  6. filter the results in to-some-file by the redis container ID
  7. compare the filtered results with the strace results
  8. go back to step 4 and repeat, let's say, 10 times

Step #6 needs to succeed in all 10 runs.
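
The comparison in step 7 can be sketched as a set difference. This is a minimal Python sketch, assuming the usual `strace -c` summary table layout (syscall name in the last column) and assuming the syscall names seen by sinsp-example have already been extracted into a set. `strace_syscalls` and `missing_syscalls` are hypothetical helper names, not part of the libs, and the sample table is made up.

```python
import re

# One row of the `strace -c` summary table: % time, seconds, usecs/call,
# calls, an optional errors column, then the syscall name.
ROW = re.compile(
    r"^\s*\d+[.,]\d+\s+\d+[.,]\d+\s+\d+\s+\d+\s+(?:\d+\s+)?([A-Za-z_]\w*)\s*$"
)


def strace_syscalls(summary_text):
    """Return the set of syscall names listed in a `strace -c` summary."""
    names = set()
    for line in summary_text.splitlines():
        match = ROW.match(line)
        # Skip the trailing "total" row if it happens to match the pattern.
        if match and match.group(1) != "total":
            names.add(match.group(1))
    return names


def missing_syscalls(summary_text, observed):
    """Syscalls that strace reported but the capture did not contain."""
    return strace_syscalls(summary_text) - set(observed)


if __name__ == "__main__":
    # Made-up sample data, only to show the call shape.
    sample = """\
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 60.00    0.000060          10         6           read
 30.00    0.000030          15         2         1 openat
 10.00    0.000010          10         1           uname
------ ----------- ----------- --------- --------- ----------------
100.00    0.000100                     9         1 total
"""
    print(sorted(missing_syscalls(sample, {"read", "openat"})))
```

Lines that do not look like table rows (for example the redis log output that `&>` also writes into the file) are ignored by the pattern.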

[image]

In the image above, the syscalls listed between the attempts are the missing ones.

Kernel version: 5.15.0-48-generic
Linux distribution: ~20.04.1-Ubuntu
Using the latest falco libs.

slashben commented 1 year ago

I suspect that the ring buffer overruns here, but it is not clear how I should configure the libs to avoid this.

Andreagit97 commented 1 year ago

@slashben @rcohencyberarmor, looking at the issue, it seems that in some runs you hit event drops: the buffers fill up and you miss some events. If you call the example like this,

 sudo ./build/libsinsp/examples/sinsp-example -a -b ./build/driver/bpf/probe.o &> to-some-file

it is very likely to drop events, since it prints every single event it sees into your file. Printing in userspace consumes a lot of userspace time, so nobody is reading the buffers in the meantime and they fill up quickly.
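
To illustrate why a slow reader leads to drops, here is a toy model in Python. It is not how the libs implement their buffers: `simulate_drops` and all the numbers are hypothetical. The producer stands in for the driver, the consumer stands in for userspace, and events that arrive while the buffer is full are counted as dropped.

```python
def simulate_drops(capacity, produced_per_tick, consumed_per_tick, ticks):
    """Count events dropped by a bounded buffer over a number of ticks.

    Each tick the producer tries to enqueue `produced_per_tick` events;
    whatever does not fit is dropped. Then the consumer dequeues up to
    `consumed_per_tick` events.
    """
    queued = 0
    dropped = 0
    for _ in range(ticks):
        free = capacity - queued
        accepted = min(produced_per_tick, free)
        dropped += produced_per_tick - accepted
        queued += accepted
        queued -= min(queued, consumed_per_tick)
    return dropped


if __name__ == "__main__":
    # Consumer keeps up with the producer: nothing is dropped.
    print("fast reader:", simulate_drops(100, 10, 10, 1000))
    # Consumer is slower (e.g. busy printing): the buffer fills, then drops.
    print("slow reader:", simulate_drops(100, 10, 5, 1000))
    # A buffer twice as large only postpones the first drop.
    print("slow reader, bigger buffer:", simulate_drops(200, 10, 5, 1000))
```

In this toy model a bigger buffer absorbs bursts, but it cannot compensate for a reader that is slower than the producer on average.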

So first of all you should probably use a filter, -f proc.name=<redis-whatever>, and if this is not enough you can configure bigger buffers with the -d option. Something like -d 16777216 will create buffers of 16 MB (the dimension is given in bytes). The default buffer dimension is 8 MB.
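
For reference, the byte value passed to -d can be computed rather than typed by hand. A small Python sketch; `mib_to_bytes` is a hypothetical helper, and it assumes "MB" here means 1024 * 1024 bytes, which is what 16777216 works out to.

```python
def mib_to_bytes(mib):
    """Convert a size in MiB (1024 * 1024 bytes) to bytes."""
    return mib * 1024 * 1024


if __name__ == "__main__":
    # Print candidate values for the -d option mentioned above.
    for size in (8, 16, 32):
        print(f"{size} MB -> -d {mib_to_bytes(size)}")
```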

The libraries offer other features, such as the simple-consumer approach, which lets you ignore syscalls that are not interesting for your use case, but they are not implemented in this example. As the name says, this is just an example of how to use the libraries; I would avoid using it in real scenarios or in production.

I hope this helps; let me know if you have other doubts :)

FedeDP commented 1 year ago

> The libraries offer other features, such as the simple-consumer approach, which lets you ignore syscalls that are not interesting for your use case, but they are not implemented in this example. As the name says, this is just an example of how to use the libraries; I would avoid using it in real scenarios or in production.

I fully agree with Andrea here; we cannot declare an example production-ready.

poiana commented 1 year ago

@poiana: Closing this issue.
