falcosecurity / libs

libsinsp, libscap, the kernel module driver, and the eBPF driver sources
https://falcosecurity.github.io/libs/
Apache License 2.0
227 stars, 162 forks

sys_procexit_e has more than 1M instructions on GKE (COS) #1639

Closed albe19029 closed 8 months ago

albe19029 commented 8 months ago

Good day. I have found the following issue, starting from sysdig version 0.33.1. When I try to run sysdig on a GKE cluster, I get the following error:

processed 40396 insns (limit 1000000) max_states_per_insn 1 total_states 4057 peak_states 4057 mark_read 73
-- END PROG LOAD LOG --
libscap: bpf_load_program() event=raw_tracepoint/filler/sys_procexit_e: Operation not permitted

I found out that this commit causes the problem: https://github.com/falcosecurity/libs/commit/1e06bd3f4f8bb9244caf4e33d5d110c482d88ee5

I don't know why this causes problems only on COS (I also checked AWS and Azure, and sysdig runs there without problems), but I only managed to run sysdig after changing the values of MAX_THREADS_GROUPS and MAX_HIERARCHY_TRAVERSE to these:

#define MAX_THREADS_GROUPS 25
#define MAX_HIERARCHY_TRAVERSE 35

Is it possible to investigate why this limit is lower on COS than on other Linux distros? Or to adapt the values so that sysdig 0.33.1 and later can also run on GKE, as it is currently broken there?

I have tested these values on both x64 and arm64 clusters. Be aware that to run sysdig on arm64, the following issue must also be fixed: https://github.com/draios/sysdig/issues/2057 (this was the original ticket with the problem).

To run sysdig on GKE, I use the following YAML file: scap.txt

kubectl apply -f scap.yaml

Then attach to the pod: kubectl exec --stdin --tty sysdig-0341 -- /bin/bash

Then run sysdig.

Andreagit97 commented 8 months ago

Thank you for reporting! Hmm, this doesn't seem to be an issue with the number of instructions:

processed 40396 insns (limit 1000000) max_states_per_insn 1 total_states 4057 peak_states 4057 mark_read 73

As you can see, we didn't exceed the limit of 1000000. BTW, you are not the first one to report this kind of issue on GKE... As a fix, I think we can decrease the two macros (MAX_THREADS_GROUPS and MAX_HIERARCHY_TRAVERSE) as you suggested!
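
For anyone who wants to check this on their own load logs, here is a minimal Python sketch that extracts the processed instruction count and the limit from the verifier summary line and compares them. The helper names (parse_verifier_summary, hit_instruction_limit) are hypothetical and not part of libs or sysdig, and the pattern assumes the summary line has the same shape as the one quoted in this thread.

```python
import re

# Pattern assumed from the summary line quoted in this thread:
# "processed 40396 insns (limit 1000000) ..."
SUMMARY_RE = re.compile(r"processed (\d+) insns \(limit (\d+)\)")


def parse_verifier_summary(log):
    """Return (processed, limit) from a verifier log, or None if no summary line is found."""
    match = SUMMARY_RE.search(log)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def hit_instruction_limit(log):
    """Return True only if the processed count reached the reported limit."""
    summary = parse_verifier_summary(log)
    if summary is None:
        return False
    processed, limit = summary
    return processed >= limit


log = (
    "processed 40396 insns (limit 1000000) max_states_per_insn 1 "
    "total_states 4057 peak_states 4057 mark_read 73"
)
print(parse_verifier_summary(log))
print(hit_instruction_limit(log))
```

For the log in this issue the check reports that the limit was not reached, which matches the reading above: the load failure has to come from something other than the 1M instruction cap.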

Andreagit97 commented 8 months ago

BTW, please note that this won't be fixed in sysdig 0.33.1. It will be fixed only in sysdig versions based on the new, patched libs versions.

albe19029 commented 8 months ago

Thanks for the fix, I will wait for the next sysdig release. We are migrating to version 0.34.1 now, so no problems. I hope a new release with this fix is coming soon, as without it GKE users won't be happy with this update.