falcosecurity / libs

libsinsp, libscap, the kernel module driver, and the eBPF driver sources
https://falcosecurity.github.io/libs/
Apache License 2.0
212 stars 159 forks source link

Investigate `setproctitle` drivers behavior #988

Open FedeDP opened 1 year ago

FedeDP commented 1 year ago

Describe the bug

Some days ago we faced this strange thing in the PR regarding e2e tests (https://github.com/falcosecurity/libs/pull/967#discussion_r1133483599). The question was, why the modern probe is taking something different respect than the other 2 drivers? the answer is pretty funny.

nginx under the hood calls the setproctitle method, what this method does is quite strange: it moves the actual args of the process from mm->arg_start to mm->arg_env, so into the environment variables space, and overwrite the content of mm->arg_start with the "process title". In our case the modern probe prints:

nginx: master process nginx -g daemon off;

this is because it tries to read a string until the first \0 is faced, in this case since args memory and env memory are contiguous we are reading both the process title (nginx: master process) and the exe+args (nginx -g daemon off). The other 2 drivers try to read only the args memory and for this reason, we read just the process title nginx: master process instead of real exe and args.

This patch tries to solve this issue in all three drivers but we have some concerns: in the old probe this part of the code is probably the most fragile, so I had to rewrite it :( it is still too complex for some kernels like 4.14 but I can simplify it a little bit! Btw this is always the same topic we have here

The actual patch takes inspiration from there https://elixir.bootlin.com/linux/latest/source/mm/util.c#L965 (this is a kernel helper that tries to address the issue of setproctitle function). What this function does is check if there is a null terminator at the end of the args memory, if no, it considers this area modified by setproctitle and checks for the real args into the env memory. BTW it could happen that for some reason args memory misses the final terminator, this would mean that we read the env memory even if setproctitle was not called and so we read junk...take a look at the forkX_father_setproctitle test in drivers_tests

How to reproduce it

A simple test is attached.

Running sudo ./libsinsp/examples/sinsp-example -m -f "proc.name=bypass and evt.type=clone3" -o "%proc.exe %proc.name %proc.args %proc.pid"

and then running bypass in the background, will print:

{"proc.args":"","proc.exe":"./bypass","proc.name":"bypass","proc.pid":77849}
{"proc.args":"","proc.exe":"bypass: Topkek","proc.name":"bypass","proc.pid":77850} # child process sees new title
{"proc.args":"","proc.exe":"./bypass","proc.name":"bypass","proc.pid":77849}

Expected behaviour

Correct proc.exe should still be printed.

bypass.txt

FedeDP commented 1 year ago

/milestone 0.11.0

FedeDP commented 1 year ago

/assign

FedeDP commented 1 year ago

/assign @Andreagit97

FedeDP commented 1 year ago

/milestone 0.12.0

FedeDP commented 1 year ago

/milestone 0.13.0

poiana commented 7 months ago

Issues go stale after 90d of inactivity.

Mark the issue as fresh with /remove-lifecycle stale.

Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Provide feedback via https://github.com/falcosecurity/community.

/lifecycle stale

Andreagit97 commented 7 months ago

/remove-lifecycle stale

poiana commented 4 months ago

Issues go stale after 90d of inactivity.

Mark the issue as fresh with /remove-lifecycle stale.

Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Provide feedback via https://github.com/falcosecurity/community.

/lifecycle stale

Andreagit97 commented 4 months ago

/remove-lifecycle stale

poiana commented 1 month ago

Issues go stale after 90d of inactivity.

Mark the issue as fresh with /remove-lifecycle stale.

Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Provide feedback via https://github.com/falcosecurity/community.

/lifecycle stale

Andreagit97 commented 1 month ago

/remove-lifecycle stale