falcosecurity / libs

libsinsp, libscap, the kernel module driver, and the eBPF driver sources
https://falcosecurity.github.io/libs/
Apache License 2.0

[REGRESSION] Modern bpf probe in least privileged mode #1157

Closed by Andreagit97 11 months ago

Andreagit97 commented 1 year ago

Describe the bug

After this PR https://github.com/falcosecurity/libs/pull/1062 we are no longer able to run the modern bpf probe in least privileged mode, that is, with only these capabilities:

* CAP_BPF
* CAP_PERFMON
* CAP_SYS_RESOURCE

The issue is that on non-COS systems the struct audit_task_info is not defined in the kernel vmlinux, so libbpf searches for this type in the modules' BTF, but unfortunately iterating the BTF objects requires CAP_SYS_ADMIN https://github.com/torvalds/linux/blob/692b7dc87ca6d55ab254f8259e6f970171dc9d01/kernel/bpf/syscall.c#L3704
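To check which of these capabilities a process actually holds, one option is to decode the `CapEff` bitmask from `/proc/self/status`. The following is a minimal Python sketch, not part of libs: the helper names `decode_caps`, `read_effective_caps`, and `missing_for_least_privileged` are made up for illustration, and the bit numbers are the ones defined in the kernel's `linux/capability.h`.

```python
# Capability bit numbers as defined in linux/capability.h.
CAPS = {
    "CAP_SYS_ADMIN": 21,
    "CAP_SYS_RESOURCE": 24,
    "CAP_PERFMON": 38,
    "CAP_BPF": 39,
}

# The least privileged set discussed in this issue.
LEAST_PRIVILEGED = {"CAP_BPF", "CAP_PERFMON", "CAP_SYS_RESOURCE"}


def decode_caps(mask_hex):
    """Return the names (from CAPS) whose bits are set in a hex bitmask."""
    mask = int(mask_hex, 16)
    return {name for name, bit in CAPS.items() if mask & (1 << bit)}


def read_effective_caps(status_path="/proc/self/status"):
    """Decode the CapEff line of a /proc/<pid>/status file (Linux only)."""
    with open(status_path) as status:
        for line in status:
            if line.startswith("CapEff:"):
                return decode_caps(line.split()[1])
    return set()


def missing_for_least_privileged(caps):
    """Return the least privileged capabilities that are absent from caps."""
    return LEAST_PRIVILEGED - caps
```

Calling `read_effective_caps()` inside the process under test (or passing the path of another process's status file) shows whether `CAP_SYS_ADMIN` is present in addition to the three capabilities listed above.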

How to reproduce it

  1. Compile scap-open example
  2. Provide the right capabilities
    sudo setcap CAP_PERFMON,CAP_BPF,CAP_SYS_RESOURCE=+ep ./libscap/examples/01-open/scap-open
  3. Run it
    ./libscap/examples/01-open/scap-open --modern_bpf

Error

libbpf: failed to iterate BTF objects: -1
libbpf: prog 't1_execve_x': relo #791: target candidate search failed for [1238] struct audit_task_info: -1
libbpf: prog 't1_execve_x': relo #791: failed to relocate: -1
libbpf: failed to perform CO-RE relocations: -1
libbpf: failed to load object 'bpf_probe'
libbpf: failed to load BPF skeleton 'bpf_probe': -1
libpman: failed to load BPF object (errno: 1 | message: Operation not permitted)

If you provide the CAP_SYS_ADMIN capability, everything works fine.
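When triaging logs like the one above, it can help to pull out which program and which type failed to relocate. This is a small Python sketch with a hypothetical helper, `find_failed_relocations`; the regular expression is modeled only on the log lines shown in this issue, so other libbpf versions may word the message differently. The trailing `-1` is the negated errno, which matches the `errno: 1` (Operation not permitted) reported by libpman.

```python
import re

# Modeled on this line from the log above:
# libbpf: prog 't1_execve_x': relo #791: target candidate search failed
#   for [1238] struct audit_task_info: -1
RELO_FAILURE = re.compile(
    r"libbpf: prog '(?P<prog>[^']+)': relo #(?P<relo>\d+): "
    r"target candidate search failed for \[(?P<type_id>\d+)\] "
    r"(?P<kind>\w+) (?P<name>\w+): (?P<err>-?\d+)"
)


def find_failed_relocations(log_text):
    """Return one dict per failed CO-RE candidate search found in log_text."""
    return [
        {
            "prog": m["prog"],
            "relo": int(m["relo"]),
            "type": f'{m["kind"]} {m["name"]}',
            "err": int(m["err"]),
        }
        for m in RELO_FAILURE.finditer(log_text)
    ]
```

Feeding it the captured stderr of `scap-open --modern_bpf` would list `struct audit_task_info` as the type whose candidate search failed.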

Solution

We will have a libs patch release in the next few days and I would like to have this issue solved since it is causing some regressions, see here: https://github.com/falcosecurity/falco/issues/2487

Unfortunately, I don't see many solutions right now. The ideal one would be to disable this BTF module check in libbpf, but it doesn't seem to be configurable :/

The only option seems to be reverting the PR and not capturing this info on COS. I don't like it, but if we have to choose between a working least privileged mode and the loginuid info on COS, I would choose the former, since losing it is a regression. Of course, I will try to find alternative solutions in the meantime, but I'm not sure about the outcome... WDYT? @erthalion @FedeDP @leogr

FedeDP commented 1 year ago

Hi Andrea! Nice catch. My idea is to keep the best of both worlds by adding a CMake option that sets a compile definition, which is then used to build in the COS-related code. Something like

option(ENABLE_COS_WORKAROUNDS "COS workarounds for modern BPF probe. Disables least privileged mode." OFF)

If not enabled (the default, which is what Falco uses), we won't capture that info on COS, but other consumers can still build libs with full COS support if they wish.

Andreagit97 commented 1 year ago

Hey @FedeDP, this seems like a smart idea. Of course, we should find a real solution, maybe patching libbpf in some way or asking the mailing list whether this is the expected behavior, but since we want to fix the regression ASAP, this solution could be enough for the moment!

FedeDP commented 1 year ago

> patching libbpf in some way

We can do this too, but it would only affect builds that use the bundled deps (which, by the way, is OK in any case); it is also pretty simple if you already have a working patch!

Andreagit97 commented 1 year ago

Hmm, as a first approach I would go with your workaround. Once we have understood from the kernel engineers why it works like that, we can also choose to apply a custom patch. WDYT?

FedeDP commented 1 year ago

Agree!

Andreagit97 commented 1 year ago

I reported the issue on the mailing list https://lore.kernel.org/bpf/CAGQdkDvYU_e=_NX+6DRkL_-TeH3p+QtsdZwHkmH0w3Fuzw0C4w@mail.gmail.com/T/#u :)

jasondellaluce commented 1 year ago

@Andreagit97 can this be closed, or do you want it to stay open for a long-term solution?

Andreagit97 commented 1 year ago

I would keep this open to track it and find a long-term solution

jasondellaluce commented 1 year ago

/milestone 0.12.0

FedeDP commented 1 year ago

I'd move this to /milestone libs-backlog

Since it won't be tackled soon (we are planning 0.12.0 for end of July).

poiana commented 1 year ago

Issues go stale after 90d of inactivity.

Mark the issue as fresh with /remove-lifecycle stale.

Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Provide feedback via https://github.com/falcosecurity/community.

/lifecycle stale

Andreagit97 commented 1 year ago

/remove-lifecycle stale

leogr commented 1 year ago

What's the status of this? :thinking:

erthalion commented 1 year ago

From what I see the proposed change to libbpf was accepted [1], I guess that was the long-term solution discussed in the thread. If so, the issue could be closed.

Andreagit97 commented 1 year ago

Yeah, we are just waiting for a stable tag of libbpf with the fix. I would close it once we bump libbpf to a stable tag and revert the corresponding workaround in libs https://github.com/falcosecurity/libs/pull/1160