falcosecurity / libs

libsinsp, libscap, the kernel module driver, and the eBPF driver sources
https://falcosecurity.github.io/libs/
Apache License 2.0

BPF verifier does not permit use of bpf_probe_read() / bpf_probe_read_str() functions for architectures with overlapping address spaces #497

Closed hbrueckner closed 1 year ago

hbrueckner commented 2 years ago

Describe the bug

The BPF driver cannot be loaded because BPF verification fails with "unknown func bpf_probe_read_str". This problem occurs on architectures (e.g. s390x) that have distinct address spaces for kernel and user space, so the same address range can be valid in both. In such cases, the BPF verifier does not permit the bpf_probe_read{_str}() functions; the kernel and user variants must be used instead. The respective kernel change was introduced with commit "bpf: Restrict bpf_probe_read{, str}() only to archs where they work" https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0ebeea8ca8a4d1d453ad299aef0507dab04f6e8d
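
To illustrate why a single generic read helper is ambiguous when address spaces overlap, here is a minimal userspace model in C. It is only a sketch and not driver code: the two arrays stand in for kernel and user memory, and the names `read_from`, `SPACE_KERNEL`, `SPACE_USER`, and `SPACE_SIZE` are hypothetical. The same numeric address (here an offset) is valid in both spaces but yields different data, so the caller has to say which space it means.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical model: two separate address spaces of the same size. */
#define SPACE_SIZE 8

enum space { SPACE_KERNEL, SPACE_USER };

/* Each string is 7 characters plus the terminating NUL, i.e. 8 bytes. */
static const char kernel_mem[SPACE_SIZE] = "KERNEL!";
static const char user_mem[SPACE_SIZE] = "USER...";

/*
 * Copy `size` bytes starting at offset `addr` from the chosen space.
 * Returns 0 on success and -1 if the range does not fit in the space.
 * Offset 0 is valid in both spaces, which is the "overlap" being modelled.
 */
static int read_from(enum space sp, void *dst, size_t size, size_t addr)
{
    const char *base = (sp == SPACE_KERNEL) ? kernel_mem : user_mem;

    if (addr > SPACE_SIZE || size > SPACE_SIZE - addr)
        return -1;
    memcpy(dst, base + addr, size);
    return 0;
}
```

In this model, reading offset 0 returns different bytes depending on the `space` argument, which mirrors the reason the explicit bpf_probe_read_kernel{_str}() and bpf_probe_read_user{_str}() variants exist.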

See below for additional details and scap-open output.

How to reproduce it

To reproduce this issue on architectures with non-overlapping address ranges, rebuild the Linux kernel and turn off the config option CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE.

  1. git clone the libs repository
  2. Follow the generic build steps and also build the BPF driver: cmake -DUSE_BUNDLED_DEPS=true -DBUILD_BPF=true ../
  3. Build the scap-open test program
  4. Run the scap-open test program as follows: # ./libscap/examples/01-open/scap-open --bpf driver/bpf/probe.o

Expected behaviour

The expected behavior is that the BPF program can be successfully verified and loaded. The BPF program should not use functions that cause the BPF (tracepoint) verifier to fail.

Screenshots and terminal output

# ./libscap/examples/01-open/scap-open --bpf driver/bpf/probe.o

---------------------- SCAP SOURCE ----------------------
* BPF probe: 'driver/bpf/probe.o'
-----------------------------------------------------------

---------------------- CONFIGURATIONS ----------------------
* Simple consumer mode: 0 (`1` means enabled).
* Print single event type: -1 (`-1` means no event to print).
* Run until '18446744073709551615' events are catched.
--------------------------------------------------------------

-- BEGIN PROG LOAD LOG --
0: R1=ctx(off=0,imm=0) R10=fp0
[...]
188: (85) call bpf_probe_read#4
unknown func bpf_probe_read#4
processed 180 insns (limit 1000000) max_states_per_insn 0 total_states 6 peak_states 6 mark_read 6

-- END PROG LOAD LOG --
libscap: bpf_load_program() err=22 event=filler/sys_single (1)

Environment

Additional context

The conversion of the bpf_probe_read() and bpf_probe_read_str() to their kernel and user space variants can improve security by explicitly referencing the correct address space.

hbrueckner commented 2 years ago

cc: @iii-i and @kcrane

Andreagit97 commented 2 years ago

Hey @hbrueckner, thank you for reporting this! I know there are these kinds of problems on architectures like s390x, which is why we are not officially supporting them right now :( Unfortunately, these architectures also have other problems, for example they cannot catch clone, fork, and vfork child exit events... I have tried to solve this second issue in our kernel module, but on the BPF side it will require a lot of extra work, since there is also the first issue that you mentioned. Right now I'm quite busy, but I will try to solve this ASAP, hoping it won't require me to change a lot of stuff on the BPF side :/

Andreagit97 commented 2 years ago

Uhm, I have seen here that the bpf_probe_read_user and bpf_probe_read_kernel helpers are defined only in kernels >= 5.5, so, unfortunately, we cannot fix this for earlier kernel versions :/
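
The constraint can be written down as a small decision function. The C sketch below is only an illustration under assumptions, not code from the driver: `SKETCH_KERNEL_VERSION`, `enum read_strategy`, and `pick_read_strategy` are hypothetical names, the version macro is redefined locally so the example builds without kernel headers, and treating pre-5.5 kernels on overlapping-address-space architectures as unsupported is an assumption drawn from this discussion.

```c
#include <stdbool.h>

/* Local stand-in for the kernel's version encoding (major, minor, patch). */
#define SKETCH_KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))

/* Hypothetical labels for the helper family a probe could be built with. */
enum read_strategy {
    READ_SPLIT_HELPERS, /* bpf_probe_read_user() / bpf_probe_read_kernel() */
    READ_LEGACY_HELPER, /* bpf_probe_read() */
    READ_UNSUPPORTED    /* no suitable helper (assumption for this sketch) */
};

/*
 * version_code:    running kernel, encoded with SKETCH_KERNEL_VERSION.
 * non_overlapping: true if kernel and user address ranges do not overlap
 *                  on the architecture (false for s390x, for example).
 */
static enum read_strategy pick_read_strategy(unsigned int version_code,
                                             bool non_overlapping)
{
    /* The split helpers exist from 5.5 on and name the address space. */
    if (version_code >= SKETCH_KERNEL_VERSION(5, 5, 0))
        return READ_SPLIT_HELPERS;
    /* Older kernels only offer the generic helper. */
    if (non_overlapping)
        return READ_LEGACY_HELPER;
    return READ_UNSUPPORTED;
}
```

With these assumptions, an s390x system on a kernel older than 5.5 falls into the last branch, which is the case that cannot be fixed.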

poiana commented 1 year ago

Issues go stale after 90d of inactivity.

Mark the issue as fresh with /remove-lifecycle stale.

Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Provide feedback via https://github.com/falcosecurity/community.

/lifecycle stale

hbrueckner commented 1 year ago

/remove-lifecycle stale

@Andreagit97 I have already made some progress on resolving this issue. I will keep you posted when my branch is ready for a PR.

poiana commented 1 year ago

Issues go stale after 90d of inactivity.

Mark the issue as fresh with /remove-lifecycle stale.

Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Provide feedback via https://github.com/falcosecurity/community.

/lifecycle stale

hbrueckner commented 1 year ago

/remove-lifecycle stale