foniod / redbpf

Rust library for building and running BPF/eBPF modules
Apache License 2.0

Loader.rs error "Permission denied" #300

Closed sebastiaoamaro closed 2 years ago

sebastiaoamaro commented 2 years ago

Hi again. I am running a Kubernetes deployment with multiple pods (17), where one pod injects Rust programs into the others. Each of these Rust programs loads an eBPF program. Six of them load the eBPF program without an issue, but the rest fail with the following error:

"thread 'async-std/runtime' panicked at 'called Result::unwrap() on an Err value: IO(Os { code: 1, kind: PermissionDenied, message: "Operation not permitted" })', redbpf/redbpf/src/load/loader.rs:51:67"

After looking into the code I saw this was the line: let map = PerfMap::bind(m, -1, *cpuid, 16, -1, 0).unwrap();

Bind is described here: https://github.com/foniod/redbpf/blob/c7f4b6811a2527ed033ec42aa34fe1d8cb4d97d4/redbpf/src/perf.rs#L274 The map section of the eBPF program (same for all Pods) loaded looks like this:

#[map(link_section = "maps")]
static mut perf_events: PerfMap<message> = PerfMap::with_max_entries(1024);
#[map(link_section = "maps")]
static mut usage: HashMap<u32, u32> = HashMap::with_max_entries(4096);

#[map(link_section = "maps")]
static mut time: HashMap<u32, u64> = HashMap::with_max_entries(4096);

As a side note, this works completely fine in Docker. That, and the fact that 6 of the pods work, makes this extremely confusing for me. Does anyone have any idea why this might happen, or a way to continue debugging? Thanks in advance.
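To get more context than the bare `unwrap()` panic while debugging, the per-CPU bind step could report which CPU failed and with what error kind. The following is only a sketch under stated assumptions: `bind_all` and its `bind` closure are hypothetical stand-ins for the loop around `PerfMap::bind` in loader.rs, not redbpf's actual API.

```rust
use std::io;

/// Hypothetical helper: runs a per-CPU bind step and, instead of panicking,
/// returns an error that names the CPU on which the bind failed.
/// `bind` stands in for a call such as `PerfMap::bind(...)` (an assumption).
fn bind_all<T>(
    cpus: &[i32],
    mut bind: impl FnMut(i32) -> io::Result<T>,
) -> io::Result<Vec<T>> {
    let mut maps = Vec::with_capacity(cpus.len());
    for &cpu in cpus {
        let map = bind(cpu).map_err(|e| {
            // EPERM/EACCES surface as ErrorKind::PermissionDenied.
            let hint = if e.kind() == io::ErrorKind::PermissionDenied {
                " (check the privileges and capabilities of this container)"
            } else {
                ""
            };
            // Keep the original kind so callers can still match on it.
            io::Error::new(
                e.kind(),
                format!("binding perf map on cpu {} failed: {}{}", cpu, e, hint),
            )
        })?;
        maps.push(map);
    }
    Ok(maps)
}
```

With something like this in a fork, a failing pod would log the CPU id and the underlying OS error rather than only the panic location.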

rsdy commented 2 years ago

Is there any way you could try this on one box with redbpf master? We've recently changed these code paths, but so far the changes are unreleased, so it would be really interesting to see how this performs. I can also make a -pre release if that makes it easier.

edit: another thing I can think about is different kernel versions on the kube nodes, maybe?

sebastiaoamaro commented 2 years ago

Hi, thanks for the reply! I updated my fork to the most recent version of redbpf/main; however, I get these 2 errors when building:

error[E0308]: mismatched types
   --> redbpf/bpf-sys/src/type_gen.rs:199:17
    |
199 |                 Some(vdprintf_wrapper),
    |                 ^^^^^^^^^^^^^^^^^^^^^^ expected *-ptr, found enum `Option`
    |
    = note: expected raw pointer `*const libbpf_bindings::btf_ext`
                      found enum `Option<unsafe extern "C" fn(*mut c_void, *const i8, *mut libbpf_bindings::__va_list_tag) {vdprintf_wrapper}>`

error[E0308]: mismatched types
   --> redbpf/bpf-sys/src/type_gen.rs:201:17
    |
201 |                 ptr::null(),
    |                 ^^^^^^^^^^^ expected enum `Option`, found *-ptr
    |
    = note:     expected enum `Option<unsafe extern "C" fn(*mut c_void, *const i8, *mut libbpf_bindings::__va_list_tag)>`
            found raw pointer `*const _`

The rust binaries run in the namespace of a single container that has the 5.13.0-35-generic kernel.

sebastiaoamaro commented 2 years ago

I am using Ubuntu 21.10; however, the same happens on Ubuntu 20.04 (this was a fresh installation following the steps in the redbpf readme.md).

rsdy commented 2 years ago

This is weird, because CI passes on these targets. I think the issue is with the libbpf-sys migration on master, which is unlikely to get fixed until upstream becomes more responsive.

sebastiaoamaro commented 2 years ago

So do I have to wait until the next release to get the changes to the code paths mentioned here https://github.com/foniod/redbpf/issues/300#issuecomment-1064989126 ?

rhdxmr commented 2 years ago

Hi @sebastiaoamaro

I guess the build error you reported is caused by a mismatched version of the libbpf git submodule. Can you try `git submodule update` and then build redbpf again?

sebastiaoamaro commented 2 years ago

When doing that I got:

  process didn't exit successfully: `/home/sebasamaro/bolsa/kollaps-private/kollaps/emulationcore/redbpf/target/release/build/redbpf-tools-8fc7cba542c4cfa3/build-script-build` (exit status: 101)
  --- stderr
  thread 'main' panicked at 'couldn't compile probes: InvalidLLVMVersion("LLVM version that cargo-bpf linked to (13.0) < LLVM version that rustc depends on (14.0). You should re-build cargo-bpf with LLVM version (14.0), or downgrade rustc that uses LLVM version (13.0)")', redbpf-tools/build.rs:16:10
  note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

Rustc updated to LLVM 14 recently. I do not know whether the main branch of redbpf has already dealt with that; from looking at the commits and the .toml files, I did not see any way to build with an LLVM 14 feature.

I downgraded the Rust version, and then updating the submodule did work, thanks! I am going to see if the initial error remains.

sebastiaoamaro commented 2 years ago

The error persists: after 6 pods (6 processes using eBPF) start, the rest fail with the same error as before.

thread 'async-std/runtime' panicked at 'called `Result::unwrap()` on an `Err` value: IO(Os { code: 1, kind: PermissionDenied, message: "Operation not permitted" })', redbpf/redbpf/src/load/loader.rs:51:67
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

rhdxmr commented 2 years ago

@sebastiaoamaro Maybe this issue is related to privileged containers?
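One way to check this from inside each pod is to look at the effective capability mask in /proc/self/status. Below is a minimal sketch using only the standard library; the helper names (`parse_cap_eff`, `has_cap`, `ebpf_caps`) are made up for illustration, and which capabilities are actually required depends on the kernel version and on what the program does. The bit numbers come from linux/capability.h; CAP_PERFMON and CAP_BPF only exist on Linux 5.8 and later.

```rust
/// Capability bit numbers from <linux/capability.h>.
const CAP_IPC_LOCK: u32 = 14;
const CAP_SYS_ADMIN: u32 = 21;
const CAP_PERFMON: u32 = 38; // Linux 5.8+
const CAP_BPF: u32 = 39; // Linux 5.8+

/// Extracts the effective capability mask (the `CapEff:` line) from the
/// text of /proc/<pid>/status. Returns None if the line is missing or
/// is not valid hexadecimal.
fn parse_cap_eff(status: &str) -> Option<u64> {
    status
        .lines()
        .find_map(|line| line.strip_prefix("CapEff:"))
        .and_then(|hex| u64::from_str_radix(hex.trim(), 16).ok())
}

/// True if bit `cap` is set in the capability mask.
fn has_cap(mask: u64, cap: u32) -> bool {
    mask & (1u64 << cap) != 0
}

/// Lists capabilities that commonly matter when loading eBPF programs and
/// opening perf events, and whether each one is present in `status`.
fn ebpf_caps(status: &str) -> Option<Vec<(&'static str, bool)>> {
    let mask = parse_cap_eff(status)?;
    Some(vec![
        ("CAP_IPC_LOCK", has_cap(mask, CAP_IPC_LOCK)),
        ("CAP_SYS_ADMIN", has_cap(mask, CAP_SYS_ADMIN)),
        ("CAP_PERFMON", has_cap(mask, CAP_PERFMON)),
        ("CAP_BPF", has_cap(mask, CAP_BPF)),
    ])
}
```

Calling `ebpf_caps(&std::fs::read_to_string("/proc/self/status")?)` at startup in each injected program and logging the result would show whether the failing pods run with a different capability set than the working ones.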

sebastiaoamaro commented 2 years ago

Yes! That was the issue. It is weird behavior, with some pods starting and some not. I will close the issue. Thanks so much for the help!