Add BTF relocation support in eBPF programs

v-thakkar commented 2 years ago

To support BPF CO-RE, rustc needs to be able to generate relocations of BTF types across different kernel versions. The first part of this work requires adding core::intrinsics::{preserve_access_index, preserve_field_info, preserve_type_info, preserve_enum_value} for the BPF target in rustc. The second part is testing that the relocations work well with the userspace part of Aya.

alessandrod commented 2 years ago

Just to clarify: aya supports BTF relocations, see: https://github.com/aya-rs/aya/blob/main/aya/src/obj/btf/relocation.rs. So if you load an ebpf object file with BTF relocations, aya will be able to resolve them.

What doesn't support emitting BTF relocations yet is the rust compiler. So if you compile a rust program for the bpf*-unknown-none targets it will not emit BTF relocations.

GermanCoding commented 2 years ago

What would be needed to get the preserve_access_index, preserve_field_info, preserve_type_info, preserve_enum_value builtins into rustc? I guess some issue/PR over on https://github.com/rust-lang/rust? I would really like to see this feature in Rust, so that we have a complete CO-RE solution pretty much equivalent to what you can do in C today. However, I don't feel qualified enough to open an issue on the Rust project myself, so maybe someone here has more expertise with this?

l2dy commented 2 months ago

> The first part of this work requires adding core::intrinsics::{preserve_access_index, preserve_field_info, preserve_type_info, preserve_enum_value} for the BPF target in rustc.

Is it possible to experiment with the new intrinsics with the link_llvm_intrinsics feature in unstable?

> What would be needed to get the preserve_access_index, preserve_field_info, preserve_type_info, preserve_enum_value builtins into rustc? I guess some issue/PR over on https://github.com/rust-lang/rust?

Rust language feature requests should be discussed on the internals forum first, and then follow the RFC process if asked to. @GermanCoding If you are really interested in this feature, feel free to ask on the forum.

v-thakkar commented 2 months ago

@l2dy @GermanCoding I think @vadorovsky had some discussions regarding this with the Rust community and there has been some WIP regarding the same. Maybe he can shed more light onto it.

vadorovsky commented 2 months ago

I'm currently trying to achieve that without any changes to the Rust compiler. I updated the issue title to generalize it.

Instead, I'm trying to emit these intrinsic calls in https://github.com/aya-rs/bpf-linker, which, as a bitcode linker, is already capable of modifying LLVM IR before producing the actual BPF binary.

Clang produces the @llvm.preserve.array.access.index, @llvm.preserve.struct.access.index and @llvm.preserve.union.access.index intrinsics through regular LLVM API calls; the code is here:

https://github.com/llvm/llvm-project/blob/cb6a62369a353f506a1dde087eeaf5ebea5d5c26/llvm/lib/IR/IRBuilder.cpp#L1260-L1285

The idea is to do the same in bpf-linker whenever we find a getelementptr instruction followed by a load on a type for which we would like to emit BTF relocations.
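To make the matching step more concrete, here is a minimal sketch of the scan on a toy instruction list. The `Inst` enum and `relocation_candidates` function are hypothetical stand-ins that I made up for illustration; they are not LLVM or bpf-linker APIs, and the real pass would walk LLVM IR and consult debug info rather than compare type names.

```rust
// Toy stand-in for LLVM IR. These types are hypothetical and exist only
// for this sketch; they are not bpf-linker or LLVM APIs.
#[derive(Debug, Clone, PartialEq)]
enum Inst {
    /// `getelementptr` into field `field` of struct type `ty`, producing value `dst`.
    Gep { dst: u32, ty: &'static str, field: u32 },
    /// `load` from the pointer value `src`.
    Load { src: u32 },
    /// Any other instruction.
    Other,
}

/// Returns (struct type, field index) for every GEP that is immediately
/// followed by a load of its result and targets a type listed in `marked`.
fn relocation_candidates(insts: &[Inst], marked: &[&str]) -> Vec<(&'static str, u32)> {
    insts
        .windows(2)
        .filter_map(|pair| match (&pair[0], &pair[1]) {
            (Inst::Gep { dst, ty, field }, Inst::Load { src })
                if dst == src && marked.contains(ty) =>
            {
                Some((*ty, *field))
            }
            _ => None,
        })
        .collect()
}

fn main() {
    let insts = [
        Inst::Gep { dst: 1, ty: "Foo", field: 0 },
        Inst::Load { src: 1 },
        Inst::Other,
        // `Bar` is not marked, so this access is left alone.
        Inst::Gep { dst: 2, ty: "Bar", field: 3 },
        Inst::Load { src: 2 },
    ];
    let found = relocation_candidates(&insts, &["Foo"]);
    println!("{:?}", found);
}
```

Each returned pair marks a place where the pass would replace the plain field access with a preserve-access-index intrinsic call.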

Clang has two ways of enabling BTF relocations:

For the whole type, by annotating it with __attribute__((preserve_access_index)), e.g.

struct foo {
  int a;
  int b;
  int c;
  int d;
} __attribute__((preserve_access_index));

int get_a(struct foo *foo) {
  return foo->a;  // <-- this already does BTF relocations
}

For a single field, if it's being accessed with __builtin_preserve_access_index, e.g.

struct foo {
  int a;
  int b;
  int c;
  int d;
};

int get_a(struct foo *foo) {
  return __builtin_preserve_access_index(foo->a); // <- this does BTF relocation, but the
                                                  // intrinsics are emitted just for field `a`
}

How could we achieve an equivalent way of telling bpf-linker that it needs to emit the intrinsic for a given type or field? We can use a custom PhantomData field, which won't affect the layout of the type.
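The layout claim is easy to check. The sketch below compares a plain struct with one that carries a marker field (named `_btf_marker` here); the type names `Plain` and `Marked` are made up for the comparison.

```rust
use std::marker::PhantomData;
use std::mem::{align_of, size_of};

// Illustrative types only; the names are not taken from Aya.
#[allow(dead_code)]
#[repr(C)]
pub struct Plain {
    a: i32,
    b: i32,
    c: i32,
    d: i32,
}

#[allow(dead_code)]
#[repr(C)]
pub struct Marked {
    a: i32,
    b: i32,
    c: i32,
    d: i32,
    // Zero-sized field with alignment 1: it adds no bytes and no padding.
    _btf_marker: PhantomData<()>,
}

fn main() {
    assert_eq!(size_of::<PhantomData<()>>(), 0);
    assert_eq!(size_of::<Plain>(), size_of::<Marked>());
    assert_eq!(align_of::<Plain>(), align_of::<Marked>());
    println!("layouts match");
}
```

Because `PhantomData<()>` is zero-sized, the marker can still show up as a field in the type's description while the in-memory layout stays the same.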

For the whole type, we could look for _btf_marker: PhantomData<()>, e.g.

#[repr(C)]
pub struct Foo {
    a: i32,
    b: i32,
    c: i32,
    d: i32,
    _btf_marker: PhantomData<()>,
}

For a single field, we could look for _btf_marker_FIELD_NAME: PhantomData<()>, e.g.

#[repr(C)]
pub struct Foo {
    a: i32,
    b: i32,
    c: i32,
    d: i32,
    _btf_marker_a: PhantomData<()>,
}

Of course the code examples above are ugly and we don't want users to write such code. We could hide them with macros and make the final code look like:

// This would emit BTF relocs for the whole type
#[repr(C)]
#[derive(Btf)]
pub struct Foo {
    a: i32,
    b: i32,
    c: i32,
    d: i32,
}

// This would emit BTF relocs only for annotated fields
#[repr(C)]
#[derive(Btf)]
pub struct Foo {
    a: i32,
    b: i32,
    c: i32,
    #[btf]
    d: i32,
}
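As a rough illustration of hiding the marker, here is a sketch using a declarative macro. The `btf_struct!` macro is hypothetical and not part of Aya. It is also not the derive proposed above: a derive macro can only add new items and cannot add fields to the struct it is applied to, so a real implementation might need an attribute macro or a similar wrapper.

```rust
// Hypothetical helper, not part of Aya: wraps a struct definition and
// appends the whole-type marker field.
macro_rules! btf_struct {
    (
        $(#[$meta:meta])*
        $vis:vis struct $name:ident {
            $($fvis:vis $field:ident : $ty:ty),* $(,)?
        }
    ) => {
        $(#[$meta])*
        $vis struct $name {
            $($fvis $field: $ty,)*
            // Marker appended by the macro so users never write it by hand.
            _btf_marker: ::core::marker::PhantomData<()>,
        }
    };
}

btf_struct! {
    #[allow(dead_code)]
    #[repr(C)]
    #[derive(Default)]
    pub struct Foo {
        a: i32,
        b: i32,
        c: i32,
        d: i32,
    }
}

fn main() {
    let foo = Foo::default();
    // The marker is zero-sized, so the struct is still four i32 values wide.
    assert_eq!(std::mem::size_of::<Foo>(), 4 * std::mem::size_of::<i32>());
    println!("a = {}", foo.a);
}
```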

To be more explicit, the plan going forward is:

- Necessary change in the LLVM-C API that allows iterating over the dbg_records of IR instructions (that's the only reliable way for bpf-linker to figure out that a given getelementptr instruction refers to a certain debug info node)
  - [ ] https://github.com/llvm/llvm-project/pull/107802
- Preparatory, necessary refactors in bpf-linker (otherwise the code for handling BTF would be horrible)
- [ ] structs <-- I'm working on this step
  - find getelementptr instructions followed by load
  - check if it's a struct
  - check whether the struct, or one of its fields, is annotated with a custom marker _btf_marker: PhantomData<()> or _btf_marker_FIELD_NAME: PhantomData<()>
  - if found, emit an @llvm.preserve.struct.access.index intrinsic call
- [ ] enums
  - find getelementptr instructions followed by load
  - check if it's an enum
  - check whether the enum, or one of its fields, has a variant with a custom marker BtfMarker
    - this ugly marker can be hidden by a macro, details below
- [ ] unions
- [ ] Provide macros in Aya
  - [ ] #[derive(Btf)] macro, which either:
    - adds _btf_marker: PhantomData<()> for the whole type
    - adds _btf_marker_FIELD_NAME: PhantomData<()>
  - [ ] #[btf] macro for enums, which adds BtfMarker
  - [ ] bpf_core_read! macro
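The marker check in the struct step can be sketched as a small classifier over field names. `Relocate` and `classify` are hypothetical names for this illustration; the actual check in bpf-linker would read field names from debug info instead of a string slice.

```rust
/// What a struct's field names ask for, under the marker naming scheme
/// described in this comment. Hypothetical type, not a bpf-linker API.
#[derive(Debug, PartialEq)]
enum Relocate {
    /// No marker found.
    Nothing,
    /// `_btf_marker` found: relocate every field access.
    WholeType,
    /// `_btf_marker_FIELD_NAME` found: relocate only the named fields.
    Fields(Vec<String>),
}

const MARKER: &str = "_btf_marker";

fn classify(field_names: &[&str]) -> Relocate {
    if field_names.contains(&MARKER) {
        return Relocate::WholeType;
    }
    let prefix = format!("{}_", MARKER);
    let fields: Vec<String> = field_names
        .iter()
        .filter_map(|name| name.strip_prefix(prefix.as_str()))
        // Ignore markers that do not name an existing field.
        .filter(|target| field_names.contains(target))
        .map(str::to_owned)
        .collect();
    if fields.is_empty() {
        Relocate::Nothing
    } else {
        Relocate::Fields(fields)
    }
}

fn main() {
    println!("{:?}", classify(&["a", "b", "c", "d", "_btf_marker"]));
    println!("{:?}", classify(&["a", "b", "c", "d", "_btf_marker_d"]));
    println!("{:?}", classify(&["a", "b", "c", "d"]));
}
```

Treating the whole-type marker as taking priority over per-field markers is an assumption of this sketch, not something decided in the plan.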

canoriz commented 1 month ago

I'm new to aya. Does this mean an eBPF ELF object compiled from Rust may not be CO-RE (because of the lack of index in the ELF), and that this is being worked on now?

vadorovsky commented 3 weeks ago

> I'm new to aya. Does this mean an eBPF ELF object compiled from Rust may not be CO-RE (because of the lack of index in the ELF), and that this is being worked on now?

Sorry for the late reply. Yes, that's exactly the case.

aya-rs / aya

Add BTF relocation support in eBPF programs #349