elf: support weak symbols / symbol visibility

lmb commented 3 years ago

libbpf has added support for weak symbols, which end up in ELF with an STB_WEAK binding.

libbpf commits that might be relevant due to mentioning STB_WEAK:

https://github.com/torvalds/linux/commit/166750bc1dd256b2184b22588fb9fe6d3fbb93ae (adds LINUX_KERNEL_VERSION)
https://github.com/torvalds/linux/commit/386b1d241e1b975a239d33be836bc183a52ab18c

linked_maps.c fails due to:

struct {
    __uint(type, BPF_MAP_TYPE_ARRAY);
    __type(key, int);
    __type(value, int);
    __uint(max_entries, 16);
} map_weak __weak SEC(".maps");

linked_funcs.c:

/* this weak instance should win because it's the first one */
__weak int set_output_weak(int x)
{
    output_weak1 = x;
    return x;
}

There is also the following; I'm not sure whether it needs additional code:

/* here we'll force set_output_ctx2() to be __hidden in the final obj file */
__attribute__((visibility("hidden"))) extern void set_output_ctx2(__u64 *ctx);

dylandreimerink commented 9 months ago

I did a bit of research on this topic. Libbpf did quite some work around __weak, which seems to be mainly driven by their linker use case, bpftool gen object, which needs to combine multiple object files and apply rules around symbol replacement.

Since we currently don't have any need for logic to do this, we can simply treat STB_WEAK symbols as global symbols, which in practice seems to be a small change.

During the investigation, I tried to replicate some of the kernel's selftests and discovered a few shortcomings:

We don't support __ksym variables, while libbpf does.
We error on missing kfuncs. Kernel tests indicate libbpf sets the kfunc to NULL.
We don't fix up kfunc relocations unless it's a call instruction. Libbpf also allows loading the address with an ldimm64 instruction.

These are independent of the handling of weak symbols, but I wanted to note them here so I don't forget. Shall I turn these into separate issues?

I still need to look into the effect of __hidden.

lmb commented 9 months ago

> Shall I turn these into separate issues?

Yes please!

lmb commented 9 months ago

> we can simply treat STB_WEAK symbols as global symbols, which in practice seems to be a small change.

I'd probably prefer to not allow WEAK in that case. That leaves the door open to add support for linking in some form later on. I've long thought that it could be very useful to be able to ship BPF in a Go module somehow, and then express BPF library dependencies via Go modules (with linking via WEAK).

dylandreimerink commented 9 months ago

> I'd probably prefer to not allow WEAK in that case.

I would have to confirm, but I believe that even after linking with bpftool gen object, the symbols are still marked as weak, so rejecting weak symbols would rule out loading linked objects.

Additionally, even if we don't want to support it for maps, ksym variables, or bpf-to-bpf functions, I think we do want to support it for kfuncs; see the use case described in #1355. Luckily, that is a one-line change.

dylandreimerink commented 9 months ago

I conducted an experiment with two simple programs:

Weak 1

__weak int weak_func(int a) {
    a = a + 1;
    a = a * 7;
    a = a - 1;
    return a;
}

SEC("xdp") int xdp_prog(struct xdp_md *ctx) {
    return weak_func(123);
}

Weak 2

__weak int weak_func(int a) {
    a = a + 2;
    return a;
}

SEC("xdp") int xdp_prog2(struct xdp_md *ctx) {
    return weak_func(123);
}

Then I combined them with bpftool gen object weak.o weak_1.o weak_2.o, which yields:

> llvm-objdump -r -d weak.o

weak.o: file format elf64-bpf

Disassembly of section .text:

0000000000000000 <weak_func>:
       0:       bf 10 00 00 00 00 00 00 r0 = r1
       1:       27 00 00 00 07 00 00 00 r0 *= 0x7
       2:       07 00 00 00 06 00 00 00 r0 += 0x6
       3:       95 00 00 00 00 00 00 00 exit
       4:       bf 10 00 00 00 00 00 00 r0 = r1
       5:       07 00 00 00 02 00 00 00 r0 += 0x2
       6:       95 00 00 00 00 00 00 00 exit

Disassembly of section xdp:

0000000000000000 <xdp_prog>:
       0:       b7 01 00 00 7b 00 00 00 r1 = 0x7b
       1:       85 10 00 00 ff ff ff ff call -0x1
                0000000000000008:  R_BPF_64_32  weak_func
       2:       95 00 00 00 00 00 00 00 exit

0000000000000018 <xdp_prog2>:
       3:       b7 01 00 00 7b 00 00 00 r1 = 0x7b
       4:       85 10 00 00 ff ff ff ff call -0x1
                0000000000000020:  R_BPF_64_32  weak_func
       5:       95 00 00 00 00 00 00 00 exit

So bpftool concatenates the bodies of the weak functions in the order in which the object files were specified on the command line, and both bodies end up under the single weak_func symbol.

If we look at it with metadata, we can see that there is only one symbol but two BPF funcs. Also, the symbols referenced by the relocation entries still have STB_WEAK binding, even after linking.

xdp_prog:
          ; return weak_func(123);
         0: MovImm dst: r1 imm: 123
         1: Call -1 <weak_func>
          ; return weak_func(123);
         2: Exit
weak_func:
          ; __weak int weak_func(int a) {
         3: MovReg dst: r0 src: r1
          ; a = a * 7;
         4: MulImm dst: r0 imm: 7
          ; a = a - 1;
         5: AddImm dst: r0 imm: 6
          ; return a;
         6: Exit
          ; __weak int weak_func(int a) {
         7: MovReg dst: r0 src: r1
          ; a = a + 2;
         8: AddImm dst: r0 imm: 2
          ; return a;
         9: Exit

Loading the object without modifications fails with: load program: invalid argument: number of funcs in func_info doesn't match number of subprogs (1 line(s) omitted)

So if we want to support __weak functions after linking, we have to add logic that keeps the first func info for a given function and deletes any duplicates that follow it.
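A rough sketch of what such de-duplication could look like, in Go. Everything here is hypothetical: funcInfo and insnRange are simplified stand-ins invented for this example (not the cilium/ebpf or BTF types), and the sketch assumes the func infos of one section are sorted by instruction offset and that a duplicate body spans up to the next func info or the end of the section.

```go
package main

import "fmt"

// funcInfo is a hypothetical, simplified func_info record:
// a function name and the instruction offset where its body starts.
type funcInfo struct {
	Name   string
	Offset int
}

// insnRange is a half-open range [Start, End) of instruction indices.
type insnRange struct {
	Start, End int
}

// dedupWeakFuncs keeps the first func info per name and returns the
// instruction ranges of later duplicates so the caller can drop them.
// Offsets of kept entries are shifted to account for removed instructions.
// infos must be sorted by Offset; total is the section's instruction count.
func dedupWeakFuncs(infos []funcInfo, total int) ([]funcInfo, []insnRange) {
	seen := make(map[string]bool)
	var kept []funcInfo
	var dropped []insnRange
	removed := 0

	for i, fi := range infos {
		// A body ends where the next func info starts, or at the section end.
		end := total
		if i+1 < len(infos) {
			end = infos[i+1].Offset
		}
		if seen[fi.Name] {
			dropped = append(dropped, insnRange{Start: fi.Offset, End: end})
			removed += end - fi.Offset
			continue
		}
		seen[fi.Name] = true
		kept = append(kept, funcInfo{Name: fi.Name, Offset: fi.Offset - removed})
	}
	return kept, dropped
}

func main() {
	// Mirrors the linked .text section above: two weak_func bodies,
	// starting at instructions 0 and 4, with 7 instructions in total.
	infos := []funcInfo{{"weak_func", 0}, {"weak_func", 4}}
	kept, dropped := dedupWeakFuncs(infos, 7)
	fmt.Println("kept:", kept)
	fmt.Println("dropped:", dropped)
}
```

For the linked object above, this would keep the first weak_func and mark instructions 4 through 6 for removal. Line info and any relocations pointing into a removed range would need the same treatment, which this sketch does not attempt.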

cilium / ebpf

elf: support weak symbols / symbol visibility #466