oracle / bpftune

bpftune uses BPF to auto-tune Linux systems

Kernel 6.9: No tcp_buffer_tuner.so or net_buffer_tuner.so. Invalid argument. #86

Closed HippieMitch closed 1 month ago

HippieMitch commented 1 month ago

After updating the kernel from 6.8.9 to 6.9, bpftune no longer works. It fails with "Invalid argument" and "No such process" errors when initializing tcp_buffer_tuner and net_buffer_tuner.

Here is the systemd log:

May 16 22:06:57 spaceghost bpftune[2110]: could not load skeleton: No such process
May 16 22:06:58 spaceghost bpftune[2110]: could not load skeleton: Invalid argument
May 16 22:06:58 spaceghost bpftune[2110]: could not load skeleton: Invalid argument
May 16 22:06:58 spaceghost bpftune[2110]: error initializing '/nix/store/37a5n11qibghdxd3k60imjkrw0vai01w-bpftune-unstable-2023-12-20/lib/bpftune//tcp_buffer_tuner.so: Invalid argument
May 16 22:06:58 spaceghost bpftune[2110]: could not load skeleton: No such process
May 16 22:06:58 spaceghost bpftune[2110]: error initializing '/nix/store/37a5n11qibghdxd3k60imjkrw0vai01w-bpftune-unstable-2023-12-20/lib/bpftune//net_buffer_tuner.so: No such process
May 16 22:06:59 spaceghost bpftune[2110]: could not load skeleton: No such process
May 16 22:07:00 spaceghost bpftune[2110]: could not load skeleton: Invalid argument
May 16 22:07:00 spaceghost bpftune[2110]: could not load skeleton: Invalid argument
May 16 22:07:00 spaceghost bpftune[2110]: error initializing '/nix/store/37a5n11qibghdxd3k60imjkrw0vai01w-bpftune-unstable-2023-12-20/lib/bpftune//tcp_buffer_tuner.so: Invalid argument

alan-maguire commented 1 month ago

Thanks for the report. I've reproduced this on 6.9 by running "sudo bpftune -ds" to get debug output. The tcp buffer tuner fails BPF verification, specifically:

42: (15) if r9 == 0x0 goto pc+327 ; R9=scalar(umin=1)
; sk_userlocks = sk->sk_userlocks; @ tcp_buffer_tuner.bpf.c:246
43: failed to resolve CO-RE relocation [34] struct sock.sk_userlocks (0:46 @ offset 512.4)
processed 44 insns (limit 1000000) max_states_per_insn 0 total_states 1 peak_states 1 mark_read 1
-- END PROG LOAD LOG --

... and for net_buffer_tuner we see:

bpftune: reusing netns_map fd 16
bpftune: libbpf: object 'net_buffer_tune': failed (-1) to create BPF token from '/sys/fs/bpf', skipping optional step...
bpftune: libbpf: loaded kernel BTF from '/sys/kernel/btf/vmlinux'
bpftune: libbpf: extern 'netdev_max_backlog' (strong): not resolved
bpftune: libbpf: failed to load object 'net_buffer_tuner_bpf'
bpftune: libbpf: failed to load BPF skeleton 'net_buffer_tuner_bpf': -3

For the first issue, the problem is that sk_userlocks changed from a bitfield to a plain u8 in 6.9, and the vmlinux.h that bpftune was built against still had the old bitfield representation, so the CO-RE relocation cannot be resolved. We can work around this via the following:

diff --git a/src/tcp_buffer_tuner.bpf.c b/src/tcp_buffer_tuner.bpf.c
index e8b3fd1..8002cfc 100644
--- a/src/tcp_buffer_tuner.bpf.c
+++ b/src/tcp_buffer_tuner.bpf.c
@@ -242,8 +242,11 @@ BPF_FENTRY(tcp_rcv_space_adjust, struct sock *sk)
 return 0;

 #ifndef BPFTUNE_LEGACY
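As a rough illustration of why the bitfield-to-u8 change matters, here is a minimal userspace sketch. It is not bpftune or kernel code. The "offset 512.4" in the verifier log appears to denote byte 512, bit 4, meaning the field the program was compiled against started partway into a byte. The helper `read_bits` and the two-byte buffer are hypothetical, and the sketch assumes LSB-first bit numbering as on little-endian targets.

```c
#include <stdint.h>

/*
 * Hypothetical helper (illustration only): extract bit_size bits that start
 * bit_off bits into buf, numbering bits LSB-first as on little-endian
 * targets. The field must not cross a byte boundary in this sketch.
 */
static uint8_t read_bits(const uint8_t *buf, unsigned bit_off, unsigned bit_size)
{
	uint8_t byte = buf[bit_off / 8];

	return (uint8_t)((byte >> (bit_off % 8)) & ((1u << bit_size) - 1));
}
```

With a buffer whose second byte is 0x35, reading a 4-bit field at bit offset 12 (byte 1, bit 4) yields 0x3, while reading a whole byte at bit offset 8 yields 0x35. A program that bakes in the first description cannot be matched safely against a kernel that uses the second, which is consistent with the relocation being rejected rather than silently adjusted.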

For the second issue, netdev_max_backlog became net_hotdata.max_backlog in 6.9, so the ksym lookup fails. Investigating a fix for that one...

alan-maguire commented 1 month ago

I pushed a fix to main for the tcp buffer issue, and should have a fix for the netdev_max_backlog issue ready soon. The solution there is not to rely on the ksym at all, but to use a BPF global variable to store the netdev_max_backlog sysctl value and update it whenever we change it.
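The general shape of that approach can be sketched in plain userspace C. This is not the actual bpftune change: the plain global below stands in for a BPF global variable, the helper names `backlog_cache_init` and `backlog_update` are hypothetical, and the sysctl path is passed in by the caller (on a real system it would presumably be /proc/sys/net/core/netdev_max_backlog).

```c
#include <stdio.h>

/* Stand-in for the BPF global variable that replaces the ksym lookup. */
static long netdev_max_backlog;

/* Seed the cached value from the sysctl file; 0 on success, -1 on failure. */
static int backlog_cache_init(const char *path)
{
	FILE *fp = fopen(path, "r");
	long val;
	int ok;

	if (!fp)
		return -1;
	ok = fscanf(fp, "%ld", &val) == 1;
	fclose(fp);
	if (!ok)
		return -1;
	netdev_max_backlog = val;
	return 0;
}

/* Write a new value, then mirror it into the cache only if the write worked. */
static int backlog_update(const char *path, long new_val)
{
	FILE *fp = fopen(path, "w");
	int ok;

	if (!fp)
		return -1;
	ok = fprintf(fp, "%ld\n", new_val) > 0;
	if (fclose(fp) != 0)
		ok = 0;
	if (!ok)
		return -1;
	netdev_max_backlog = new_val;
	return 0;
}
```

The trade-off is that the cached copy only tracks changes made through the tuner itself. If something else modifies the sysctl, the cache goes stale until it is re-read, but in exchange nothing depends on a kernel symbol that may be renamed or moved between releases.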

alan-maguire commented 1 month ago

This is merged into main now too. @HippieMitch, can you rebuild bpftune from the latest main branch and check whether these issues go away? They do for me and the test suite passes, but I want to make sure you see the same before closing this out. Thanks!

HippieMitch commented 1 month ago

That seems to have got it sorted. Thanks!