Netflix / bpftop

bpftop provides a dynamic real-time view of running eBPF programs. It displays the average runtime, events per second, and estimated total CPU % for each program.
Apache License 2.0

XDP programs are accounted to xdp-dispatcher #15

Open hhoffstaette opened 7 months ago

hhoffstaette commented 7 months ago

Thanks for bpftop! I gave it a try and saw that XDP programs, which are managed by something called xdp-dispatcher, are not accounted individually but are instead attributed to the xdp-dispatcher. Is there a way to distinguish BPF statistics between individual XDP programs, or is this something that would have to be fixed in XDP?

jfernandez commented 7 months ago

@hhoffstaette, could you paste the output of sudo bpftool prog? I want to see how that tool aggregates your apps.

hhoffstaette commented 7 months ago

Here you go:

$ bpftool prog
2: tracing  name dump_bpf_map  tag 3d73ab123e737406  gpl
    loaded_at 2024-02-27T17:26:40+0100  uid 0
    xlated 280B  jited 166B  memlock 4096B  map_ids 2
    btf_id 86
3: tracing  name dump_bpf_prog  tag a555684b684cb7c0  gpl
    loaded_at 2024-02-27T17:26:40+0100  uid 0
    xlated 520B  jited 647B  memlock 4096B  map_ids 2
    btf_id 86
13: xdp  name xdp_dispatcher  tag 90f686eb86991928  gpl run_time_ns 18735 run_cnt 32
    loaded_at 2024-02-27T17:26:42+0100  uid 0
    xlated 672B  jited 516B  memlock 4096B  map_ids 7
    btf_id 97
22: ext  name homeplug_drop  tag fbd415544de357c1  gpl
    loaded_at 2024-02-27T17:26:42+0100  uid 0
    xlated 136B  jited 91B  memlock 4096B
    btf_id 100
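
To make the difference in the listing above easier to see, here is a minimal sketch that parses this plain-text `bpftool prog` layout and reports which programs carry the `run_time_ns` / `run_cnt` counters. The helper names (`parse_progs`, `average_runtime_ns`) are hypothetical and not part of bpftool or bpftop; the sample text is copied from the output above, and the regular expressions only assume the layout shown there.

```python
import re

# Two entries copied from the bpftool output pasted in this thread.
SAMPLE = """\
13: xdp  name xdp_dispatcher  tag 90f686eb86991928  gpl run_time_ns 18735 run_cnt 32
    loaded_at 2024-02-27T17:26:42+0100  uid 0
22: ext  name homeplug_drop  tag fbd415544de357c1  gpl
    loaded_at 2024-02-27T17:26:42+0100  uid 0
"""

# A program entry starts at column 0 with "<id>: <type>  name <name>".
HEADER = re.compile(r"^(\d+):\s+(\S+)\s+name\s+(\S+)")
# The counters only appear on the header line when the kernel reported them.
STATS = re.compile(r"run_time_ns\s+(\d+)\s+run_cnt\s+(\d+)")


def parse_progs(text):
    """Return one dict per program header line found in the text."""
    progs = []
    for line in text.splitlines():
        header = HEADER.match(line)
        if header is None:
            continue  # indented detail line (loaded_at, xlated, btf_id, ...)
        stats = STATS.search(line)
        progs.append({
            "id": int(header.group(1)),
            "type": header.group(2),
            "name": header.group(3),
            "run_time_ns": int(stats.group(1)) if stats else None,
            "run_cnt": int(stats.group(2)) if stats else None,
        })
    return progs


def average_runtime_ns(prog):
    """Cumulative average runtime per run, or None if no counters exist."""
    if not prog["run_cnt"]:
        return None
    return prog["run_time_ns"] / prog["run_cnt"]


if __name__ == "__main__":
    for prog in parse_progs(SAMPLE):
        print(prog["id"], prog["type"], prog["name"], average_runtime_ns(prog))
```

For the sample above, only `xdp_dispatcher` has counters (18735 ns over 32 runs, so roughly 585 ns per run), while the `ext` program `homeplug_drop` has none.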

hhoffstaette commented 7 months ago

Sorry for the naming confusion. The CLI tool is called xdp-loader, whereas the XDP manager is called xdp-dispatcher, which is itself an XDP program. See here:

$ xdp-loader status eth0
CURRENT XDP PROGRAM STATUS:

Interface        Prio  Program name      Mode     ID   Tag               Chain actions
--------------------------------------------------------------------------------------
eth0                   xdp_dispatcher    skb      13   90f686eb86991928 
 =>              50     homeplug_drop             22   fbd415544de357c1  XDP_PASS

hhoffstaette commented 7 months ago

I think the problem here is that XDP pipelines are "not really" individual programs, but I'm not familiar enough with the internals and eBPF statistics mechanism, hence the question.

jfernandez commented 7 months ago

OK, this type of eBPF pattern is new to me. To the kernel, this will look like loading just one XDP app. At first glance, it seems that the libxdp library loads each individual XDP program as an ext type, and the dispatcher then possibly calls each one using this feature.

This is where the kernel increments the stats we use when a BPF program exits: https://github.com/torvalds/linux/blob/master/kernel/bpf/trampoline.c#L874. It appears to me that it's not doing that for these ext prog types.
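
For context on how per-program figures can be derived from those two kernel counters, here is a rough sketch. It assumes a tool samples `run_time_ns` and `run_cnt` periodically and works from the deltas; that sampling scheme, the `Sample` type and the `rates` function are assumptions made for illustration, not bpftop's actual code.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One reading of a program's cumulative counters (hypothetical type)."""
    timestamp_s: float  # when the counters were read, in seconds
    run_time_ns: int    # cumulative runtime reported by the kernel
    run_cnt: int        # cumulative number of runs reported by the kernel


def rates(prev, curr):
    """Derive per-interval metrics from two consecutive samples."""
    interval_s = curr.timestamp_s - prev.timestamp_s
    if interval_s <= 0:
        raise ValueError("samples must be in increasing time order")
    delta_ns = curr.run_time_ns - prev.run_time_ns
    delta_cnt = curr.run_cnt - prev.run_cnt
    return {
        # Mean time spent per run during the interval.
        "avg_runtime_ns": delta_ns / delta_cnt if delta_cnt else 0.0,
        # How often the program ran per second.
        "events_per_s": delta_cnt / interval_s,
        # Share of one CPU's time spent in the program.
        "cpu_percent": 100.0 * delta_ns / (interval_s * 1e9),
    }


if __name__ == "__main__":
    before = Sample(timestamp_s=0.0, run_time_ns=0, run_cnt=0)
    after = Sample(timestamp_s=1.0, run_time_ns=18735, run_cnt=32)
    print(rates(before, after))
```

A program whose counters never change, or are never reported, yields no meaningful values under this scheme, which matches what is seen for the ext programs in this issue.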

I'm very curious about this; I'll dig into the library a bit more to understand it.

hhoffstaette commented 7 months ago

Thanks! If this turns out to be impossible at the moment we might need to open an issue in xdp-tools and see what Toke thinks.

mscastanho commented 7 months ago

The dispatcher created by libxdp (used by xdp-loader) declares a few global dummy functions to act as the "slots" where XDP programs can be hooked. When a new XDP program needs to be attached, libxdp replaces one of the dummy functions with the code for the XDP program being loaded using the freplace mechanism.

So when using libxdp and the dispatcher, there's a single XDP program (the dispatcher) and each additional user program will be an EXT type eBPF program attached to the dispatcher.
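
The slot arrangement described above can be pictured with a toy userspace model. This is plain Python, not eBPF, and it only mirrors the accounting seen in the bpftool output earlier in the thread; it makes no claim about how the kernel implements it. The slot count, the per-call costs and every name except `xdp_dispatcher` and `homeplug_drop` are invented for illustration.

```python
from dataclasses import dataclass

XDP_DROP, XDP_PASS = 1, 2  # values from the kernel's enum xdp_action


@dataclass
class ProgStats:
    run_time_ns: int = 0
    run_cnt: int = 0


def empty_slot(frame):
    """Placeholder occupying an unused slot: pass, at no simulated cost."""
    return XDP_PASS, 0


def homeplug_drop(frame):
    """Stand-in for an attached program; the 40 ns cost is invented."""
    verdict = XDP_DROP if frame.get("unwanted") else XDP_PASS
    return verdict, 40


class Dispatcher:
    def __init__(self, n_slots=4):  # slot count chosen arbitrarily
        self.slots = [empty_slot] * n_slots
        self.stats = {"xdp_dispatcher": ProgStats()}

    def attach(self, index, func):
        """Swap the function in a slot, loosely analogous to freplace."""
        self.slots[index] = func
        self.stats[func.__name__] = ProgStats()  # never updated below

    def run(self, frame):
        elapsed_ns = 5  # invented cost of the dispatcher's own logic
        verdict = XDP_PASS
        for slot in self.slots:
            verdict, cost_ns = slot(frame)
            elapsed_ns += cost_ns
            if verdict != XDP_PASS:  # only continue the chain on XDP_PASS
                break
        # Only the outer program's counters are touched, so the time spent
        # in attached programs is folded into the dispatcher's totals.
        entry = self.stats["xdp_dispatcher"]
        entry.run_time_ns += elapsed_ns
        entry.run_cnt += 1
        return verdict


if __name__ == "__main__":
    dispatcher = Dispatcher()
    dispatcher.attach(0, homeplug_drop)
    dispatcher.run({"unwanted": True})
    dispatcher.run({})
    print(dispatcher.stats)
```

In this model, all of the time spent in `homeplug_drop` shows up under `xdp_dispatcher`, and the attached program's own counters stay at zero.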

A bit more context on this:
https://github.com/xdp-project/xdp-tools/blob/master/lib/libxdp/protocol.org
https://ebpf-docs.dylanreimerink.nl/linux/program-type/BPF_PROG_TYPE_EXT/

jfernandez commented 7 months ago

@mscastanho, thanks for the additional context. It appears to me that these EXT type programs do not go through the bpf/trampoline.c path I linked above and thus don't get individual performance statistics. If that is the case, a kernel patch would be required to enable stats for them. I'll verify that this is the limiting factor here and investigate how feasible such a patch would be.

jfernandez commented 7 months ago

@hhoffstaette could you attach an example libxdp app that I can use to verify a few things?

hhoffstaette commented 6 months ago

My only "production" app (a simple ethernet frame filter) is for hardware that you certainly do not have, but there are many examples around:

I hope this helps.

jfernandez commented 5 months ago

I started a thread on the BPF mailing list to discuss this limitation: https://lore.kernel.org/bpf/20240407052135.n3vwjrhw22kjehrh@ubuntu/T/#u