iovisor / bcc

BCC - Tools for BPF-based Linux IO analysis, networking, monitoring, and more
Apache License 2.0

[Proposal] Support compile libbpf-tools to Wasm(WebAssembly) and get tools running from the cloud in 1 line #4306

Open yunwei37 opened 2 years ago

yunwei37 commented 2 years ago

Hi, community!

I propose that we try compiling the libbpf-tools C code to OCI-compatible WebAssembly modules, and then use a launcher to get eBPF programs running from the cloud to the kernel in one line of bash.

This is similar to what BumbleBee has done, but with the help of WebAssembly:

As far as I know, BumbleBee does not seem to support running on Arm; see "Fail installation on ARM and document supported architecture".

We have created a proof-of-concept library and a demo tool to show how to compile libbpf-based tools to Wasm and run them:

Background

What is Wasm?

WebAssembly, often shortened to Wasm, is a relatively new technology that allows you to compile application code written in more than 40 languages (including Rust, C, C++, JavaScript, and Golang) and run it inside sandboxed environments.

The original use cases were focused on running native code in web browsers, such as Figma, AutoCAD, and Photoshop. In fact, fastq.bio saw a 20x speed improvement when converting their web-based DNA sequence quality analyzer to Wasm. And Disney built their Disney+ Application Development Kit on top of Wasm! The benefits in the browser are easy to see.

But Wasm is quickly spreading beyond the browser thanks to the WebAssembly System Interface (WASI). Companies like Vercel, Fastly, Shopify, and Cloudflare support using Wasm for running code at the edge, and Fermyon is building a platform to run Wasm microservices in the cloud.

What is WASI?

The WebAssembly System Interface is not a monolithic standard system interface, but is instead a modular collection of standardized APIs. None of the APIs are required to be implemented to have a compliant runtime. Instead, host environments can choose which APIs make sense for their use cases.

The compile workflow

The workflow to compile a libbpf-tool to Wasm might look like this:

  1. Compile the libbpf-tools bpf object as usual
  2. Generate a special header file for the bpf object with a modified bpftool
  3. Include some special headers in the C code (change the include path from the original libbpf headers to the special headers)
  4. Write the C source code as usual
  5. Compile the C code to Wasm using clang with WASI support
  6. Use a launcher to load the Wasm module in user space: export some helper functions to the WebAssembly runtime, and then load the bpf object into the kernel from the wasm module with the helper functions.

What needs to be done?

What might need to be done next in libbpf-tools if we want to support a Wasm target:

  1. Define some ABI interfaces or helpers that let the WebAssembly runtime load the bpf object into the kernel and interact with it, because WASI does not have any modules for executing eBPF programs. In other words, we need to create an abstraction layer between the WebAssembly runtime and the libbpf library.
  2. Port some libraries, such as argp, to Wasm, because WASI only supports the standard C library.
  3. Create some special headers to replace the original libbpf headers. These headers convert the libbpf APIs into our ABI interfaces between libbpf and the WebAssembly runtime.
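The abstraction layer in items 1 and 3 can be pictured as a small handle-based interface: the Wasm module never holds a native libbpf pointer, only an integer handle that the launcher resolves on the host side. The sketch below is only an illustration under that assumption. The names `bpf_handle_t`, `host_bpf_load_object`, `host_bpf_attach_program`, and `host_bpf_close_object` are hypothetical and do not come from libbpf or from this proposal, and the host side is a stub table where a real launcher would call into libbpf.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical ABI between a Wasm module and the native launcher.
 * None of these names exist in libbpf; they only illustrate passing
 * integer handles across the boundary instead of native pointers. */
typedef int32_t bpf_handle_t;

#define MAX_OBJECTS 8

/* Host-side slot. A real launcher would keep a libbpf object here;
 * this stub only records that the slot is taken. */
struct host_slot {
    int      in_use;
    uint32_t obj_len;   /* size of the BPF object passed in by the module */
};

static struct host_slot host_table[MAX_OBJECTS];

static int handle_valid(bpf_handle_t h)
{
    return h >= 0 && h < MAX_OBJECTS && host_table[h].in_use;
}

/* "Load" a BPF object from a byte buffer.
 * Returns a handle >= 0 on success, -1 on failure. */
bpf_handle_t host_bpf_load_object(const void *buf, uint32_t len)
{
    if (buf == NULL || len == 0)
        return -1;
    for (int i = 0; i < MAX_OBJECTS; i++) {
        if (!host_table[i].in_use) {
            host_table[i].in_use = 1;
            host_table[i].obj_len = len;
            return i;
        }
    }
    return -1;  /* table full */
}

/* "Attach" a program by name. The stub only validates its arguments.
 * Returns 0 on success, -1 on failure. */
int host_bpf_attach_program(bpf_handle_t h, const char *prog_name)
{
    if (!handle_valid(h) || prog_name == NULL || prog_name[0] == '\0')
        return -1;
    return 0;
}

/* Release a handle. Returns 0 on success, -1 if the handle is invalid. */
int host_bpf_close_object(bpf_handle_t h)
{
    if (!handle_valid(h))
        return -1;
    memset(&host_table[h], 0, sizeof(host_table[h]));
    return 0;
}
```

In this picture, the replacement headers from item 3 would map the libbpf-style calls in a tool's C source onto functions of this shape, so the tool source itself would need few changes.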

Note: All structured data or objects passed between the Wasm world and the native (eBPF) world must be serialized.
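As a rough illustration of what such serialization could look like, the sketch below encodes an event into a fixed little-endian byte layout and decodes it again, so neither side depends on the other's struct layout. The `sig_event` struct and its field names are assumptions modeled loosely on the columns of the sigsnoop output; the wire format is made up for this example and is not what the proof-of-concept uses.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define COMM_LEN  16
#define WIRE_SIZE (4 + 4 + 4 + 4 + COMM_LEN)  /* 32 bytes on the wire */

/* Illustrative event; the fields are assumptions for this example. */
struct sig_event {
    uint32_t pid;
    uint32_t tpid;
    int32_t  sig;
    int32_t  ret;
    char     comm[COMM_LEN];
};

/* Write a 32-bit value as 4 little-endian bytes. */
static void put_u32le(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v);
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}

/* Read 4 little-endian bytes as a 32-bit value. */
static uint32_t get_u32le(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Encode into out; returns bytes written, or 0 if cap is too small. */
size_t sig_event_encode(const struct sig_event *e, uint8_t *out, size_t cap)
{
    if (cap < WIRE_SIZE)
        return 0;
    put_u32le(out + 0,  e->pid);
    put_u32le(out + 4,  e->tpid);
    put_u32le(out + 8,  (uint32_t)e->sig);
    put_u32le(out + 12, (uint32_t)e->ret);
    memcpy(out + 16, e->comm, COMM_LEN);
    return WIRE_SIZE;
}

/* Decode from in; returns 0 on success, -1 if the buffer is too short. */
int sig_event_decode(const uint8_t *in, size_t len, struct sig_event *e)
{
    if (len < WIRE_SIZE)
        return -1;
    e->pid  = get_u32le(in + 0);
    e->tpid = get_u32le(in + 4);
    e->sig  = (int32_t)get_u32le(in + 8);
    e->ret  = (int32_t)get_u32le(in + 12);
    memcpy(e->comm, in + 16, COMM_LEN);
    e->comm[COMM_LEN - 1] = '\0';  /* never trust the terminator */
    return 0;
}
```

The same encode and decode pair could be compiled for both the native launcher and the Wasm module, since it uses only fixed-width types and byte-wise access.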

Some small problems may need to be fixed, but I don't think we will need many code modifications.

Some usage examples

Taking the sigsnoop tool as an example, we can use the following command to run it from the cloud:

$ sudo ./ecli run sigsnoop  
TIME     PID     COMM             SIG       TPID    RESULT
12:47:32 2268659 node             0         2268239 0     
12:47:32 0       swapper/1        14        847     0     
12:47:32 2554426 cpptools-srv     0         2268736 0     
12:47:32 2268659 node             0         2268239 0     
12:47:32 2211204 YDService        0         2268239 0     
12:47:32 2211204 YDService        0         2268229 0     
12:47:32 2211204 YDService        0         2268185 0     
12:47:32 2211204 YDService        0         2268184 0

Or you can compile the sigsnoop tool locally to Wasm and run it:

$ sudo ./ecli run sigsnoop.wasm
Trace standard and real-time signals.

USAGE: sigsnoop [-h] [-x] [-k] [-n] [-p PID] [-s SIGNAL]

EXAMPLES:
    sigsnoop             # trace signals system-wide
    sigsnoop -k          # trace signals issued by kill syscall only
    sigsnoop -x          # trace failed signals only
    sigsnoop -p 1216     # only trace PID 1216
    sigsnoop -s 9        # only trace signal

yunwei37 commented 2 years ago

Would you please give me some feedback or suggestions? If you think it's worth trying, I can continue with more implementation work and research on this issue.

yonghong-song commented 2 years ago

What are expected changes in bcc/libbpf-tools? Just makefile options, or do you have something more? If it is just makefile options, you probably don't need to modify the bcc/libbpf-tools makefile, as you can just compile the libbpf tools with different make options. For the sake of long-term maintenance, I would like to see what changes are needed in bcc/libbpf-tools. If there are no changes needed, I think you have total freedom to create a webassembly-libbpf repo...

yunwei37 commented 2 years ago

Thanks for the suggestions!

What are expected changes in bcc/libbpf-tools?

I think I may need to make some changes in bcc/libbpf-tools:

I would also create a libbpf-wasm repo outside of bcc/libbpf-tools, which may contain the libbpf C headers modified for porting libbpf to Wasm, so this part would have nothing to do with the bcc repo.

The main difficulty in combining Wasm and eBPF may be that the memory layout of Wasm differs from that of eBPF programs, so C structs cannot be mapped directly and any structure transferred between the two must be serialized.

For example, the event submitted in the kernel:

    bpf_perf_event_output(ctx, &events, BPF_F_CURRENT_CPU, &data, sizeof(data));

The event cannot be read directly in the user-space Wasm VM like this, even though both the Wasm code and the eBPF code are compiled from C:

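static void handle_event(void *ctx, int cpu, void *data, __u32 data_size)
{
    /* data points at the raw event as laid out by the kernel side */
    struct str_t *e = data;
    struct tm *tm;
    char ts[16];
    time_t t;

    /* format the current local time as HH:MM:SS (needs <time.h>) */
    time(&t);
    tm = localtime(&t);
    strftime(ts, sizeof(ts), "%H:%M:%S", tm);
    printf("%-9s %-7d %s\n", ts, e->pid, e->str);
}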
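One possible way to make a handler like the one above safer on the Wasm side is to stop casting the incoming pointer and copy the bytes into a struct whose layout is pinned down with fixed-width types and compile-time checks. The sketch below assumes an event with a 32-bit pid followed by an 80-byte string; `str_event`, `MAX_LINE_SIZE`, and `format_event` are hypothetical names for this illustration, and the real `str_t` layout is not shown in this thread.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define MAX_LINE_SIZE 80  /* assumed string capacity for this sketch */

/* Event mirror built only from fixed-width types, so its layout does
 * not depend on the size of long or of pointers on either side. */
struct str_event {
    uint32_t pid;
    char     str[MAX_LINE_SIZE];
};

/* Fail the build if a compiler lays the struct out differently. */
_Static_assert(offsetof(struct str_event, pid) == 0, "pid offset");
_Static_assert(offsetof(struct str_event, str) == 4, "str offset");
_Static_assert(sizeof(struct str_event) == 84, "struct size");

/* Copy the raw event out of the buffer and format one output line.
 * Returns the snprintf result, or -1 if the buffer is too short. */
int format_event(const void *data, uint32_t data_size,
                 char *out, size_t out_cap)
{
    struct str_event e;

    if (data == NULL || data_size < sizeof(e))
        return -1;
    memcpy(&e, data, sizeof(e));      /* no aliasing or alignment assumptions */
    e.str[MAX_LINE_SIZE - 1] = '\0';  /* force termination before printing */
    return snprintf(out, out_cap, "%-7u %s", (unsigned)e.pid, e.str);
}
```

Checking `data_size` before copying matters here because the buffer originates outside the Wasm sandbox, and the static assertions turn a silent layout mismatch into a compile error.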

I will prepare a more detailed design and some draft implementations later. Thanks for your time!

krisztianfekete commented 1 year ago

Hey, @yunwei37, I am the author of that draft PR (there's another one that's already merged) and a BumbleBee contributor. Let me know if there's any update on this idea, I like it!

yunwei37 commented 1 year ago

This is a minimal working runtime and toolchain example based on libbpf and WAMR, with only 300+ lines in the runtime host implementation:

 https://github.com/eunomia-bpf/wasm-bpf

A WebAssembly eBPF library and runtime powered by CO-RE (Compile Once – Run Everywhere) libbpf and WAMR.