Open MOZGIII opened 4 years ago
Interesting. Can you elaborate more on a specific use case? Is it just a safe runtime or are there specific applications where this would be used?
I'm not quite envisioning the benefits of the `ebpf` transform vs `ebpf` source. @MOZGIII Could you provide some concrete use cases?
I think the `ebpf` source fits better into Vector's paradigm. As @MOZGIII already pointed out, users could deliver their compiled eBPF ELF objects to Vector. A typical eBPF flow consists of intercepting certain symbols in the kernel, though it is not limited to that. The eBPF maps are then populated with observability data that can be consumed from userspace.
So, in my opinion, once the bytecode is accepted by the verifier, Vector could start polling the eBPF maps and producing events. Most of the heavy lifting, such as defining maps and loading eBPF programs, is done through the `bpf` syscall.
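The polling step described above can be sketched roughly as follows. This is only an illustration under loud assumptions: the eBPF map is stood in for by a plain Python dict of counters, and `poll_once`, `diff_map` and the event shape are hypothetical names, not part of Vector or of any kernel API.

```python
from typing import Callable, Dict, Iterator, List, Tuple

Snapshot = Dict[str, int]


def diff_map(previous: Snapshot, current: Snapshot) -> Iterator[dict]:
    """Yield one event per key whose counter changed since the last poll."""
    for key, value in current.items():
        delta = value - previous.get(key, 0)
        if delta:
            yield {"key": key, "value": value, "delta": delta}


def poll_once(
    read_map: Callable[[], Snapshot], previous: Snapshot
) -> Tuple[List[dict], Snapshot]:
    """Read the (stand-in) map once and turn the changes into events.

    `read_map` is a hypothetical callback; a real source would read the
    map through the `bpf` syscall instead.
    """
    current = dict(read_map())  # copy, so later map updates do not alter the snapshot
    events = list(diff_map(previous, current))
    return events, current
```

A source built this way would call `poll_once` on a timer, keep the returned snapshot, and forward the events downstream.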
A few observability signals that can be collected by ebpf:
The `ebpf` source is a different topic. It is also an excellent thing to add, but the idea there, as correctly explained in the previous message by @rabbitstack, is that we'll be running user-provided bytecode in the Linux kernel eBPF VM. We'll be using Linux kernel eBPF helpers and API. This provides access to the Linux kernel observability signals.
The `ebpf` transform is about running user code on a Vector-managed VM, rather than a kernel-managed VM. We'll provide our own, different API. It's much more similar to the `wasm`, `lua` or `remap` transforms - users will write an eBPF program that takes an event - obtained elsewhere - and does something to it. Since this will be our VM, not the kernel VM, we'll be exposing a different API and entry points, suited for event-transformation needs. This has nothing to do with the Linux kernel eBPF implementation, other than that the bytecode format is the same. Different runtime, different API, different purposes.
A concrete example - an eBPF program that takes an event and adds a `"hello": "world"` field to it.
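To make that example a bit more tangible, here is a rough host-side model of it, written in Python rather than as eBPF bytecode. Everything in it is an assumption for illustration: `EventContext`, the `set_field` helper, the `PASS`/`DROP` return codes and `run_transform` are hypothetical and only hint at the kind of API a Vector-managed VM might expose.

```python
from typing import Callable, Dict, Optional

Event = Dict[str, object]

PASS, DROP = 0, 1  # hypothetical return codes of a transform program


class EventContext:
    """Hypothetical handle through which a program may touch the event."""

    def __init__(self, event: Event) -> None:
        self._event = dict(event)  # work on a copy of the input event

    def set_field(self, key: str, value: object) -> None:
        """Hypothetical helper the VM would expose to user programs."""
        self._event[key] = value

    def into_event(self) -> Event:
        return self._event


def hello_world_program(ctx: EventContext) -> int:
    """Stand-in for the user's program: add "hello": "world" and pass."""
    ctx.set_field("hello", "world")
    return PASS


def run_transform(
    program: Callable[[EventContext], int], event: Event
) -> Optional[Event]:
    """Run one program over one event; None means the event was dropped."""
    ctx = EventContext(event)
    return ctx.into_event() if program(ctx) == PASS else None
```

The point of the sketch is the shape of the interaction: the program never sees kernel helpers, only whatever event-oriented helpers the transform chooses to expose.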
Of course, the kernel eBPF VM and the `ebpf` source enable us to access unique data, and are thus, again, very nice to have in Vector.
However, this issue is about a different thing. It's very easy to get confused, since both matters are about eBPF, but the applications and the core concepts are very different.
Essentially, we want both - an `ebpf` source to get the data from the kernel and user-land hooks, and the `ebpf` transform for highly efficient in-Vector event transformation. Looks like we need to work on explaining the differences between the two concepts better - both internally and to the users in the docs.
Thanks for clarifying, @MOZGIII. You're definitely proposing a transform for executing the ebpf instruction set in userspace. I believe the https://github.com/qmonnet/rbpf crate has the building blocks for achieving this.
We already have `WASM` support; it would be great to also add support for a similar technology - eBPF.
A good introduction to eBPF in the context of the Linux kernel is available at LWN.
eBPF is a virtual machine that runs eBPF bytecode. It's conceptually similar to WASM, but the details are very different.
One of the important aspects of eBPF is that, because it was originally designed for network packet filtering, it has some properties that suit us very well too: the eBPF bytecode can be verified to be loop-free, and thus to never lock up the transform execution. This is an amazing guarantee to have. An implementation of the eBPF VM can also be quite performant, especially for a use case like our transform, which is very similar to the original task that eBPF was designed for - packet filtering. They had pretty much all the same headaches that we do - given a packet (in our case - an event) they needed to either pass it through, drop it or alter it, plus optionally emit some new packets and, also optionally, keep some state around. Sounds familiar, doesn't it!
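As a rough illustration of the loop-free guarantee, the sketch below applies the simplest possible rule: every branch must jump strictly forward, so the program counter can only increase and execution has to end. It assumes little-endian bytecode in the standard 8-byte instruction layout (opcode, registers, 16-bit offset, 32-bit immediate). `is_loop_free` is a hypothetical name, and this is deliberately not how the Linux kernel verifier or any particular userspace VM does it; it ignores function calls, wide instructions and the newer jump encodings.

```python
import struct

BPF_JMP, BPF_JMP32 = 0x05, 0x06  # instruction classes (low 3 bits of the opcode)
BPF_CALL, BPF_EXIT = 0x85, 0x95  # JMP-class opcodes that are not branches


def is_loop_free(bytecode: bytes) -> bool:
    """Return True if every branch in the program jumps strictly forward.

    Simplified sketch: calls are not followed, and jump targets are only
    checked for direction and bounds.
    """
    if len(bytecode) == 0 or len(bytecode) % 8 != 0:
        return False  # eBPF instructions come in 8-byte slots
    count = len(bytecode) // 8
    for pc in range(count):
        # Layout: u8 opcode, u8 dst/src registers, i16 offset, i32 immediate.
        opcode, _regs, offset, _imm = struct.unpack_from("<BBhi", bytecode, pc * 8)
        if opcode & 0x07 not in (BPF_JMP, BPF_JMP32):
            continue  # not a jump-class instruction
        if opcode in (BPF_CALL, BPF_EXIT):
            continue  # helper call and program exit do not branch
        target = pc + 1 + offset  # offsets are relative to the next instruction
        if target <= pc or target >= count:
            return False  # backward (or self) jump, or jump out of bounds
    return True
```

A transform could run a check like this once at load time and refuse to start if it fails, which keeps the per-event path free of any timeout or watchdog logic.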
In our implementation, it'll look something like this:

- [...] `vector`, and add `ebpf` transform to the `vector` config pointing to that file
- `vector` will verify the bytecode, and if it's valid and not rejected - run the eBPF VM with user-provided code and pass the events through

Related to: #3589
Compared to #3589, this is not about using Linux's eBPF VM and running our code there to gather observability data, but about running an eBPF VM in `vector`, and using it as a transform.