qmonnet / rbpf

Rust virtual machine and JIT compiler for eBPF programs
Apache License 2.0
922 stars 235 forks source link

Add a JIT based on Cranelift #86

Closed afonso360 closed 1 year ago

afonso360 commented 1 year ago

👋 Hey,

I've been playing around with translating eBPF and running it with Cranelift as a weekend project. Cranelift is a Rust based optimizing compiler that includes a JIT.

This is a prototype and it is still a bit rough around the edges, but maybe you'd be interested in having this upstream?

This can also help with #21 and #48 since Cranelift supports x86_64, AArch64, s390x and RISC-V, as well as Windows, Linux, MacOS and some BSD's.

(I accidentally ran rustfmt on src/lib.rs, but If you'd like me to polish this PR and upstream it, I'll revert that change)

qmonnet commented 1 year ago

Thanks and apologies for the delay, I'll try to find some time to look into this soon :pray:

qmonnet commented 1 year ago

Personal business has kept me mostly away from GitHub last week... I'll do my best this week, apologies again

qmonnet commented 1 year ago

So this is a lot of work, thanks for this PR!

To be honest with you, I'm not really sure what to do of it. We already have a JIT-compiler in rbpf, and I'm not looking forwards to maintain a separate implementation. This being said, support for the various architectures and platforms you mentioned is definitely something interesting.

I'm thinking of accepting the PR, especially if you use it on your side, but I'd have the feature opt-in rather than opt-out (my understanding is that default = ["cranelift"] in the manifest currently enables it by default, right?). Would it be OK for you? I hope I haven't killed your motivation with the review delay, apologies once again.

I've got a few questions:

Some observations:

afonso360 commented 1 year ago

Hey, Thanks for reviewing this!

We already have a JIT-compiler in rbpf, and I'm not looking forwards to maintain a separate implementation.

That is completely reasonable, and mostly the reason I posted such an early draft (with build warnings!).

This really is a weekend project, and I used it to learn a bit more about how eBPF works. I don't currently use it anywhere or have any real plans for using it.

I'm thinking of accepting the PR, especially if you use it on your side, but I'd have the feature opt-in rather than opt-out (my understanding is that default = ["cranelift"] in the manifest currently enables it by default, right?). Would it be OK for you?

Yeah, that makes sense.

You mentioned a "weekend project". Do you have a concrete use case for this PR, or was it simply an experiment?

Not really, it really was just an experiment into learning how eBPF works.

How does it compare with the existing JIT, in particular, is it at feature parity? (I can check as well, but haven't been through all the code and tests yet).

I don't think we are at feature parity.

The test file for cranelift is pretty much a copy from the existing JIT's tests and we pass all of those, however the cranelift JIT doesn't yet implement the mbuf load/store instructions, that being said, it shouldn't be very difficult to add them.

From reading the existing JIT I think the only extra feature we implement is that we do bounds checks on all loads and stores. Additionally we use some of the existing cranelift features such as constant propagation and some other optimizations.

This mostly means that some of the test cases end up being optimized down to a couple of instructions. (As an aside this ended up revealing a missing optimization in cranelift!)

What platform/arch have you tested so far?

Natively I've only tested on x86_64 Linux. I've also ran the testsuite for AArch64 and RISC-V under QEMU.

RISC-V failed due to an instruction not yet being implemented on cranelift, but that has since been fixed in the latest version and the testsuite passes with that version.

I haven't tested s390x, or any other OS. That being said, they should be supported.

Some observations:

Those make sense! This really is a sort of early draft which I was planning on fixing up if you decide you'd like this upstreamed. Let me know!

And I completely understand that this might be too much of a maintenance burden! Especially for something that already pretty much exists.

qmonnet commented 1 year ago

Thanks for the details! Nice to have validation on several architectures already :muscle:.

OK let's go for an opt-in feature then. I've also thought of keeping it in a separate branch, but the project is not active enough and the feature would likely end up dying there. At least in main it can be discovered and tested by users.

Would be nice to get the load/stores with the mbuf too in that case. And good to have some additional optimisations, I must admit I haven't spent time looking into that.

One other question, I remember seeing that you build the CFG for the program at some point in the code. What is it for?

As an aside this ended up https://github.com/bytecodealliance/wasmtime/pull/6683!

Pretty neat!

afonso360 commented 1 year ago

👍 Sounds good, I'll clean it up a bit and add those missing features.

One other question, I remember seeing that you build the CFG for the program at some point in the code. What is it for?

Cranelift's IR builder does not allow us to go back and edit a block after the branch / terminator instruction has been inserted (Called a "filled block" in cranelift's terminology). So what happens is that when we have a backwards branch into the middle of a previous block, we no longer can go back, split that previous block in two and emit a jump into it.

So I've built a two pass solution, first we scan and mark all offsets where we either land from another instruction, or we jump into another instruction, and emit the IR blocks according to that. When translating the instructions into IR, we switch blocks according to that previous scan.

afonso360 commented 1 year ago

I've cleaned up this PR a little bit. I've also made the following changes:

I'd appreciate a review especially on the mbuf handling part, which I'm not entirely sure is correct. We store either the mbuf or the mem pointer in r1 based on the length argument, which is what seems to be done in the interpreter. But the existing JIT does something slightly more complicated which I'm not sure I fully understand.

Edit: There are also quite a few WIP commits on this branch, should I squash this into a single commit?

qmonnet commented 1 year ago

Apologies for the delay, again... This time I must admit that I completely forgot to come back to this during the week :grimacing: but I've started, now.

There are also quite a few WIP commits on this branch, should I squash this into a single commit?

Not necessarily a single one, ideally logical commits (New JIT addition, adding the feature, enabling in CI, ...).

CI is happy, this is great!

afonso360 commented 1 year ago

Apologies for the delay, again... This time I must admit that I completely forgot to come back to this during the week 😬 but I've started, now.

No worries! I'm not in a hurry with this, so take as long as you'd like.

Not necessarily a single one, ideally logical commits (New JIT addition, adding the feature, enabling in CI, ...).

Cool, I'll rework those commits into something slightly more reasonable.

qmonnet commented 1 year ago

No worries! I'm not in a hurry with this, so take as long as you'd like.

I'm taking you at your word :sweat_smile:. This time I started to look more into it though.

My main comment would be that we could clean up the commits a bit more - I started to comment on the PR, but I realised I was maybe being too picky and imposing too much work on you. So instead I cleaned it up on my side and created another branch (I didn't want to overwrite yours).

So would you mind checking whether https://github.com/qmonnet/rbpf/tree/pr/cranelift-jit-reworked is OK with you, and if so, updating the current PR? Changes include:

I think that's about it. I checked that the tests are still passing at each commit, so hopefully I didn't break your PR too much while reorganising the contents between the different commits :slightly_smiling_face:.

I note there's a test which is commented out in tests/cranelift.rs, is there a particular reason for that?

afonso360 commented 1 year ago

My main comment would be that we could clean up the commits a bit more - I started to comment on the PR, but I realised I was maybe being too picky and imposing too much work on you. So instead I cleaned it up on my side and created another branch (I didn't want to overwrite yours).

That's awesome! Thanks! Your branch looks great to me!

I note there's a test which is commented out in tests/cranelift.rs, is there a particular reason for that?

Not really, I just kinda forgot about it 🙃 . I've now updated commit 83146ebd6569dad431c4115de604f2293ba0b2c6 to include that test, and also some better error messages when we don't find the required helper.

afonso360 commented 1 year ago

OK, I've been through the changes one more time and I think we're good 🎉

🎉

Do you have any more changes to bring to the PR, or shall I merge it?

Not right now, I might send a follow up PR soon to update the cranelift version when the new one is out. But It's probably not worth waiting for a release.

qmonnet commented 1 year ago

I might send a follow up PR soon to update the cranelift version when the new one is out. But It's probably not worth waiting for a release.

OK. No need to update for the sake of it though, no need to update if there's no benefit in it (perf improvement, relevant security fixes, etc.).

We should also update the REAMDE.md at some point to mention the feature and document the API functions. I can look into it when I find the time, if you don't beat me to it.