retis-org / retis

Tracing packets in the Linux networking stack & friends
https://retis.readthedocs.io/en/stable/
100 stars 14 forks source link

Use bindgen for common structures and types #419

Closed vlrpl closed 3 weeks ago

vlrpl commented 3 months ago

So far we manage common definitions by keeping them aligned manually. This is inconvenient other than error-prone.

This PR is far from being complete and only aims to be a proposal to improve this aspect. The current strategy consists of annotating types we need to redefine in Rust and generating the Rust types through the new gen-bindings target. The general idea is to commit those generated bindings, but it's perfectly possible to refresh them at every fresh build (in this case they should not live under src/ but under something like env!("OUT_DIR") or any other .gitignored dir.

Notes:

atenart commented 3 months ago

This PR is far from being complete and only aims to be a proposal to improve this aspect.

nit: Could you mark it as "RFC" so it doesn't get merged before being complete?

The current strategy consists of annotating types we need to redefine in Rust and generating the Rust types through the new gen-bindings target. The general idea is to commit those generated bindings, but it's perfectly possible to refresh them at every fresh build (in this case they should not live under src/ but under something like env!("OUT_DIR") or any other .gitignored dir.

I think having a separate make target and including the generated bindings in the sources is a good approach, as it makes the build to be 100% reproducible and allows to review the generated bindings before merging.

* `bindgen` recursively generates bindings for `--allowlist-item`, so if another include defines a `__binding` that is referenced in another header file, the type will be generated twice.

Does that means if types A and B are __binding and B is also a field in A, the binding for B will be generated twice?

* `bindgen` cannot be fed with multiple C header files (a wrapping header that includes multiple ones is required for this), nor can it append new definitions into an already existing file.

I guess we can come up with a expectation on where should be defined types used for the BPF <> Rust communication. Like what we do for BPF hooks and BPF includes. Eg. not a single header but expected paths/files that we can either use directly or include somewhere automatically when generating bindings.

Some high level comments:

Overall I think this will be a nice improvement.

vlrpl commented 3 months ago

This PR is far from being complete and only aims to be a proposal to improve this aspect.

nit: Could you mark it as "RFC" so it doesn't get merged before being complete?

The current strategy consists of annotating types we need to redefine in Rust and generating the Rust types through the new gen-bindings target. The general idea is to commit those generated bindings, but it's perfectly possible to refresh them at every fresh build (in this case they should not live under src/ but under something like env!("OUT_DIR") or any other .gitignored dir.

I think having a separate make target and including the generated bindings in the sources is a good approach, as it makes the build to be 100% reproducible and allows to review the generated bindings before merging.

That's the reason I was considering this approach. The only downside is that the clang args passed to bindgen include the arch type (for our vmlinux.h), so we might end up generating arch-specific bindings which by all means will be identical. I'm not sure this is really a bad thing though.

* `bindgen` recursively generates bindings for `--allowlist-item`, so if another include defines a `__binding` that is referenced in another header file, the type will be generated twice.

Does that means if types A and B are __binding and B is also a field in A, the binding for B will be generated twice?

yes, twice but in two separate files. E.g. given

/foo/path/bpf/fileA.c::BindingA and /foobar/path/bpf/fileB.c::BindingB we'll end up having:

If we don't annotate BindingB, only fileA_uapi.rs should be generated with both types.

* `bindgen` cannot be fed with multiple C header files (a wrapping header that includes multiple ones is required for this), nor can it append new definitions into an already existing file.

I guess we can come up with a expectation on where should be defined types used for the BPF <> Rust communication. Like what we do for BPF hooks and BPF includes. Eg. not a single header but expected paths/files that we can either use directly or include somewhere automatically when generating bindings.

Some high level comments:

* Having annotations and a script to get the list of what types should be generated is nice. I would also be OK with an explicitly defined list somewhere (could be in the `Makefile`).

I thought about a list, but if the script doesn't show significant drawbacks, it's probably handier from the dev standpoint. One limitation the approach has is that we might probably need to turn #define into const ... as of course we cannot annotate macros.

* I don't think that will be an issue, but could you check with [Benchmark: build the raw event instead of using a static slice #412](https://github.com/retis-org/retis/pull/412) to see if there is any incompatibility between the two efforts?

yep, makes sense to have a deeper look at it

* Not really related, but IMO this show why we should get rid of `ebpf.mk` and have everything in the top-level `Makefile`.

Overall I think this will be a nice improvement.

atenart commented 3 months ago

The current strategy consists of annotating types we need to redefine in Rust and generating the Rust types through the new gen-bindings target. The general idea is to commit those generated bindings, but it's perfectly possible to refresh them at every fresh build (in this case they should not live under src/ but under something like env!("OUT_DIR") or any other .gitignored dir.

I think having a separate make target and including the generated bindings in the sources is a good approach, as it makes the build to be 100% reproducible and allows to review the generated bindings before merging.

That's the reason I was considering this approach. The only downside is that the clang args passed to bindgen include the arch type (for our vmlinux.h), so we might end up generating arch-specific bindings which by all means will be identical. I'm not sure this is really a bad thing though.

IMO we should try to avoid having architecture-specific bindings. The reason we have different vmlinux.h files is given the amount of things defined in there, there are some architecture-specific ones. But for our raw events, I think we can try to only use sized types and then only generate the bindings once. I'm not sure if we can automatically make bindgen fail non problematic types, but in any way we should be able to see this at review time / see a failure in the CI.

* `bindgen` recursively generates bindings for `--allowlist-item`, so if another include defines a `__binding` that is referenced in another header file, the type will be generated twice.

Does that means if types A and B are __binding and B is also a field in A, the binding for B will be generated twice?

yes, twice but in two separate files. E.g. given

/foo/path/bpf/fileA.c::BindingA and /foobar/path/bpf/fileB.c::BindingB we'll end up having:

* `/foo/path/fileA_uapi.rs` with both `BindingA` and `BindingB`

* `/foo/path/fileB_uapi.rs` with `BindingB` only

If we don't annotate BindingB, only fileA_uapi.rs should be generated with both types.

OK. I guess including a __binding into another one is not something we'll often see. We can live with that.

vlrpl commented 2 months ago

An older PR manipulates the data structures and also introduces a macro to unify raw events. The macro has to be essentially reverted, but the number of conflicts will make me very happy when I'll be fixing all the problems :) That being said, I'm in favor of getting rid of the packed attribute (repr(c) is enough to ensure stability and a careful arrangement significantly reduces padding). There's an old PR that did that and also used zerocopy for the conversion (but I'll not revive it anytime soon as this is already boring enough :)

That aside, this is half of an RFC, but there are a couple of things I would like to clarify before declaring it official. ATM, given a file called bpf/a.{bpf.c,h} we generate a_uapi.rs essentially creating it one level above bpf/. We then include those bindings with include_binding!() and consume the content. This leads to many scattered files around (we have a lot of common data that makes the conversion extremely annoying :). I'm unsure whether this is the best approach or not. Probably, having some directory under src/bindings/ would help to make things a little cleaner, but we should play well while exporting them (not a big deal anyway, even if we export everything crate-wide).

Any preference/opinion is welcome.

vlrpl commented 2 months ago

The bindings/ approach was preferred and proved to be good. Most of the common code got converted in this PR, but things are getting unmanageable so it's easy enough for anyone to migrate the remaining code in a follow-up.

The bindings are not arch-dependent, so they can be generated from any at the moment. Please be careful potentially requesting changes that can give birth to a chain of conflicts (unless, of course, they are important :)

This turned out to be needed, but it's extremely boring (just wanted to reiterate it :D)

vlrpl commented 2 months ago

The current strategy consists of annotating types we need to redefine in Rust and generating the Rust types through the new gen-bindings target. The general idea is to commit those generated bindings, but it's perfectly possible to refresh them at every fresh build (in this case they should not live under src/ but under something like env!("OUT_DIR") or any other .gitignored dir.

I think having a separate make target and including the generated bindings in the sources is a good approach, as it makes the build to be 100% reproducible and allows to review the generated bindings before merging.

That's the reason I was considering this approach. The only downside is that the clang args passed to bindgen include the arch type (for our vmlinux.h), so we might end up generating arch-specific bindings which by all means will be identical. I'm not sure this is really a bad thing though.

IMO we should try to avoid having architecture-specific bindings. The reason we have different vmlinux.h files is given the amount of things defined in there, there are some architecture-specific ones. But for our raw events, I think we can try to only use sized types and then only generate the bindings once. I'm not sure if we can automatically make bindgen fail non problematic types, but in any way we should be able to see this at review time / see a failure in the CI.

of course, parameters and a bunch of other stuff. As long as we don't use architecture-specific flavors of data structures we should be fine here. We just need to generate them and commit regardless of the host's arch.

* `bindgen` recursively generates bindings for `--allowlist-item`, so if another include defines a `__binding` that is referenced in another header file, the type will be generated twice.

Does that means if types A and B are __binding and B is also a field in A, the binding for B will be generated twice?

yes, twice but in two separate files. E.g. given /foo/path/bpf/fileA.c::BindingA and /foobar/path/bpf/fileB.c::BindingB we'll end up having:

* `/foo/path/fileA_uapi.rs` with both `BindingA` and `BindingB`

* `/foo/path/fileB_uapi.rs` with `BindingB` only

If we don't annotate BindingB, only fileA_uapi.rs should be generated with both types.

OK. I guess including a __binding into another one is not something we'll often see. We can live with that.

vlrpl commented 1 month ago

rebased on top of current main

vlrpl commented 1 month ago

yet another rebase-on-top-of-current-main fun :)

vlrpl commented 1 month ago

Very nice!

thanks for taking a look.

I think we now should add a step in the CI to make sure no UAPI change in the BPF side got unnoticed and not propagated to bindgen Rust definitions.

The reason I didn't include it (at least so far) was mostly because different versions of bindgen may result in [slightly] different code generation. This would force us to generate the bindings following the same version the CI uses (or check with the same version we used to generate them). The latter is probably better, but I'm unsure how straightforward/convenient this is.

nit: commit title "nft: declare and generate bindingsk" has a small typo.

atenart commented 1 month ago

I think we now should add a step in the CI to make sure no UAPI change in the BPF side got unnoticed and not propagated to bindgen Rust definitions.

The reason I didn't include it (at least so far) was mostly because different versions of bindgen may result in [slightly] different code generation. This would force us to generate the bindings following the same version the CI uses (or check with the same version we used to generate them). The latter is probably better, but I'm unsure how straightforward/convenient this is.

I see. Maybe forcing to update the bindings in such case would be OK though. It's easy to do when the above happens and helps preventing issues in the UAPI?

vlrpl commented 1 month ago

I think we now should add a step in the CI to make sure no UAPI change in the BPF side got unnoticed and not propagated to bindgen Rust definitions.

The reason I didn't include it (at least so far) was mostly because different versions of bindgen may result in [slightly] different code generation. This would force us to generate the bindings following the same version the CI uses (or check with the same version we used to generate them). The latter is probably better, but I'm unsure how straightforward/convenient this is.

I see. Maybe forcing to update the bindings in such case would be OK though. It's easy to do when the above happens and helps preventing issues in the UAPI?

I'm not sure. Our bindings are fairly simple and should be stable. Updating once in a while should be ok (maybe using some stable version?).

If the container (current or future) ships a different version (and no real changes are present) we'd end up failing because of:

-/* automatically generated by rust-bindgen 0.69.4 */
+/* automatically generated by rust-bindgen 0.70.1 */

and every time the version changes this will happen. Of course, we can exclude such a false positive, but I would avoid chasing this kind of CI breakage because of potential false positive (like a typo, or different wording in the comment emitted by bindgen :) unless this is really beneficial (actual bugs are prevented/fixed).

I guess we could use the +stable toolchain (assuming bindgen-cli version follows +stable), but +stable has to be part of rust:bookworm (I don't know if it's the case currently), otherwise we should install it manually and then use it for our gen-binding target.

atenart commented 1 month ago

OK, let's no do this for now. The version strings is not ideal as you noted.

vlrpl commented 1 month ago

no further updated planned for this PR

atenart commented 1 month ago

Actually just found an issue which is not in main. Loading all (but ovs) collectors, I get the following error now:

ERROR [eBPF] Failed to get event section: no space left
vlrpl commented 4 weeks ago

Actually just found an issue which is not in main. Loading all (but ovs) collectors, I get the following error now:

ERROR [eBPF] Failed to get event section: no space left

the message is correct, and this happens because we hit event data exhaustion. There's a way to solve this without increasing the data size, but this will be unavoidable if we add more sections.

vlrpl commented 4 weeks ago

Pushed for a run of CI. Some final work is required for the new alignment part.

vlrpl commented 3 weeks ago

I went for the split, the risk of conflicts is fairly high with this.