libbpf / libbpf-rs

Minimal and opinionated eBPF tooling for the Rust ecosystem
Other
741 stars 129 forks source link

Allow for cloning or owning of major data types #448

Closed GermanCoding closed 1 year ago

GermanCoding commented 1 year ago

Hello,

This is more of an API design question, though I couldn't find any contact information for a discussion. Thus I'm writing this as an issue.

I'm currently evaluating libbpf-rs for production usage. I've previously looked at Aya, but it's lacking BTF support - which is increasingly becoming more and more important.

While trying out a prototype conversion from Aya to libbpf-rs, the most pain I have is that libbpf-rs uses a lot of references. For many structs, it appears impossible to obtain ownership of them.

For example, if I want to work with a map, I can call libbpf_rs::Object::map_mut(), which gives me a mutable reference to that map. I can then add, remove or lookup elements. This is fine for small projects, i.e. if all of my program logic is in main().

However, on a larger codebase, I'm working with a number of maps and I possibly want references to multiple maps at the same time. Due to Rusts borrow checker, this is impossible with libbpf-rs: I cannot call map_mut() twice (without dropping the map reference first), since that would require two mutable borrows of the Object.

This makes working with maps really painful, as I'm pretty much forced to pass around an Arc<Mutex<Object>>-style object to have multiple ownership of the Object and then get temporary references to the maps and drop them immediately afterwards.

I found a small hack to circumvent this: I can effectively clone a map by doing something like this: libbpf_rs::Map::from_map_id(object.info()?.info.id)? (i.e. creating a new Map from an existing map ID). This feels like a hack around the underlying problem though - if this were intended API usage, Map should implement Clone by itself. The omission of Clone on pretty much everything feels like it was intentional. But I'm having trouble on designing a clean API around libbpf-rs, since it's heavy usage of references makes it difficult to juggle multiple programs, maps and stuff around.

I've seen that libbpf-rs currently has this statement in its code for the Bpf object:

/// An `Object` is logically in charge of all the contained [`Program`]s and [`Map`]s as well as
/// the associated metadata and runtime state that underpins the userspace portions of BPF program
/// execution. As a libbpf-rs user, you must keep the `Object` alive during the entire lifetime
/// of your interaction with anything inside the `Object`.
///
/// Note that this is an explanation of the motivation -- Rust's lifetime system should already be
/// enforcing this invariant.

This makes complete sense. By only allowing references we can ensure that i.e. maps do not outlive the BPF object. However, it also disallows a lot of API designs and removes flexibility.

Aya for example also uses a similar approach by default. However, Aya has functions like take_map that return owned maps. Logically this transfers ownership from the Bpf Object to the caller, who is now responsible for managing the map's lifetime: Once the map struct is dropped, the map is gone forever. This allows for complex programs, where different modules can be in charge of a given map, instead of centralizing everything around a single Bpf Object.

anakryiko commented 1 year ago

I don't think it's good to really transfer ownership of map or prog out of bpf_object itself, because that goes strictly against ownership model in libbpf. But taking https://github.com/libbpf/libbpf-rs/pull/441 into account and this issue, I think it might make sense to have some sort of lightweight "map view/handle" (and similarly for program) that would be disassociated from Object itself and would contain FD and some minimal information internally to allow to

If both these handle structs and real Object's map/prog could implement the same trait for the above operations, you could use both real map/progs and such handles interchangeably for some generic code.

Similarly for https://github.com/libbpf/libbpf-rs/pull/441, instead of pretending to reconstruct full-fledged OpenMap and OpenProgram, would it be safer (from future API evolution) to only allow to reconstruct map/prog handle from ID, not OpenMap/OpenProgram?

@danielocfb thoughts?

danielocfb commented 1 year ago

However, on a larger codebase, I'm working with a number of maps and I possibly want references to multiple maps at the same time. Due to Rusts borrow checker, this is impossible with libbpf-rs: I cannot call map_mut() twice (without dropping the map reference first), since that would require two mutable borrows of the Object.

Almost no operations on a Map require a mutable self reference. Would it be possible to use Object::map instead of Object::map_mut? Or do you specifically require pinning?

danielocfb commented 1 year ago

I don't think it's good to really transfer ownership of map or prog out of bpf_object itself, because that goes strictly against ownership model in libbpf. But taking #441 into account and this issue, I think it might make sense to have some sort of lightweight "map view/handle" (and similarly for program) that would be disassociated from Object itself and would contain FD and some minimal information internally to allow to

  • for maps, do operations like lookup, update, delete, etc
  • for progs, attachment primarily, but if there is anything else interesting, that's fine as well.

If both these handle structs and real Object's map/prog could implement the same trait for the above operations, you could use both real map/progs and such handles interchangeably for some generic code.

It honestly seems over engineered to me. If the proposed lightweight objects don't carry around a reference to the "original" Object instance, then what we end up creating is, from the sounds of it, a version of std::rc::Weak that is specific to Map. All methods would now be carrying an additional error path for when the original object got dropped. And we'd have a bunch of traits floating around.

If the above suggestion of using Object::map is insufficient, then can you please clarify why that is exactly? Currently we are talking about references as the problem, but really what I am reading is that exclusivity is. So we are mixing two concerns and it is not exactly clear to me which is which.

If references themselves are the problem, then perhaps we can think about storing Rc objects directly inside Object. But let's cross that bridge if everything else fails.

I suspect Program is a bit different than Map and I haven't checked whether it would be sane to remove the &mut self requirement there, as we did for Maps.

GermanCoding commented 1 year ago

However, on a larger codebase, I'm working with a number of maps and I possibly want references to multiple maps at the same time. Due to Rusts borrow checker, this is impossible with libbpf-rs: I cannot call map_mut() twice (without dropping the map reference first), since that would require two mutable borrows of the Object.

Almost no operations on a Map require a mutable self reference. So why not just use Object::map instead of Object::map_mut?

In my mental model I had of Map, I had mistakenly assumed that update() operations required &mut self. I just verified this and realized that this isn't actually the case, sorry. I'm mostly interested in lookup/update/delete, so indeed I could probably live with just taking an immutable reference.

By only taking immutable references, I can indeed have multiple maps floating around at the same time. That's something I hadn't considered before. This does seem to solve the issue of exclusivity, at least for the use case I'm looking for. I will have to look at this more closely tomorrow though, so far I only had a quick look.

As for the references, I'm concerned that I have to assure the correct lifetimes of the map references. I will have to check whether I can live with that. My main concern here is that if I want to fully modularize a program where each module is managing/using a map or two, it can get difficult to track the lifetime of the references. I've walked down a similar path before and ended up with named lifetimes everywhere, which got more painful the more modules were involved.

You're basically forced into a pattern where you have to use a central dispatcher routine - which holds the main Object - and all references are created from within that dispatcher routine. If your modules now throw async futures or sometimes just closures into the mix, you quickly overwhelm the capabilities of the borrow checker.

It honestly seems over engineered to me. If the proposed lightweight objects don't carry around a reference to the "original" Object instance, then what we end up creating is, from the sounds of it, a version of std::rc::Weak that is specific to Map. All methods would now be carrying an additional error path for when the original object got dropped. And we'd have a bunch of traits floating around.

I personally found the idea interesting. The extra error path is certainly a bit annoying, but that's the price to pay if the borrow checker isn't capable of tracking the reference lifetimes by itself. As for the traits, I'm mostly thinking about a handle for Map and probably for Program, so it doesn't seem to convoluted for me at first glance.

anakryiko commented 1 year ago

We discussed this offline with @danielocfb. My main concern was around possibility to construct Map using above referenced libbpf_rs::Map::from_map_id(object.info()?.info.id)?. This would deviate from libbpf's way of doing things a lot, because libbpf's struct bpf_map cannot be constructed from map ID, as it contains extra information not necessarily "recoverable" from kernel. Additionally, both bpf_map/bpf_program have internal reference to bpf_object, which is another reason bpf_map/bpf_program can't be created in isolation from bpf_object.

For the original issue here, I agree that non-mutable reference to Object's Map should help.

danielocfb commented 1 year ago

As for the references, I'm concerned that I have to assure the correct lifetimes of the map references. I will have to check whether I can live with that. My main concern here is that if I want to fully modularize a program where each module is managing/using a map or two, it can get difficult to track the lifetime of the references. I've walked down a similar path before and ended up with named lifetimes everywhere, which got more painful the more modules were involved.

You're basically forced into a pattern where you have to use a central dispatcher routine - which holds the main Object - and all references are created from within that dispatcher routine. If your modules now throw async futures or sometimes just closures into the mix, you quickly overwhelm the capabilities of the borrow checker.

Yes, I agree, it can be a problem. I'll let you check first whether it's one for you, though.

It honestly seems over engineered to me. If the proposed lightweight objects don't carry around a reference to the "original" Object instance, then what we end up creating is, from the sounds of it, a version of std::rc::Weak that is specific to Map. All methods would now be carrying an additional error path for when the original object got dropped. And we'd have a bunch of traits floating around.

I personally found the idea interesting. The extra error path is certainly a bit annoying, but that's the price to pay if the borrow checker isn't capable of tracking the reference lifetimes by itself. As for the traits, I'm mostly thinking about a handle for Map and probably for Program, so it doesn't seem to convoluted for me at first glance.

I think there certainly could be some merit, but right now I think of such handles as a last resort. A cleaner solution to me, would be to hand out Weak references directly (and storing Rc<Map>s inside Object. But there the problem may end up being that we'd have to potentially differentiate between allowing mutable and immutable access: and in the mutable case what do we do...so that could be a forcing factor to go down the handle route.

danielocfb commented 1 year ago

It honestly seems over engineered to me. If the proposed lightweight objects don't carry around a reference to the "original" Object instance, then what we end up creating is, from the sounds of it, a version of std::rc::Weak that is specific to Map. All methods would now be carrying an additional error path for when the original object got dropped. And we'd have a bunch of traits floating around.

I personally found the idea interesting. The extra error path is certainly a bit annoying, but that's the price to pay if the borrow checker isn't capable of tracking the reference lifetimes by itself. As for the traits, I'm mostly thinking about a handle for Map and probably for Program, so it doesn't seem to convoluted for me at first glance.

I think there certainly could be some merit, but right now I think of such handles as a last resort. A cleaner solution to me, would be to hand out Weak references directly (and storing Rc<Map>s inside Object. But there the problem may end up being that we'd have to potentially differentiate between allowing mutable and immutable access: and in the mutable case what do we do...so that could be a forcing factor to go down the handle route.

Okay, so I just discussed this option again with @anakryiko . And basically the semantics would be different than what I expected them to be: the handle would hold a reference to the map and the corresponding object could be closed, while the handle would still keep working. So conceptually there would be no additional error path. Given that, I agree that may be preferable to handing out Weak objects. But again, let's see if references are an actual problem here before we go down this road, as it will still add a fair amount of complexity in the form of yet another Map type and a trait (+ corresponding functionality for Programs, perhaps).

GermanCoding commented 1 year ago

I had a closer look again at this today and tried out a few things. Having all maps as immutable definitely helps, even though carrying named lifetimes around everywhere is all but pretty.

Definite problems appeared once futures were involved. There are two primary things that make this impossible:

Okay, so I just discussed this option again with @anakryiko . And basically the semantics would be different than what I expected them to be: the handle would hold a reference to the map and the corresponding object could be closed, while the handle would still keep working. So conceptually there would be no additional error path. Given that, I agree that may be preferable to handing out Weak objects.

I'm not sure I fully understand the proposal in that case. The handles would themselves be owned objects that point to a map (via FD?) and ensure that the map is kept alive, i.e. the map can outlive the object?

danielocfb commented 1 year ago
  • Currently, the NonNull<libbpf_sys::bpf_map> appears to not implement Send. This means that they cannot be send between threads, so using them across futures cannot work. I've read *mut bpf_object cannot be sent between threads safely #245 and figured that it is fine for Map to implement Send, so I went ahead and used a wrapper struct that implements Send for Map.

I don't know if that's the definite conclusion to draw form this discussion, but we should most certainly look into whether Map can be made Send. Feel free to open a PR if you are interested in working on that.

Okay, so I just discussed this option again with @anakryiko . And basically the semantics would be different than what I expected them to be: the handle would hold a reference to the map and the corresponding object could be closed, while the handle would still keep working. So conceptually there would be no additional error path. Given that, I agree that may be preferable to handing out Weak objects.

I'm not sure I fully understand the proposal in that case. The handles would themselves be owned objects that point to a map (via FD?) and ensure that the map is kept alive, i.e. the map can outlive the object?

As I understand it, the handle would mostly be a file descriptor (fd). But because the fd is dup'ed the kernel keeps the corresponding object (the map) alive until all references are released. That's the case even if the BPF object is released. So it's not about language properties and reference counting at that level, but more of a system guarantee if you will.

danielocfb commented 1 year ago
  • Even if Map implements Send, references may never escape their function body, so moving them into futures is not allowed

Okay, so perhaps we should consider a rework of the map APIs altogether. In the larger scheme of things, we already have some implementation details that are not particularly clean. Specifically, we capture both the bpf_map-present and bpf_map-not-present use case internally: https://github.com/libbpf/libbpf-rs/blob/208d02e05b1ea436454f270b67ee01f32a629ae8/libbpf-rs/src/map.rs#L213-L214

That was done in response to https://github.com/libbpf/libbpf-rs/issues/191.

The initially proposed MapSpec type in there almost sounds a bit like the "handle" we were discussing here (although it's not bound to an FD, actually).

I feel like we should capture all the use cases that we need to support and then we can come up with an API design. We should also figure out what bits of information we can provide on which creation path. E.g., @anakryiko mentioned that we would not necessarily have BTF information (among other details) when creating from, say, an ID. So that's relevant for the from_map_id constructor that was added by @mendess:

https://github.com/libbpf/libbpf-rs/blob/208d02e05b1ea436454f270b67ee01f32a629ae8/libbpf-rs/src/map.rs#L275-L276

as part of https://github.com/libbpf/libbpf-rs/commit/921de0303fb456f02ba8b570f15537ea902a97bb. We should decide whether from the libbpf-rs side of things, the lack of such details matters much and justifies a dedicated type, or whether it makes more sense to just expose them as optional (Option) fields/return values.

GermanCoding commented 1 year ago

I don't know if that's the definite conclusion to draw form this discussion, but we should most certainly look into whether Map can be made Send. Feel free to open a PR if you are interested in working on that.

I had a look into this, even though I'm not sure there's any benefit to it when you can't actually own a map: Moving a reference (between threads) requires both Send and Sync (according to the Rust book, a reference that is Send is equivalent to the type being Sync), and we most likely cannot guarantee Sync without very careful considerations. Implementing Send on its own appears to be most useful if the type is owned.

It also appears that bpf_map shares state with bpf_object within libbpf. This can be a problem for Send, since there must be no mutable shared data for a type to be safely Send. The shared state seems to be only used for checking whether an object is already loaded (during bpf_map__set_max_entries and bpf_map__set_autocreate), though it feels unwise to rely on current behavior to be guaranteed (unless @anakryiko confirms that this is guaranteed of course). For references the borrow checker would solve this, since an (im)mutable map reference implies that the corresponding object may not be mutated at the same time. This in turn is actually kinda interesting and makes me wonder whether it would be possible to implement Send on &Map.

Implementing Send for Object is probably more interesting initially, since that type can actually be owned. Let me know if I should open a separate issue for this.

Okay, so perhaps we should consider a rework of the map APIs altogether. In the larger scheme of things, we already have some implementation details that are not particularly clean. Specifically, we capture both the bpf_map-present and bpf_map-not-present use case internally

Looking at the libbpf internals, for threading/futures it would probably be a lot easier if we did not have a bpf_map, since that's what causes the Send + Sync troubles. This in turn makes me like the idea of having just a FD, as that immediately eliminates that concern. Of course not all operations can be done with just a FD, thus coming back to the idea of the "limited" handle already proposed by @anakryiko. The bpf_map not present scenario could be designed so that you just have the limited handle (as we only have some FD to begin with, right?`) and no "full" map bound to an object.

mendess commented 1 year ago

Looking at the libbpf internals, for threading/futures it would probably be a lot easier if we did not have a bpf_map, since that's what causes the Send + Sync troubles.

@GermanCoding this is not strictly true, there is nothing about inherently not thread safe abour raw pointers, they are marked as !Send + !Sync more as a lint. So removing the raw pointer and using just the fd is equivalent in terms of safety to doing unsafe impl Send for Map {}. The rule for whether you can mark a type as Send and Sync is if every method on that type obeys the rules for Send and Sync, i.e. if every bit of the types implementation details are thread safe. Using a RawFd or a raw pointer makes little difference in that regard.

On that note though, I've actually done the work of looking at every C (libbpf) method that is called inside every method in Map and they all seem thread safe, the one thing that is possibly lacking would be confirmation from the C library that they do guarantee that thread safety in their docs.

mendess commented 1 year ago

@GermanCoding On the topic of Map not being Send and Sync, it is cumbersome but you can work around it with tokio::task::LocalSet or futures::stream::FutureUnordered for a dynamic amount of futures and tokio::select for a static amount of futures.

GermanCoding commented 1 year ago

this is not strictly true, there is nothing about inherently not thread safe abour raw pointers

I know, but The Rustonomicon states:

Something can safely be Send unless it shares mutable state with something else without enforcing exclusive access to it.

In this case, the ptr references a bpf_map struct which in turn shares state with bpf_object (which is pointed to by my Object). This makes this initially not safe for Send: Imagine I had an owned Map instance somehow, that's marked Send but not Sync. I could now move my Map onto a different thread than Object and then mutate state on Object and Map at the same time. The Rust compiler assumes that those two objects do not share state, but they actually do - internally, libbpf cross-references these objects. Now I have a potential data race (potential, because in practice I couldn't find an obvious way to exploit this - the amount of shared state is fairly small).

There's a special case here, because we cannot actually own Maps, and references are protected anyway. But that's another topic actually, because unsafe impl Send for Map would have to upheld not only for references. unsafe impl Send for &Map is another story.

So removing the raw pointer and using just the fd is equivalent in terms of safety to doing unsafe impl Send for Map {}

No, because by removing the bpf_map pointer I also remove the shared state to bpf_object, so there's nothing I could race with (except in-kernel stuff, but that's not for userspace to consider).

mendess commented 1 year ago

You are correct, but there is something else that can stop Map from being Send and Sync. It must also:

The fact that it shares state with bpf_object doesn't actually matter, provided that mutability is respected, if &mut self is required for any method that could potentially mutate and map and/or object, then it's fine to not think about that detail when determining whether it should be Send or Sync. Iirc no method in map mutates the associated object.

GermanCoding commented 1 year ago

So I found some time to have a look at this again today.

Okay, so perhaps we should consider a rework of the map APIs altogether. In the larger scheme of things, we already have some implementation details that are not particularly clean. Specifically, we capture both the bpf_map-present and bpf_map-not-present use case internally:

I also noticed this. The current API internally distinguishes "map created from userspace - no bpf_map pointer" and "map created by libbpf - have bpf_map pointer". This creates some weird codepaths that are absolutely not obvious from a user perspective.

For example, it appears that I can pin Maps created from userspace. The code in question just seems to delegate to the kernel (via bpf_obj_pin):

pub fn pin(...) {
[...]
        let ret = match self.ptr {
            Some(ptr) => unsafe { libbpf_sys::bpf_map__pin(ptr.as_ptr(), path_ptr) },
            None => unsafe { libbpf_sys::bpf_obj_pin(self.fd.as_raw_fd(), path_ptr) },
        };
[...]
    }

However, if the user then proceeds with calling is_pinned, it starts getting weird:

pub fn is_pinned(..) {
        match self.ptr {
            Some(ptr) => Ok(unsafe { libbpf_sys::bpf_map__is_pinned(ptr.as_ptr()) }),
            None => Err(Error::InvalidInput(("No map pointer found").to_string())),
        }
    }

Same for get_pin_path():

pub fn get_pin_path(...) {
        let path_ptr = match self.ptr {
            Some(ptr) => unsafe { libbpf_sys::bpf_map__pin_path(ptr.as_ptr()) },
            None => return None,
        };
       [...]

So I can pin a userspace-created map, but then trying to query information about it won't work. The error handling also feels a bit inconsistent, sometimes returning Results and sometimes Options.

This is obviously an effect that comes from the fact that we don't have a bpf_map, so we can't properly do these operations. However, this isn't really exposed from the outside: When I create a map from userspace, externally it looks like I have a "normal" Map but for internal reasons some things work while others don't.

For me, this sounds like this difference is severe enough to warrant distinct types for libbpf-created and "other" maps. This way we can explicitly signal what things are supported for the given Map and what are not.

My own use case also comes into play here: The "other" map type (those without a bpf_map) all seem to work with functionality provided by libbpf/src/bpf.c. Most of the functions here (reading/writing/deleting map data, attaching programs, pinning maps etc) look like fairly tight wrappers around syscalls. This means that they typically don't touch any global or shared state and as such all look nicely threadsafe. In addition, they all operate on FDs (which can be dup'ed), so a type based around these functions could (technically) be both Cloneable and Send + Sync. For me this would be absolutely ideal.

danielocfb commented 1 year ago

Indeed, that is not particularly great behavior. Are you interested in prototyping "handle" suggestion, @GermanCoding? I feel like having something concrete to talk about and iterate on may be best to move things forward.

GermanCoding commented 1 year ago

Yes, I would love to do that. I will need a few days time though, as I can only work on this part time. I would then open a PR and we continue discussing there?

danielocfb commented 1 year ago

Yes, I would love to do that. I will need a few days time though, as I can only work on this part time. I would then open a PR and we continue discussing there?

Yes, that would be great. No rush!

danielocfb commented 1 year ago

This issue should now have been addressed with https://github.com/libbpf/libbpf-rs/pull/473 merged. Closing.