Open heinrich5991 opened 2 years ago
Is it intentional that structs like `WlistSkel` don't implement `Send` and `Sync`? Because they contain raw pointers, they're automatically marked as non-`Send`able and non-`Sync`hronized, but is it actually the case that one cannot use the data structures from multiple threads?

I'm trying to build an HTTP API around libbpf, and `warp` wants to use multiple threads to handle HTTP requests, thus the types would need to be shareable between threads. Otherwise, I'd have to spawn a separate thread that serializes and handles all the communication with libbpf, which would be less efficient. Errors look like this:
I think it's send-able (able to be used by 2 threads at different times) but not sync-able (able to be used by 2 threads at the same time). @anakryiko does this sound accurate to you?
What about the `Map` struct directly? Could it be possible to have that struct marked as `Send`able & `Sync`able?
I don't think accessing maps from two different threads at the same time is bad at all (correct me if I'm wrong). I guess you will just get a different result if, for example, an entry doesn't exist anymore.
I think it would be helpful to have some kind of overview of what synchronization assumptions the libbpf C types make, but I couldn't find much information about that in the libbpf documentation: https://libbpf.readthedocs.io/en/latest/search.html?q=thread-safe&check_keywords=yes&area=default
If we assume that everything that is not thread safe is marked as such (and everything with no mention of thread safety is safe to be used), we could at least add `Sync` bounds. `Send` would likely depend on whether a type uses thread-local storage in one shape or another, or similar.
@anakryiko, do you know if the above assumption is correct? Also, and I am happy to bring this up more publicly in libbpf itself, perhaps it would be helpful to highlight these properties in terms of "POSIX Safety Concepts" more prominently, as seems to be a common thing for many POSIX functions (and beyond?)?
Okay, I wanted to change this too, so I did some digging. Turns out the answer is: it's complicated.
The goal would be to prove this is okay:
unsafe impl Send for Object {}
unsafe impl Sync for Object {}
For `Object` to be `Send`/`Sync`, all its members have to be `Send`/`Sync`. So I went field by field.
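For reference, the shape of the type as examined below looks roughly like this (a sketch based on the fields discussed in this comment; the exact definition in libbpf-rs may differ):

```rust
use std::collections::HashMap;
use std::ptr::NonNull;

// Sketch of the fields walked through below; `Map` and `Program` are the
// libbpf-rs wrapper types discussed in the following sections.
pub struct Object {
    ptr: NonNull<libbpf_sys::bpf_object>,
    maps: HashMap<String, Map>,
    progs: HashMap<String, Program>,
}
```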
ptr: NonNull<libbpf_sys::bpf_object>
This field is only used in the drop implementation; reading the code, I could figure out that it doesn't touch global state and seems pretty safe.
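For context, the drop path essentially boils down to a single FFI call (a sketch, not necessarily the exact libbpf-rs code):

```rust
// Sketch of the drop path described above: the pointer is only used to close
// the underlying bpf_object. Per the reading above, this path does not appear
// to touch libbpf-global state (modulo the print callback discussed further down).
impl Drop for Object {
    fn drop(&mut self) {
        unsafe {
            libbpf_sys::bpf_object__close(self.ptr.as_ptr());
        }
    }
}
```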
maps: HashMap<String, Map>
For this field we need to see if `Map` can be marked as `Send`/`Sync`. Luckily `Map` only has one field, `ptr: NonNull<bpf_map>`, so we just have to check that.
After reading through a lot of libbpf source code, I think this one is also pretty okay.
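Concretely, that would mean the following is sound under the same assumptions (a sketch mirroring the `Object` impls above; it assumes `Map` really only wraps the raw map pointer, as described):

```rust
use std::ptr::NonNull;

// Sketch: `Map` as described above only wraps the raw bpf_map pointer.
pub struct Map {
    ptr: NonNull<libbpf_sys::bpf_map>,
}

// Under the assumption that the reviewed bpf_map accessors hide no global or
// thread-local state, these would be the corresponding markers.
unsafe impl Send for Map {}
unsafe impl Sync for Map {}
```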
progs: HashMap<String, Program>
Now here is where it comes crumbling down: almost every attach function is either too hard to follow or touches or depends on global state somewhere, the worst problem being this one, where an `int` with static lifetime is mutated without synchronization. So all of `Program` is actually thread safe except some of the attach functions. I haven't read all of them, though.
One final, overarching point that I omitted, though it's quite important: I ignored the side effects of `libbpf_rs::set_print`. This function should be marked `unsafe`, as it mutates a global variable that almost all of the libbpf API reads. All of the assumptions that `libbpf_sys::bpf_object` and `Map` are thread safe hinge on the fact that `libbpf_rs::set_print` is never called concurrently with other methods of these types.
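To make the hazard concrete, the pattern behind `set_print` is morally a process-global callback that every libbpf log statement reads. Rendered in Rust purely for illustration (the names here are made up; the real global lives on the C side of libbpf):

```rust
// Illustrative only: a process-global print callback that `set_print`
// overwrites and that (almost) every other API call reads when it logs.
static mut PRINT_FN: Option<fn(&str)> = None;

fn set_print(new_fn: Option<fn(&str)>) -> Option<fn(&str)> {
    // Plain, non-atomic write to a global...
    unsafe {
        let old = PRINT_FN;
        PRINT_FN = new_fn;
        old
    }
}

fn pr_warn(msg: &str) {
    // ...racing with plain, non-atomic reads from any thread that happens to
    // be inside some other libbpf call. This is why calling `set_print`
    // concurrently with other methods is the problematic case.
    if let Some(print) = unsafe { PRINT_FN } {
        print(msg);
    }
}
```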
I spoke with anakryiko (the `libbpf` maintainer). The basic stance seems to be: `libbpf` is not thread safe. Certain APIs likely are, if, say, they are just thin wrappers around system calls (the kernel always has to do the right thing in the presence of multiple threads). That being said, we haven't identified any blockers for sending objects between threads. Certainly there was no mention of thread-local data being used. So in a nutshell: most if not all entities should be `Send`. No comment on `Sync`.

I would focus on making things `Send` first, as I would think that it's likely the bigger blocker for users. Every violation of `Send` on the `libbpf` side then is presumably a bug and should be fixed. Regarding `Sync`: users should be able to build synchronization on top and then share, if that truly is necessary.
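That last point could look something like this in practice (a sketch; it assumes `Object`, or whichever entity ends up being shared, is `Send`, with `Mutex` providing the `Sync` part):

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Sketch of "build synchronization on top and then share": once `Object` is
// `Send`, wrapping it in `Arc<Mutex<_>>` is enough to hand it to other
// threads, because `Mutex<T>: Sync` only requires `T: Send`.
fn share_across_threads(obj: libbpf_rs::Object) {
    let shared = Arc::new(Mutex::new(obj));

    let worker = {
        let shared = Arc::clone(&shared);
        thread::spawn(move || {
            let _obj = shared.lock().unwrap();
            // ... use the object here, e.g. look up a map or program ...
        })
    };

    worker.join().unwrap();
}
```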
> progs: HashMap<String, Program>
>
> Now here is where it comes crumbling down: almost every attach function is either too hard to follow or touches or depends on global state somewhere, the worst problem being this one, where an `int` with static lifetime is mutated without synchronization. So all of `Program` is actually thread safe except some of the attach functions. I haven't read all of them, though.
>
> One final, overarching point that I omitted, though it's quite important: I ignored the side effects of `libbpf_rs::set_print`. This function should be marked `unsafe`, as it mutates a global variable that almost all of the libbpf API reads. All of the assumptions that `libbpf_sys::bpf_object` and `Map` are thread safe hinge on the fact that `libbpf_rs::set_print` is never called concurrently with other methods of these types.
So Andrii's stance seems to be that aligned pointer reads and writes are always atomic and so there is no issue. Personally, I think it should be an explicit atomic read and write, but I guess whatever. That reasoning would then presumably also apply to `use_debugfs()`. Nevertheless, if functionality is not explicitly green-lit as being thread safe, I'd be reluctant to expose it as `Sync` just because it currently is. That just invites subtle issues down the line.
From what I could gather, there are two meanings of "atomic", which creates ambiguity and confusion. Aligned pointer reads and writes are atomic in the sense that no thread will ever observe a "torn" value (half the bit pattern changed while the other half has not changed yet); the new value is written at once. It does not, however, mean that the caches of other processors will be invalidated such that they will see the new value any time soon, which effectively means that
static int i = -1;    /* cached result of a one-time check, plain non-atomic */

if (i < 0) {
    i = some_check() == 0;
}
return i == 1;
does not prevent `some_check` from being executed more than once, because there are no memory barriers here telling the CPU that it should flush this value to main memory and invalidate the caches of other processors such that they see the new value.
If writes to `int` were truly atomic, why would `__atomic_fetch_add` exist for them?
This blog post explained it pretty well
I agree, though, that trying to make it `Send` first might be a good idea. I can try that. One thing that bothers me is that a user of the library can easily circumvent everything by just using `ObjectBuilder` twice, which will yield two identical objects; it's as if `Object` had `Clone` implemented.
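For illustration (libbpf-rs API as of the time of this issue; exact names and signatures may differ between versions):

```rust
use libbpf_rs::ObjectBuilder;

// Sketch of the point above: loading the same .o twice yields two independent
// `Object`s in one process, much like a `Clone` impl would.
fn two_copies() -> Result<(), libbpf_rs::Error> {
    let first = ObjectBuilder::default().open_file("prog.bpf.o")?.load()?;
    let second = ObjectBuilder::default().open_file("prog.bpf.o")?.load()?;

    // Both wrap their own `bpf_object`, so any per-`Object` restriction is
    // easy to sidestep this way.
    drop((first, second));
    Ok(())
}
```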
> From what I could gather, there are two meanings of "atomic", which creates ambiguity and confusion.
Yeah, the two are often mixed together. However, I believe it is a property of the higher-level logic built on top whether the point in time at which another CPU sees a value as updated has any bearing on correctness. That is different from the correctness aspect of whether it can only ever see coherent values.
In our case, I argue, the former does not matter. Clients/users cannot have any expectation of a `set_print` on one CPU coming into effect at a specific point in time on another, but it will happen eventually. That is fine. (I did not check but would assume the same applies to `has_debugfs`; this certainly is a matter of case-by-case evaluation.)
> If writes to `int` were truly atomic, why would `__atomic_fetch_add` exist for them?
To

1. make it explicit and _guaranteed_ that reads & writes are atomic (as opposed to the hand-wavy "on most architectures that we can think of and support we believe that aligned reads & writes of machine sized words are atomic" that we currently employ in `libbpf`; and again, I am absolutely in favor of making those atomic operations explicit, but I am not fighting that fight) and
2. because memory orderings are often brushed off as black magic (or even something that developers should not care about). `atomic_fetch_add` just defaults to the strongest memory ordering (sequential consistency) because it just does the right thing (potentially at the cost of performance). That's why there is `atomic_fetch_add_explicit`, which allows you to specify the memory ordering explicitly. See https://en.cppreference.com/w/c/atomic/atomic_fetch_add (`__atomic_fetch_add` [the two-underscore version that you mention] is typically a compiler built-in and would require an explicit memory ordering, e.g., [0]).
So in our case, the writes we are concerned about really would map to `atomic_fetch_add_explicit` with relaxed memory ordering. And from what I recall, those calls precisely boil down to a mere aligned write, without additional barriers (I am too lazy to search for a confirmation now, but from memory that is how at least two proprietary operating systems I am familiar with implement them).
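Transposed to the snippet quoted earlier, an explicit-but-relaxed version would look roughly like this in Rust terms (a sketch; `some_check` is the hypothetical probe from that snippet):

```rust
use std::sync::atomic::{AtomicI32, Ordering};

// Explicit, relaxed version of the `static int` caching pattern quoted above.
// On mainstream architectures the relaxed load/store compile down to plain
// aligned loads/stores, but the atomicity is now guaranteed rather than assumed.
static CACHED: AtomicI32 = AtomicI32::new(-1);

fn cached_check() -> bool {
    let mut val = CACHED.load(Ordering::Relaxed);
    if val < 0 {
        val = (some_check() == 0) as i32;
        CACHED.store(val, Ordering::Relaxed);
    }
    // `some_check` may still run more than once if two threads race, which is
    // exactly the "eventual visibility is fine here" argument made above.
    val == 1
}

fn some_check() -> i32 {
    // stand-in for the real probe (hypothetical, mirrors the quoted C snippet)
    0
}
```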
> This blog post explained it pretty well
To the degree I can tell, Go hides memory orderings from developers (see above). Furthermore, the package provides an abstraction that is guaranteed to ensure atomicity for various types, including those not being of the machine's native word size (such as `Bool` [depending on what it maps to in the language]).
> 1. make it explicit and _guaranteed_ that reads & writes are atomic (as opposed to the hand-wavy "on most architectures that we can think of and support we believe that aligned reads & writes of machine sized words are atomic" that we currently employ in `libbpf`; and again, I am absolutely in favor of making those atomic operations explicit, but I am not fighting that fight) and
There is no need to fight a fight; someone just has to send a patch switching to an atomic exchange. No one has complained about this yet, so no one did the work.
As for the issue at hand: I agree with @danielocfb, these APIs are not unsafe, and users shouldn't be forced to use explicit `unsafe` blocks.