Open itaysk opened 3 years ago
from #80
It is well known that cgo performs poorly when calling C code, and even worse when calling Go callbacks from C (see, for example, https://about.sourcegraph.com/go/gophercon-2018-adventures-in-cgo-performance/). This is not a problem for most libbpfgo use cases, where we just need to load a program and attach it, or update a map, as these operations are not that frequent. It may become a problem, however, when we need to poll for events coming from the kernel through one of the perf/ring buffers, as these are much more frequent. Suggestion: for buffer polling (either perf or ring buffers), let's implement the logic in pure Go. These functions can be added as an alternative API alongside the existing functions, and will offer high performance where needed. Specifically, these are the libbpf functions that we will need to implement: C.perf_buffer__poll() and C.ring_buffer__poll() (both are called by the PerfBuffer/RingBuffer poll() function)
After performing some local tests, it seems that cgo is not the bottleneck of tracee, but the printer. Below is a pprof output where I used a very noisy event (sched_switch) and the gob printer. The conclusion is the same for other workloads (e.g. using default event set on an idle system) and other printers (table/json) - the printer path is always worse than that of cgo.
With table printer:
What does this mean for this issue? Isn't it still something we should do?
I still need to find a way to compare my prototype of a pure Go implementation with the current cgo implementation. With pprof I can only see the bottleneck, but can't quantitatively compare the two implementations. Any suggestions for how to do that?
The performance of c to go calls has improved in recent go versions: https://github.com/golang/go/issues/42469#issuecomment-747741061
If there is no strong evidence that this is still an issue for libbpfgo, we can probably close this one for now
It's also important to note that since we also pass pointers around in our cgo code, that can be much more safely and efficiently done using Cgo Handles as of go 1.17. https://pkg.go.dev/runtime/cgo#Handle
Hey @yanivagman, I saw issue #80 and wondered what your plan was there. By implementing the polling in pure Go, do you mean just calling epoll_wait from Go instead of through cgo? Or did you mean reimplementing more of what perf_buffer_poll does? In my project I'm dealing with performance issues that are caused mainly by the cgo implementation of perf_buffer_poll, and I'm trying to find a solution.
As I understand it, unless we can somehow trigger the perf callback in pure Go, there is still going to be massive CPU consumption, as C-to-Go is the most expensive kind of call in cgo.
Hi @guyarb, To implement the polling in pure go, the following changes are required:
A reference implementation is Cilium's ebpf library (that is written in pure go). When playing with this code and trying to measure differences using pprof, I didn't see major improvements, as described above. It might be that pprof is not the right tool for this task.
In your project, how did you find that the performance issues were caused mainly by the cgo perf_buffer_poll call?
To improve performance, we can bypass libbpf and cgo in the critical path (Libbpf callback)