eunomia-bpf / bpftime

Userspace eBPF runtime for Observability, Network & General Extensions Framework
https://eunomia.dev/bpftime/
MIT License

[BUG] Ringbuf bpf helper API does not work in simple examples #195

Closed agentzh closed 7 months ago

agentzh commented 8 months ago

Strangely, in many simple cases, bpf_ringbuf_reserve() returns NULL. I set the SPDLOG_LEVEL=Debug environment variable but did not see any meaningful log messages.

And even when the bpf_ringbuf_reserve() calls succeed, the tracer program's epoll loop fails to receive any new events or messages from the ring buffer. Not a single message.

I'm using the LLVM JIT mode of bpftime.

agentzh commented 8 months ago

OK, it seems this is because bpf_ringbuf_reserve() returns NULL when the buffer runs out of space. The default behavior does not look like a "ring buffer" at all. Shouldn't it automatically rotate and discard the oldest unread messages?

But the more severe problem here is that the tracer/server side can never read anything out of the ring buffer even though the poll loop is busy polling.

agentzh commented 8 months ago

The ring buffer is 8*4096 bytes in total, while a single element is 4097 bytes. So after reserving and submitting 7 elements, bpf_ringbuf_reserve() starts to return NULL. And the tracer/server never receives any messages at all, even though the 7 elements were inserted 200ms apart.
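The count of 7 can be checked with a little arithmetic. Below is a minimal sketch in C, assuming each record carries an 8-byte header and is rounded up to a multiple of 8 bytes, as the Linux kernel ring buffer does; bpftime's actual layout may differ. The names record_footprint and records_until_full are hypothetical helpers, not part of any bpftime or libbpf API, and the model ignores boundary conditions such as a record that would fill the buffer exactly.

```c
#include <stdint.h>

/* Assumption: 8-byte per-record header, mirroring the Linux kernel's
 * BPF ring buffer. bpftime's userspace layout may not match this. */
#define RECORD_HDR_SZ 8u

/* Bytes one record occupies in the buffer: header plus payload,
 * rounded up to the next multiple of 8. */
static inline uint64_t record_footprint(uint64_t payload_len)
{
    return (payload_len + RECORD_HDR_SZ + 7u) & ~(uint64_t)7u;
}

/* Simplified model: how many records fit before a reserve fails,
 * if the consumer never frees any space. */
static inline uint64_t records_until_full(uint64_t ring_size,
                                          uint64_t payload_len)
{
    return ring_size / record_footprint(payload_len);
}
```

Under these assumptions, record_footprint(4097) is 4112 bytes, and records_until_full(8 * 4096, 4097) is 7, since 7 * 4112 = 28784 fits in 32768 but 8 * 4112 = 32896 does not. For reference, the kernel's bpf_ringbuf_reserve() also returns NULL when there is not enough free space instead of overwriting unread records, so the space only comes back once the consumer reads them.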

agentzh commented 8 months ago

With a smaller element size, such as 128 bytes, I no longer see NULL returned, but the tracer/server still never picks up any elements from its ring_buffer__poll() calls.

agentzh commented 8 months ago

OK, it seems this is because bpftime's epoll emulation fails to implement the -1 timeout argument correctly. With a timeout of -1, it just returns immediately without checking any events. This is why the ringbuf is never consumed by the server or tracer.
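For comparison, epoll_wait() documents a timeout of -1 as "block indefinitely" and a timeout of 0 as "return immediately". The following is a rough sketch in C of how an emulation layer could honor those cases. It is not bpftime's code: wait_for_events, ready_fn, and countdown_ready are hypothetical names invented for illustration, and the 1 ms sleep between checks is an arbitrary choice.

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

/* Hypothetical readiness callback: returns the number of ready events. */
typedef int (*ready_fn)(void *ctx);

static long long now_ms(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (long long)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

/* Sketch of epoll_wait-style timeout handling:
 *   timeout_ms <  0 : block until at least one event is ready
 *   timeout_ms == 0 : check once and return immediately
 *   timeout_ms >  0 : wait at most about timeout_ms milliseconds */
static int wait_for_events(ready_fn ready, void *ctx, int timeout_ms)
{
    long long deadline = timeout_ms > 0 ? now_ms() + timeout_ms : 0;

    for (;;) {
        /* Always check for events at least once, whatever the timeout. */
        int n = ready(ctx);
        if (n > 0 || timeout_ms == 0)
            return n;
        if (timeout_ms > 0 && now_ms() >= deadline)
            return 0;

        /* Sleep briefly rather than spinning (1 ms is arbitrary). */
        struct timespec nap = { 0, 1000000 };
        nanosleep(&nap, NULL);
    }
}

/* Illustrative stand-in for "the ring buffer has data": reports no
 * events for the first *ctx calls, then reports one event. */
static int countdown_ready(void *ctx)
{
    int *remaining = ctx;
    if (*remaining > 0) {
        (*remaining)--;
        return 0;
    }
    return 1;
}
```

With a counter starting at 3, wait_for_events(countdown_ready, &counter, -1) keeps checking until the callback reports an event and then returns 1, while the same call with a timeout of 0 checks once and returns 0. The key point is that the negative-timeout path must still check for events before returning.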

yunwei37 commented 8 months ago

It's fixed in #201

Maybe we should use the kernel bpf test suite to avoid future problems.