eunomia-bpf / bpftime

Userspace eBPF runtime for fast Uprobe & Syscall hook & Extensions with LLVM JIT
https://eunomia.dev/bpftime/
MIT License

runtime: Fix runtime epoll_wait error when timeout is set to -1 #201

Closed. Zheaoli closed this 6 months ago

Zheaoli commented 6 months ago

Description

Fixes #186

Zheaoli commented 6 months ago

https://elixir.bootlin.com/linux/v6.5-rc2/source/fs/eventpoll.c#L1843

As the Linux kernel already does, we need to check the timeout value in the epoll_wait emulation.
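To illustrate the kind of check being discussed, here is a minimal sketch of a polling loop that honours epoll_wait-style timeouts: a negative timeout (such as -1) means "block until ready", 0 means "check once and return", and a positive value is a deadline in milliseconds. The names deadline_for and wait_until_ready are hypothetical and are not taken from the bpftime source; the 100 microsecond sleep is an arbitrary choice for the sketch.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <optional>
#include <thread>

using Clock = std::chrono::steady_clock;

// Hypothetical helper: convert an epoll_wait-style timeout in milliseconds
// into an optional deadline. std::nullopt means "no deadline, wait forever".
std::optional<Clock::time_point> deadline_for(int timeout_ms,
                                              Clock::time_point now) {
    if (timeout_ms < 0) {
        return std::nullopt;  // e.g. -1: block indefinitely
    }
    return now + std::chrono::milliseconds(timeout_ms);
}

// Hypothetical polling loop: returns true once ready() reports data,
// or false when the deadline (if any) has passed.
bool wait_until_ready(const std::function<bool()>& ready, int timeout_ms) {
    const auto deadline = deadline_for(timeout_ms, Clock::now());
    for (;;) {
        if (ready()) {
            return true;
        }
        // Without this has-deadline check, a timeout of -1 would be
        // treated as an already-expired deadline.
        if (deadline && Clock::now() >= *deadline) {
            return false;
        }
        std::this_thread::sleep_for(std::chrono::microseconds(100));
    }
}
```

With this shape, wait_until_ready(pred, -1) only returns once pred() is true, while wait_until_ready(pred, 0) evaluates pred() once and returns immediately.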

BTW, I'd prefer to replace all the timers that use a polling loop with select-style, event-based waiting. WDYT, @yunwei37?

yunwei37 commented 6 months ago

Thank you very much for your help!

BTW, I prefer to replace all the timer from loop check to select event based, WDYT

However, one problem is that, for better performance, we should avoid executing any syscalls in the helpers of the eBPF runtime. Executing syscalls in the helpers would greatly slow down the runtime.

The ring buffer helpers work similarly to io_uring, so maybe io_uring could help. But if we rely on a feature that is only supported by newer versions of Linux, bpftime's portability advantage would be lost.

yunwei37 commented 6 months ago

I think it's a bad idea to mock the epoll syscalls in bpftime the way we do now (it may be hard to get right), but I haven't found a better solution... maybe you can help us?

The goal of the solution is to mock the eBPF ring buffer completely in userspace.

The epoll in the ring buffer is used to notify eBPF applications (e.g. bpftrace) that there are elements available in the ring buffer.

Maybe we can register some socket fds with epoll (making libbpf assume they are ring buffer map fds) and use them to notify the eBPF applications?

The notification may only happen when the ring buffer is full, so not every bpf helper call would incur a syscall.
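The fd-based notification idea above can be sketched roughly as follows. This sketch swaps the suggested socket fd for a Linux eventfd, purely because it is shorter to set up; the Notifier struct and the notifier_* functions are hypothetical names, and nothing here reflects how libbpf or bpftime actually wire up ring buffer map fds.

```cpp
#include <cassert>
#include <cstdint>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <unistd.h>

// Hypothetical stand-in for a ring buffer map fd: an eventfd that the
// consumer registers with its own epoll instance.
struct Notifier {
    int event_fd = -1;
    int epoll_fd = -1;
};

bool notifier_open(Notifier& n) {
    n.event_fd = eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);
    n.epoll_fd = epoll_create1(EPOLL_CLOEXEC);
    if (n.event_fd < 0 || n.epoll_fd < 0) {
        return false;
    }
    epoll_event ev{};
    ev.events = EPOLLIN;
    ev.data.fd = n.event_fd;
    return epoll_ctl(n.epoll_fd, EPOLL_CTL_ADD, n.event_fd, &ev) == 0;
}

// Producer side: a single write(2) makes the fd readable. In the scheme
// discussed above this would be called only occasionally (for example when
// the buffer fills up), not on every helper call.
bool notifier_signal(const Notifier& n) {
    const std::uint64_t one = 1;
    return write(n.event_fd, &one, sizeof one) ==
           static_cast<ssize_t>(sizeof one);
}

// Consumer side: a real epoll_wait, so -1 blocks and 0 polls. Returns the
// number of ready fds and drains the eventfd counter when it fired.
int notifier_wait(const Notifier& n, int timeout_ms) {
    epoll_event ev{};
    const int ready = epoll_wait(n.epoll_fd, &ev, 1, timeout_ms);
    if (ready == 1) {
        std::uint64_t count = 0;
        if (read(n.event_fd, &count, sizeof count) < 0) {
            // Nothing to drain; ignored in this sketch.
        }
    }
    return ready;
}

void notifier_close(Notifier& n) {
    if (n.event_fd >= 0) close(n.event_fd);
    if (n.epoll_fd >= 0) close(n.epoll_fd);
    n.event_fd = n.epoll_fd = -1;
}
```

Because the consumer calls the real epoll_wait on a real fd, timeout handling (including -1) is left to the kernel instead of being emulated.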

yunwei37 commented 6 months ago

Or maybe we can simply set a flag in the shared memory, before the eBPF application enters the select syscall, to tell the eBPF runtime that the ring buffer is empty. The eBPF runtime in another process can then fill the selected fd.
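One possible reading of the shared-memory flag idea, as a sketch only: the consumer raises a "waiting" flag before it blocks, and the producer pays for a wakeup syscall only when it sees that flag. RingControl and the three functions are hypothetical; the sketch assumes the struct would be placed in the shared memory segment and that these atomics are lock-free on the target platform, and it models the ring buffer as a simple pending counter.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical control block that would sit in shared memory next to the
// ring buffer data.
struct RingControl {
    std::atomic<bool> consumer_waiting{false};
    std::atomic<unsigned> pending{0};  // records published but not consumed
};

// Consumer: call right before blocking in select/epoll_wait.
// Returns true if it should go to sleep, false if data is already there.
bool consumer_prepare_wait(RingControl& c) {
    c.consumer_waiting.store(true, std::memory_order_seq_cst);
    // Re-check after raising the flag so a record published in between
    // is not missed.
    if (c.pending.load(std::memory_order_seq_cst) != 0) {
        c.consumer_waiting.store(false, std::memory_order_seq_cst);
        return false;
    }
    return true;
}

// Producer: publish one record. Returns true only when the consumer had
// announced it is waiting, i.e. when a wakeup (a syscall) is needed.
bool producer_publish(RingControl& c) {
    c.pending.fetch_add(1, std::memory_order_seq_cst);
    return c.consumer_waiting.exchange(false, std::memory_order_seq_cst);
}

// Consumer: take everything that is pending; returns the record count.
unsigned consumer_drain(RingControl& c) {
    return c.pending.exchange(0, std::memory_order_seq_cst);
}
```

In this arrangement the common path in the helper is two atomic operations and no syscall; the occasional wakeup could be delivered through whatever fd the application is blocked on. A spurious wakeup is possible when the two sides race, which is harmless for a consumer that re-checks the buffer after waking.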