sozu-proxy / sozu

Sōzu HTTP reverse proxy, configurable at runtime, fast and safe, built in Rust. It is awesome!
https://www.sozu.io/
GNU Affero General Public License v3.0

io_uring support #690

Open Geal opened 3 years ago

Geal commented 3 years ago

We need to plan for io_uring integration. It is a recent Linux kernel interface that lets a program submit a queue of operations for the kernel to execute. The kernel then executes those operations (if possible) and returns the results in a second queue. This works a bit like I/O completion ports on Windows. The design can give significant performance gains: for a series of syscalls, we no longer need to cross from userland to kernel repeatedly; we submit once and let the kernel perform all of the actions. Submissions to the queue can be linked, so that the remaining ones are cancelled if one of them fails.
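To make the two-queue idea and the link semantics concrete, here is a minimal toy model in plain Rust. It does not touch the real io_uring API: `ToyRing`, `Submission`, `Completion` and `submit_and_wait` are hypothetical names invented for this sketch, and each operation's outcome is supplied up front instead of coming from a real syscall.

```rust
use std::collections::VecDeque;

/// One queued operation. `outcome` simulates what the syscall would return.
#[derive(Debug, Clone, Copy)]
pub struct Submission {
    pub user_data: u64,            // identifier echoed back in the completion
    pub outcome: Result<u32, i32>, // Ok(return value) or Err(errno-like code)
    pub linked_to_next: bool,      // true: the next entry only runs if this one succeeds
}

#[derive(Debug, PartialEq, Eq)]
pub enum Status {
    Done(u32),
    Failed(i32),
    Cancelled,
}

#[derive(Debug, PartialEq, Eq)]
pub struct Completion {
    pub user_data: u64,
    pub status: Status,
}

/// Toy stand-in for the submission queue / completion queue pair.
#[derive(Default)]
pub struct ToyRing {
    submissions: VecDeque<Submission>,
    completions: VecDeque<Completion>,
}

impl ToyRing {
    /// Queue an operation; nothing is executed yet.
    pub fn push(&mut self, s: Submission) {
        self.submissions.push_back(s);
    }

    /// Stand-in for the single userland-to-kernel transition:
    /// every queued entry is processed in one call.
    /// Returns the number of entries processed.
    pub fn submit_and_wait(&mut self) -> usize {
        let mut chain_broken = false;
        let mut processed = 0;
        while let Some(s) = self.submissions.pop_front() {
            let status = if chain_broken {
                Status::Cancelled
            } else {
                match s.outcome {
                    Ok(n) => Status::Done(n),
                    Err(e) => Status::Failed(e),
                }
            };
            // The break only propagates while entries stay linked;
            // an unlinked entry ends the chain and the next one starts fresh.
            chain_broken = s.linked_to_next && !matches!(status, Status::Done(_));
            self.completions.push_back(Completion { user_data: s.user_data, status });
            processed += 1;
        }
        processed
    }

    /// Take the oldest completion. The toy keeps submission order;
    /// a real ring may complete unrelated operations in any order.
    pub fn pop_completion(&mut self) -> Option<Completion> {
        self.completions.pop_front()
    }
}
```

In this model, pushing a linked read, a linked write that fails, and a close yields `Done`, `Failed`, `Cancelled` in the completion queue, while an unrelated operation queued after them still runs.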

Example:

We can submit a lot of unrelated tasks, which need not be linked and can complete at various points in the future. There is a limit to the number of tasks though, and this will not replace the epoll-based event loop. When we submit a read or write task that uses a buffer, that buffer must not be touched from userland until the task completes.
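One way to enforce the buffer rule in Rust is to move the buffer into the in-flight operation and only hand it back on completion, so the compiler rejects any use in the meantime. The sketch below is a simulation under that assumption: `InFlightRead`, `submit_read` and `complete` are hypothetical names, and the "kernel" is just a slice copy.

```rust
/// A read that has been handed to the (simulated) kernel.
/// The buffer is private, so userland cannot reach it while the read is pending.
pub struct InFlightRead {
    buf: Vec<u8>,
}

/// Submitting takes the buffer by value: the caller gives up ownership.
pub fn submit_read(buf: Vec<u8>) -> InFlightRead {
    InFlightRead { buf }
}

impl InFlightRead {
    /// Simulated completion: copy `incoming` into the buffer (truncated to the
    /// buffer's length) and return ownership along with the byte count.
    pub fn complete(mut self, incoming: &[u8]) -> (Vec<u8>, usize) {
        let n = incoming.len().min(self.buf.len());
        self.buf[..n].copy_from_slice(&incoming[..n]);
        (self.buf, n)
    }
}
```

After `let op = submit_read(buf);`, any further use of `buf` is a compile error; the buffer only becomes usable again through the tuple returned by `op.complete(...)`.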

Here's how it could work in practice:

This will require more separation between the IO part and the protocol part of sessions (already started with the readable_parse() method for HTTP). When a session wants to do some IO, it either submits an operation through io_uring and moves to a "waiting for IO" state, or performs the IO directly with the basic APIs; in both cases, once the result arrives, it moves on to the next step, where we handle the data.
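A rough sketch of that split, assuming a session modelled as a small state machine. None of these names come from the sozu codebase: `Session`, `SessionState`, `IoBackend`, `want_read`, `on_completion` and `handle_data` are hypothetical, and `handle_data` only records a byte count where a real protocol step would parse.

```rust
use std::io::Read;

#[derive(Debug, PartialEq, Eq)]
pub enum SessionState {
    Idle,
    WaitingForIo { op_id: u64 },   // an operation was submitted, result pending
    Parsed { bytes_seen: usize },  // the protocol step has handled the data
}

/// Which path performs the IO (hypothetical switch for this sketch).
pub enum IoBackend {
    Direct, // call read() ourselves, as the current event loop would
    Ring,   // pretend to submit to a ring and wait for a completion
}

pub struct Session {
    pub state: SessionState,
    next_op_id: u64,
}

impl Session {
    pub fn new() -> Self {
        Session { state: SessionState::Idle, next_op_id: 0 }
    }

    /// Protocol part: looks at bytes only, performs no IO.
    fn handle_data(&mut self, data: &[u8]) {
        self.state = SessionState::Parsed { bytes_seen: data.len() };
    }

    /// IO part: either do the read now, or record that we are waiting.
    pub fn want_read(&mut self, backend: &IoBackend, source: &mut impl Read) -> std::io::Result<()> {
        match backend {
            IoBackend::Direct => {
                let mut buf = [0u8; 1024];
                let n = source.read(&mut buf)?;
                self.handle_data(&buf[..n]);
            }
            IoBackend::Ring => {
                // A real implementation would queue a read here.
                let op_id = self.next_op_id;
                self.next_op_id += 1;
                self.state = SessionState::WaitingForIo { op_id };
            }
        }
        Ok(())
    }

    /// Called when a completion arrives; returns false if it is not the one we wait for.
    pub fn on_completion(&mut self, op_id: u64, data: &[u8]) -> bool {
        if matches!(self.state, SessionState::WaitingForIo { op_id: id } if id == op_id) {
            self.handle_data(data);
            true
        } else {
            false
        }
    }
}
```

Both paths end in the same `handle_data` call, which is the point of the separation: the protocol code does not need to know whether the bytes came from a direct read or from a completion.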

Example usage:

ccarral commented 1 year ago

Hi, I am interested in contributing to this project and this issue caught my attention.

The way I see it from reading the issue, the polling for Interest::READABLE and Interest::WRITABLE events is still going to be handled by mio, so that leaves us with the following syscalls to be handled by uring:

But how would the linked read() -> write() know how many bytes were read, and therefore how many are to be written to the backend socket? Maybe I am missing something.