axboe / liburing

Library providing helpers for the Linux kernel io_uring support
MIT License

Is there a way to synchronously wait for a completion of a submitted write for a particular FD #1186

Closed vsolontsov-ll closed 1 month ago

vsolontsov-ll commented 1 month ago

I'm trying to migrate an app from synchronously writing (dumping) received data in a separate thread to asynchronous writing via io_uring/liburing.

So basically there's a "reactor" thread looping on the io_uring. Among other things, it collects and buffers some data in a ring buffer. The buffered data is dumped to a file from another thread (triggered by size or time). The ring buffer is bounded, so once there's not enough space in it, the reactor thread can simply block, e.g. on a cond-var (which the dumping thread signals).
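Roughly, the current blocking behavior looks like the sketch below. It is simplified and only tracks byte counts; the names `bounded_buf`, `buf_reserve` and `buf_release` are made up for illustration and are not from the real app.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical bounded buffer: only byte counts are tracked here,
 * the actual data storage is omitted to keep the sketch short. */
struct bounded_buf {
    pthread_mutex_t lock;
    pthread_cond_t  space_freed;   /* signalled by the dumping thread */
    size_t capacity;
    size_t used;
};

#define BOUNDED_BUF_INIT(cap) \
    { PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, (cap), 0 }

/* Reactor side: reserve n bytes, blocking while the buffer is too full. */
static void buf_reserve(struct bounded_buf *b, size_t n)
{
    assert(n <= b->capacity);      /* otherwise we would block forever */
    pthread_mutex_lock(&b->lock);
    while (b->capacity - b->used < n)
        pthread_cond_wait(&b->space_freed, &b->lock);
    b->used += n;
    pthread_mutex_unlock(&b->lock);
}

/* Dumping-thread side: after n bytes were written to the file, free them. */
static void buf_release(struct bounded_buf *b, size_t n)
{
    pthread_mutex_lock(&b->lock);
    assert(n <= b->used);
    b->used -= n;
    pthread_cond_signal(&b->space_freed);
    pthread_mutex_unlock(&b->lock);
}
```

With async writes there is no dumping thread left to call something like `buf_release()`, which is why the reactor would have to wait for the write completion itself.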

I would like to preserve this behavior -- block the reactor thread until the in-flight operation pushed into the io_uring is done. The app can't process other completion events as it has no space to store the results. Continuing to poll the io_uring and collecting CQEs somewhere aside until the desired one arrives doesn't look handy (there could be IORING_POLL_ADD_MULTI operations and the like, so it's unpredictable how many events may complete).

The only solution I can think of at the moment is adding a separate io_uring for such operations: add its fd to the main one with IORING_OP_POLL_ADD, share the work queue (IORING_SETUP_ATTACH_WQ) with the main one, and use io_uring_wait_cqe() on this additional ring when needed. But it looks a bit cumbersome.

Could you suggest something nicer?

isilence commented 1 month ago

Not supported, and I don't believe it can be sanely done. Adding a side channel for completions just isn't feasible; the use case is narrow and can be dealt with in user space.

If you intend to leave completions piling up anyway, which might not be a good idea, you can just continue waiting after inspecting all new CQEs without consuming them. Pseudocode, can be made nicer:

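seen = 0;                           /* CQEs already inspected but not consumed */
found = 0;
while (!found) {
    io_uring_wait(seen + 1);        /* block until at least one new CQE arrives */
    i = 0;
    for_each_cqe() {
        if (i++ < seen) continue;   /* skip CQEs inspected on earlier passes */
        if (inspect_cqe(cqe)) { found = 1; break; }
    }
    seen = i;
}

vsolontsov-ll commented 1 month ago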

Thanks a lot for your answer and for confirming. I had a faint hope that it might be possible as a side effect of some synchronization -- unregistering the FD or the write buffer, or closing the fd (#783) with in-flight operation(s).

I had in mind an approach similar to what you proposed, but with moving the picked CQEs to an internal unbounded queue (because the completion queue may overflow and we would never get our unblocking event). I didn't feel comfortable with it, but it's probably not that bad considering the IORING_FEAT_NODROP feature, which I assume does a similar thing on the other end.
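For what it's worth, the "internal unbounded queue" part could look something like the sketch below. It is plain C with no liburing calls; `saved_cqe`, `cqe_stash`, `stash_push` and `stash_pop` are hypothetical names, and the struct only mirrors the `user_data`, `res` and `flags` fields of `struct io_uring_cqe`.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical copy of the CQE fields worth keeping for later processing. */
struct saved_cqe {
    uint64_t user_data;
    int32_t  res;
    uint32_t flags;
    struct saved_cqe *next;
};

/* Unbounded FIFO of deferred completions (singly linked list). */
struct cqe_stash {
    struct saved_cqe *head;
    struct saved_cqe *tail;
    size_t len;
};

/* Append a copy of one completion; returns 0, or -1 if out of memory. */
static int stash_push(struct cqe_stash *s, uint64_t user_data,
                      int32_t res, uint32_t flags)
{
    struct saved_cqe *n = malloc(sizeof(*n));
    if (!n)
        return -1;
    n->user_data = user_data;
    n->res = res;
    n->flags = flags;
    n->next = NULL;
    if (s->tail)
        s->tail->next = n;
    else
        s->head = n;
    s->tail = n;
    s->len++;
    return 0;
}

/* Remove the oldest completion; returns 1 and fills *out, or 0 if empty. */
static int stash_pop(struct cqe_stash *s, struct saved_cqe *out)
{
    struct saved_cqe *n = s->head;
    if (!n)
        return 0;
    assert(s->len > 0);
    s->head = n->next;
    if (!s->head)
        s->tail = NULL;
    s->len--;
    *out = *n;
    out->next = NULL;
    free(n);
    return 1;
}
```

While blocked, the reactor would copy each CQE obtained from io_uring_wait_cqe() into the stash and mark it with io_uring_cqe_seen() until the user_data of the pending write shows up. It would then drain the stash with `stash_pop()` before going back to the normal loop.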

Anyway, many thanks for your help. Let's close this "issue" unless you have an idea how to [mis]use an existing syscall to achieve this effect.