axboe / liburing

Library providing helpers for the Linux kernel io_uring support
MIT License

Question: behavior of `io_uring_submit_and_wait` when `IOPOLL` set #1107

Closed by easypickings 2 months ago

easypickings commented 4 months ago

Reading the source code, I found that `io_uring_submit_and_wait` calls `__io_uring_submit`, which ends up in the `io_uring_enter` syscall:

https://github.com/axboe/liburing/blob/f4e42a515cd78c8c9cac2be14222834be5f8df2b/src/queue.c#L368-L388

The man page for `io_uring_enter` says:

       If the io_uring instance was configured for polling, by specifying
       IORING_SETUP_IOPOLL in the call to io_uring_setup(2), then min_complete
       has a slightly different meaning. Passing a value of 0 instructs the
       kernel to return any events which are already complete, without
       blocking. If min_complete is a non-zero value, the kernel will still
       return immediately if any completion events are available. If no event
       completions are available, then the call will poll either until one or
       more completions become available, or until the process has exceeded
       its scheduler time slice.

So if I understand it right: when I create a ring with IOPOLL, set up N SQEs, and then call `io_uring_submit_and_wait(&ring, N)`, is it possible that the call returns with fewer than N CQEs available?

krisman commented 4 months ago

> So if I understand it right: when I create a ring with IOPOLL, set up N SQEs, and then call `io_uring_submit_and_wait(&ring, N)`, is it possible that the call returns with fewer than N CQEs available?

Looking at the kernel source, yes. It can return with fewer CQEs than requested, and you need to handle that case.

isilence commented 2 months ago

Just as @krisman said, yes, it can return fewer entries than asked for, and userspace should be able to handle it. This is also not IOPOLL-specific behaviour; it may well happen with normal rings too.
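
For readers wondering what "handle it" could look like, here is a minimal sketch of the retry pattern. It is not liburing code. The names `reap_fn`, `reap_exactly`, `fake_ring` and `fake_reap` are hypothetical, and the wait call is modelled as a callback so the loop can be shown without a real ring. The only assumption carried over from the thread is that a single wait may hand back fewer completions than requested.

```c
#include <errno.h>

/* Hypothetical callback: wait for up to `want` completions, consume them,
 * and return how many were consumed (possibly fewer than `want`), or a
 * negative errno value on failure. */
typedef int (*reap_fn)(void *ctx, unsigned want);

/* Keep waiting until `total` completions have been consumed.
 * A short return from `reap` is treated as normal, not as an error. */
static int reap_exactly(reap_fn reap, void *ctx, unsigned total)
{
    unsigned done = 0;

    while (done < total) {
        int n = reap(ctx, total - done);

        if (n == -EINTR)
            continue;           /* interrupted by a signal: retry */
        if (n < 0)
            return n;           /* real error: report it to the caller */
        done += (unsigned)n;    /* short count: loop and wait again */
    }
    return (int)done;
}

/* Stand-in for a ring, for illustration only: it never delivers more than
 * two completions per call, which forces the loop above to retry. */
struct fake_ring {
    unsigned pending;   /* completions not yet delivered */
    unsigned calls;     /* how many times fake_reap() was invoked */
};

static int fake_reap(void *ctx, unsigned want)
{
    struct fake_ring *r = ctx;
    unsigned n = want < 2 ? want : 2;

    if (n > r->pending)
        n = r->pending;
    r->pending -= n;
    r->calls++;
    return (int)n;
}
```

With a real ring, the callback would presumably call `io_uring_submit_and_wait(&ring, want)` and then count and consume whatever CQEs are ready, for example with `io_uring_peek_batch_cqe` and `io_uring_cq_advance`. The point is only that the caller tracks how many completions it has actually seen rather than trusting the wait count.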