Closed keithpl closed 3 weeks ago
Not a maintainer so take this with a grain of salt.
From a quick glance at the source, it looks like the only effect of a nonblocking read is that it returns -EAGAIN immediately if the data is not ready; most control flows will still perform nonblocking operations anyway.
(Standard call): io_uring_enter -> io_submit_sqes: https://github.com/torvalds/linux/blob/master/fs/io_uring.c#L9214
(SQPOLL mode): io_sq_thread -> io_submit_sqes: https://github.com/torvalds/linux/blob/master/fs/io_uring.c#L6914
io_submit_sqes -> io_submit_sqe: https://github.com/torvalds/linux/blob/master/fs/io_uring.c#L6875
io_submit_sqe -> io_queue_sqe: https://github.com/torvalds/linux/blob/master/fs/io_uring.c#L6627 / https://github.com/torvalds/linux/blob/master/fs/io_uring.c#L6612
io_queue_sqe -> io_issue_sqe (with force_nonblock = True): https://github.com/torvalds/linux/blob/master/fs/io_uring.c#L6488
io_issue_sqe -> io_read(..., force_nonblock, ...): https://github.com/torvalds/linux/blob/master/fs/io_uring.c#L6189
io_read performs a nonblocking operation: https://github.com/torvalds/linux/blob/master/fs/io_uring.c#L3463
The only thing O_NONBLOCK gets you when submitting read events is a return of -EAGAIN when data is not ready: https://github.com/torvalds/linux/blob/master/fs/io_uring.c#L3487
The only call to io_issue_sqe with force_nonblock = False is from io_wq_submit_work: https://github.com/torvalds/linux/blob/master/fs/io_uring.c#L6326, which is stored as a function pointer in a struct: https://github.com/torvalds/linux/blob/master/fs/io_uring.c#L7990. I don't see that function pointer ever invoked or used, so I'm not sure what control flow would lead to io_issue_sqe with force_nonblock = False.
I tested this (roughly) by submitting reads on blocking sockets to the SQ ring, and the second SQE could be handled first.
I'm currently running Linux 5.9.14; sorry if this has already been addressed in 5.10 or later.
With IORING_FEAT_FAST_POLL, recv() events can be submitted on non-blocking sockets and will not show up in the completion queue until there is a result that's not EAGAIN. However, submitting read() events against non-blocking sockets, without pending data to be read, immediately places the event in the completion queue with cqe->res as -EAGAIN. Would it be possible for read()/write() events to behave exactly the same as send()/recv() for sockets?
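For reference, the -EAGAIN reported in cqe->res is the same result a plain read(2) gives on a non-blocking socket with nothing queued. Below is a minimal sketch of that baseline outside io_uring. It assumes a socketpair as a stand-in for a real connection, and the helper names (try_read_nonblocking, make_nonblocking_pair) are hypothetical, not part of liburing or the kernel.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read from fd and return the byte count,
 * or -errno on failure, mirroring how cqe->res reports errors. */
static long try_read_nonblocking(int fd, void *buf, size_t len)
{
    ssize_t n = read(fd, buf, len);
    return n < 0 ? -(long)errno : (long)n;
}

/* Hypothetical helper: create a connected AF_UNIX stream pair and
 * set O_NONBLOCK on sv[0]. Returns 0 on success or -errno. */
static int make_nonblocking_pair(int sv[2])
{
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -errno;

    int flags = fcntl(sv[0], F_GETFL, 0);
    if (flags < 0 || fcntl(sv[0], F_SETFL, flags | O_NONBLOCK) < 0) {
        int err = errno;
        close(sv[0]);
        close(sv[1]);
        return -err;
    }
    return 0;
}
```

With nothing written to sv[1], try_read_nonblocking(sv[0], ...) returns -EAGAIN right away instead of waiting. Once the peer writes some bytes, the same call returns the number of bytes read. The request in this issue is for read() submitted through io_uring to stay off the completion queue in the first case, the way recv() does with IORING_FEAT_FAST_POLL.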