Closed romange closed 3 days ago
Tested on kernels 6.2 and 6.8.
Huh, they should behave the same - both pick a buffer upfront, and then just recycle it (or don't commit, if using provided buffer rings) if no data is available. So that's a bit surprising, it's literally the same code at that part of the issue chain, only after doing a recv does it become different. Both should return -ENOBUFS immediately.
You can probably work around this by setting IORING_RECVSEND_POLL_FIRST if you know it's empty, but like mentioned above, it doesn't make a lot of sense to me. I'll poke a bit and see what I get here.
Test app:
#include <fcntl.h>
#include <stdint.h>
#include <liburing.h>
#include <netinet/in.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <time.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
	struct io_uring_sqe *sqe;
	struct io_uring_cqe *cqe;
	struct io_uring ring;
	char buffer[64];
	int sockfd, ret;

	io_uring_queue_init(8, &ring, 0);

	sockfd = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);

	sqe = io_uring_get_sqe(&ring);
	io_uring_prep_recv(sqe, sockfd, buffer, sizeof(buffer), 0);
	sqe->flags |= IOSQE_BUFFER_SELECT;
	sqe->buf_group = 5;

	io_uring_submit(&ring);

	ret = io_uring_wait_cqe(&ring, &cqe);
	if (ret)
		return ret;

	printf("recv res=%d\n", cqe->res);
	io_uring_cqe_seen(&ring, cqe);

	sqe = io_uring_get_sqe(&ring);
	io_uring_prep_recv_multishot(sqe, sockfd, NULL, 0, 0);
	sqe->flags |= IOSQE_BUFFER_SELECT;
	sqe->buf_group = 5;

	io_uring_submit(&ring);

	ret = io_uring_wait_cqe(&ring, &cqe);
	if (ret)
		return ret;

	printf("multishot res=%d\n", cqe->res);
	io_uring_cqe_seen(&ring, cqe);
	return 0;
}
where we don't have a buffer group set up, so we fail to pick a buffer, and the output:
axboe@m2max-kvm ~> ./test-recv
recv res=-105
multishot res=-105
So seems to follow what I outlined, but I'm curious if you'd see the same on your kernel running that. Because you really should.
Oh, maybe IORING_RECVSEND_POLL_FIRST is the reason. For the bufselect recv I set it up with IORING_RECVSEND_POLL_FIRST by default. With multishot I did not bother. Could it be the reason?
Yeah, now that you've explained it, it sounds absolutely clear. The kernel must have a buffer before calling recv, so it must borrow it from the ring. But with IORING_RECVSEND_POLL_FIRST set, we won't call recv before data has arrived.
Thanks!
Glad it got ironed out :-)
Suppose my io_uring_buf_ring has been depleted - all its buffers have been consumed by the application and not yet returned.
If I issue a regular io_uring_prep_recv operation with IOSQE_BUFFER_SELECT (i.e. try using my empty bufring), it will wait until the socket receives data, and only then will it trigger an ENOBUFS completion. That's good, because I can fall back to either calling recv directly or using an io_uring recv without polling, knowing the socket is not empty.
With multishot, however, the behavior is different. Once I submit an io_uring_prep_recv_multishot request on the socket, it immediately sends an ENOBUFS completion even though the socket does not have any data. So I cannot fall back to recv and must poll on the socket again.
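The fallback described here could look something like the sketch below. It is an illustration only: recv_fallback_on_enobufs is a made-up name, the caller is assumed to own a spare buffer, and cqe_res stands for the res field of the completion. It uses plain recv() with MSG_DONTWAIT, so it never blocks if the socket turns out to be empty.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Hypothetical helper (not from the thread): given the result of a
 * buffer-select recv completion, fall back to a direct non-blocking
 * recv() into a caller-owned buffer when the buffer ring was empty.
 * Returns the number of bytes received, or a negative errno value.
 */
static ssize_t recv_fallback_on_enobufs(int cqe_res, int sockfd,
					void *buf, size_t len)
{
	ssize_t n;

	if (cqe_res != -ENOBUFS)
		return cqe_res;	/* not a buffer shortage: pass through */

	/* MSG_DONTWAIT: do not block even if the socket is empty. */
	n = recv(sockfd, buf, len, MSG_DONTWAIT);
	return n < 0 ? -errno : n;
}
```

If this returns -EAGAIN, the socket was empty when ENOBUFS was posted, which is the multishot situation described above where the only option is to poll again.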