axboe / liburing

Library providing helpers for the Linux kernel io_uring support
MIT License
2.86k stars 402 forks source link

Improvement: Better IORING_MSG_SEND_FD interface #809

Closed evyatark2 closed 3 weeks ago

evyatark2 commented 1 year ago

The current interface for sending uring-registered files between different urings is rather awkward if you plan to send and receive files from multiple urings (i.e. not in a single-producer-multi-consumer fashion), as before you send the file you need to communicate with the receiver as to which array index the file should be installed.

I propose to defer the choice of the array index in which the file will be installed to the receiver using the following API:

This API has the following advantages:

This API can even be extended to plain IORING_MSG_DATA to maintain the second point

isilence commented 1 year ago

Instead of specifying to which index you want to install a file you can pass IORING_FILE_INDEX_ALLOC and the kernel will choose an empty slot for you. Note, it's probably IORING_FILE_INDEX_ALLOC - 1 if you use helpers like io_uring_prep_msg_ring_fd. It should cover it.

isilence commented 1 year ago

Also, if you want to restrict indexes for slot auto-selection, i.e. IORING_FILE_INDEX_ALLOC, you can use IORING_REGISTER_FILE_ALLOC_RANGE or a helper around it called io_uring_register_file_alloc_range().

evyatark2 commented 1 year ago

The problem with using IORING_FILE_INDEX_ALLOC in the current implementation is that you can't know to which slot the kernel installed the file into.

evyatark2 commented 1 year ago

I totally forgot about this and I would like to modify my interface with len not being set to the other side's res, but instead res will be set to the file index

isilence commented 1 year ago

The problem with using IORING_FILE_INDEX_ALLOC in the current implementation is that you can't know to which slot the kernel installed the file into.

Indeed, if that's the case then we should just return the index in cqe->res, I'll take a look

evyatark2 commented 1 year ago

Also note that if -EOVERFLOW occurs then cqe->res should be set to the index in the sender

isilence commented 1 year ago

Also note that if -EOVERFLOW occurs then cqe->res should be set to the index in the sender

That's what it does, but I don't think we have a test for that. It'd be a great contribution if you (or anyone else) want to submit a test case!

P.S. tests for fd passing msg-ring are in test/fd-pass.c and for normal ones in test/msg-ring.c

isilence commented 1 year ago

We queued up patch, now for auto allocation it'll return the selected index in the target's cqe->res. It'll hit upstream in some time and get backported after.

evyatark2 commented 1 year ago

What I meant was that if the target's completion queue is full then a -EOVERFLOW CQE will be posted on the sender's end. Instead, the res field of this CQE should be set to the index of the file, as according to this comment in msg_ring.c:

This means that if this request completes with -EOVERFLOW, then the sender must ensure that a later RING_OP_MSG_RING delivers the message

But in order to notify the target of the new file we need to actually know the index in order to communicate this information. I can go ahead and implement that but I need to know what strategy to use: