axboe / liburing

Library providing helpers for the Linux kernel io_uring support
MIT License

How to ensure the sequential nature of tasks such as storage I/O? #1156

Closed aidosww closed 1 month ago

aidosww commented 1 month ago

Hi, I have a small question. I am using kernel 5.10 with the default io_uring configuration. I add file-read requests to the io_uring submission queue one after another and, when the queue is full, submit them all at once with io_uring_submit. Is the order of the CQEs returned by io_uring_wait_cqe guaranteed to match the order in which the requests were added to the queue?

toziegler commented 1 month ago

Hi,

Maybe I can help out here.

My interpretation of the io_uring_enter man page and of this link https://unixism.net/loti/tutorial/link_liburing.html is that completions do not arrive in submission order:

Note that, while I/O is initiated in the order in which it appears in the
submission queue, completions are unordered. For example,
an application which places a write I/O followed by an fsync in
the submission queue cannot expect the fsync to apply to the write.

This is consistent with my experiments, with many other async I/O engines, and with how SSDs work (they are highly parallel). If you want to force a strict order, I think you would need to link the requests as described in the link above.

aidosww commented 1 month ago

Thank you very much, that cleared things up for me. If I set the link flag on all the SQEs in the queue and then submit them together when the queue is full, can I be sure that the CQEs are returned in the order in which the requests were added to the queue?

toziegler commented 1 month ago

Hi,

Based on the man page, my interpretation is that you will get the CQEs in submission order:

When this flag is specified, the SQE forms a link with the
next SQE in the submission ring. That next SQE will not be
started before the previous request completes. This, in
effect, forms a chain of SQEs, which can be arbitrarily
long.

But note that (1) a chain cannot be formed across submission boundaries (see IOSQE_IO_LINK in the io_uring_enter man page) and (2) submissions that are not part of the chain might complete in between the requests of your chain. In your example, though, I would expect you to get them all in order.

redbaron commented 1 month ago

Keep in mind that by linking you limit the I/O queue depth to 1. It might be better to submit the SQEs without linking and just track the number of completions: when CQEs seen == SQEs sent, all I/O is done. This way the I/O requests can be done in parallel even if the desired outcome is to read a contiguous range from a file.
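
The counting approach can be sketched as plain bookkeeping, independent of the ring itself. This is only a minimal illustration, not liburing API: `batch_tracker`, `batch_init`, `batch_complete`, `batch_finished` and `MAX_BATCH` are hypothetical names, and the sketch assumes each SQE carries its slot index in the `user_data` field so that the matching CQE can be attributed to the right request whatever order it arrives in.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_BATCH 64 /* hypothetical upper bound on requests per batch */

/* Hypothetical bookkeeping for one batch of unlinked requests. */
struct batch_tracker {
    size_t  submitted;          /* SQEs sent in this batch */
    size_t  completed;          /* CQEs seen so far */
    int32_t result[MAX_BATCH];  /* per-request result, indexed by tag */
    bool    done[MAX_BATCH];    /* whether the tag has completed */
};

/* Start tracking a batch of `submitted` requests; false if it is too large. */
static bool batch_init(struct batch_tracker *b, size_t submitted)
{
    if (submitted > MAX_BATCH)
        return false;
    b->submitted = submitted;
    b->completed = 0;
    for (size_t i = 0; i < MAX_BATCH; i++) {
        b->result[i] = 0;
        b->done[i] = false;
    }
    return true;
}

/*
 * Record one completion. `tag` is the slot index the application stored
 * in the request (assumed to come back via the CQE's user_data) and `res`
 * is the completion result. Rejects unknown and duplicate tags.
 */
static bool batch_complete(struct batch_tracker *b, uint64_t tag, int32_t res)
{
    if (tag >= b->submitted || b->done[tag])
        return false;
    b->done[tag] = true;
    b->result[tag] = res;
    b->completed++;
    return true;
}

/* All I/O is done once CQEs seen == SQEs sent. */
static bool batch_finished(const struct batch_tracker *b)
{
    return b->completed == b->submitted;
}
```

In a real program the tag would be written to `sqe->user_data` before io_uring_submit and read back from `cqe->user_data` after io_uring_wait_cqe. If every read targets its own slice of the destination buffer, the data lands in the right place regardless of completion order, and `result[]` can be checked in submission order once `batch_finished` returns true.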

aidosww commented 1 month ago

Thanks a lot!

aidosww commented 1 month ago

Thanks a lot!