axboe / liburing

Library providing helpers for the Linux kernel io_uring support
MIT License
2.89k stars 407 forks source link

[GIT PULL] liburing/io_uring_for_each_cqe: Pull load acquire out of for loop #1249

Closed CPestka closed 1 month ago

CPestka commented 1 month ago

I thought it was a bit odd that the macro repedatly does the atomic load. While when using it one can amortize the atomic store following the loop, I don't really see why one wouldn't want to do the same for the load, like is already being done in io_uring_peek_batch_cqe(). Maybe I'm missing smth.

This pulls the load out of the loop into a local. Not sure if this is acceptable. I gave it a long ugly name to avoid name collisions, but still. Maybe making a new macro would be better.


git request-pull output:

The following changes since commit 206650ff72b6ea4d76921f9c91ebfffd9902e6a0:

  test/fixed-hugepage: skip test on -ENOMEM (2024-09-27 10:27:10 -0600)

are available in the Git repository at:

  https://github.com/CPestka/liburing for_each_cqe

for you to fetch changes up to cb02d22dba0c6005c877c14a2cd2574a4b95462c:

  liburing: Pull load_acquire out of for loop to amortize cost (2024-09-29 14:43:43 +0200)

----------------------------------------------------------------
CPestka (1):
      liburing: Pull load_acquire out of for loop to amortize cost

 src/include/liburing.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Click to show/hide pull request guidelines ## Pull Request Guidelines 1. To make everyone easily filter pull request from the email notification, use `[GIT PULL]` as a prefix in your PR title. ``` [GIT PULL] Your Pull Request Title ``` 2. Follow the commit message format rules below. 3. Follow the Linux kernel coding style (see: https://github.com/torvalds/linux/blob/master/Documentation/process/coding-style.rst). ### Commit message format rules: 1. The first line is title (don't be more than 72 chars if possible). 2. Then an empty line. 3. Then a description (may be omitted for truly trivial changes). 4. Then an empty line again (if it has a description). 5. Then a `Signed-off-by` tag with your real name and email. For example: ``` Signed-off-by: Foo Bar ``` The description should be word-wrapped at 72 chars. Some things should not be word-wrapped. They may be some kind of quoted text - long compiler error messages, oops reports, Link, etc. (things that have a certain specific format). Note that all of this goes in the commit message, not in the pull request text. The pull request text should introduce what this pull request does, and each commit message should explain the rationale for why that particular change was made. The git tree is canonical source of truth, not github. Each patch should do one thing, and one thing only. If you find yourself writing an explanation for why a patch is fixing multiple issues, that's a good indication that the change should be split into separate patches. If the commit is a fix for an issue, add a `Fixes` tag with the issue URL. Don't use GitHub anonymous email like this as the commit author: ``` 123456789+username@users.noreply.github.com ``` Use a real email address! ### Commit message example: ``` src/queue: don't flush SQ ring for new wait interface If we have IORING_FEAT_EXT_ARG, then timeouts are done through the syscall instead of by posting an internal timeout. This was done to be both more efficient, but also to enable multi-threaded use the wait side. If we touch the SQ state by flushing it, that isn't safe without synchronization. Fixes: https://github.com/axboe/liburing/issues/402 Signed-off-by: Jens Axboe ```

By submitting this pull request, I acknowledge that:

  1. I have followed the above pull request guidelines.
  2. I have the rights to submit this work under the same license.
  3. I agree to a Developer Certificate of Origin (see https://developercertificate.org for more information).
axboe commented 1 month ago

I think this is fine to do, it'll iterate anything that was already available by the time of starting the loop.

isilence commented 1 month ago

It's changing behaviour in a nasty way, we can't have this version.

CPestka commented 1 month ago

Oh. Yeah, that is nasty. Having two initializations in the for loop instead does not work either here -,- Guess this does not work with this macro. Do You think it is worth it to add another macro that does this or do you want to keep it to one?

axboe commented 1 month ago

Ah indeed, thanks Pavel. I'll revert it.

axboe commented 1 month ago

Do You think it is worth it to add another macro that does this or do you want to keep it to one?

One would be ideal, I don't think there's any issue with only loading it upfront rather than repeatedly in the loop. There was never any guarantee on how CQEs are found, it only really guarantees that CQEs that were available at the time of the call will be iterated. Eg the app does a wait_nr(N) and then iterates. And depending on ring setup mode, there's no way that events will show up DURING the loop, unless old school setup is done (eg signal notifications for task_work).

isilence commented 1 month ago

One would be ideal, I don't think there's any issue with only loading it upfront rather than repeatedly in the loop. There was never any guarantee on how CQEs are found

Agree, that should be fine, it depends on timings anyway and might happen that we get a new batch only after finishing "for each".

Oh. Yeah, that is nasty. Having two initializations in the for loop instead does not work either here -,- Guess this does not work with this macro. Do You think it is worth it to add another macro that does this or do you want to keep it to one?

Idea 1:

for (int cached_tail = ({ head = ring.head; load_acquire((ring.tail) }, ...)

Do we require compilers to support statement expressions in liburing?

Idea 2:

struct iter {
 int head, tail;
};

for (struct iter it = {..., ...}, ...)

In which case you don't even need to pass head, so it might even be a separate helper or just a dummy parameter

axboe commented 1 month ago

I like idea 1, as it doesn't require a new helper for this. In terms of compiler support, that kind of construct has been used in the kernel for 20+ years, so I don't think we have to worry about that.

isilence commented 1 month ago

I like idea 1, as it doesn't require a new helper for this.

Not particularly a problem, you can always gracefully ignore an argument.

In terms of compiler support, that kind of construct has been used in the kernel for 20+ years, so I don't think we have to worry about that.

That's if we follow kernel requirements, we're not doing msvc, but I was still thinking about cases where the user might want some niche compiler. Maybe some heterogeneous computing like cuda, or when BPF was in discussion I was thinking making the header shareable with libbpf program (not sure if it supports it).

axboe commented 1 month ago

Not particularly a problem, you can always gracefully ignore an argument.

Sure, but then folks would need to opt-in.

I hear your point on more esoteric compilers. How about this, and please stay close to the eye wash station:

diff --git a/src/include/liburing.h b/src/include/liburing.h
index 3ad061d18dfb..512edd50d5cc 100644
--- a/src/include/liburing.h
+++ b/src/include/liburing.h
@@ -310,10 +310,11 @@ int __io_uring_get_cqe(struct io_uring *ring,
     * io_uring_smp_load_acquire() enforces the order of tail   \
     * and CQE reads.                       \
     */                             \
-   for (head = *(ring)->cq.khead;                  \
-        (cqe = (head != io_uring_smp_load_acquire((ring)->cq.ktail) ? \
-       &(ring)->cq.cqes[io_uring_cqe_index(ring, head, (ring)->cq.ring_mask)] : NULL)); \
-        head++)                            \
+   for (__u32 __HEAD__ = (head) = *(ring)->cq.khead,       \
+        __tail__ = io_uring_smp_load_acquire((ring)->cq.ktail);    \
+        (cqe = ((head) != __tail__ ?               \
+        &(ring)->cq.cqes[io_uring_cqe_index(ring, __HEAD__, (ring)->cq.ring_mask)] : NULL)); \
+        (head) = ++__HEAD__)

 /*
  * Must be called after io_uring_for_each_cqe()

which should result in identical code as to just getting rid of passing in a head iterator to begin with.

isilence commented 1 month ago

Not particularly a problem, you can always gracefully ignore an argument.

Sure, but then folks would need to opt-in.

I don't understand, the user should never use the value of head variable after the loop, so it's basically

#define for_each(ring, head, cqe)  for_each_no_head(ring, cqe)

I hear your point on more esoteric compilers. How about this, and please stay close to the eye wash station:

Or we can do whatever option and see if anyone complaints. Though it reminds me someone tried to compile it as a strict C.

diff --git a/src/include/liburing.h b/src/include/liburing.h
index 3ad061d18dfb..512edd50d5cc 100644
--- a/src/include/liburing.h
+++ b/src/include/liburing.h
@@ -310,10 +310,11 @@ int __io_uring_get_cqe(struct io_uring *ring,
   * io_uring_smp_load_acquire() enforces the order of tail   \
   * and CQE reads.                       \
   */                             \
- for (head = *(ring)->cq.khead;                  \
-      (cqe = (head != io_uring_smp_load_acquire((ring)->cq.ktail) ? \
-     &(ring)->cq.cqes[io_uring_cqe_index(ring, head, (ring)->cq.ring_mask)] : NULL)); \
-      head++)                            \
+ for (__u32 __HEAD__ = (head) = *(ring)->cq.khead,       \
+      __tail__ = io_uring_smp_load_acquire((ring)->cq.ktail);    \
+      (cqe = ((head) != __tail__ ?               \
+      &(ring)->cq.cqes[io_uring_cqe_index(ring, __HEAD__, (ring)->cq.ring_mask)] : NULL)); \
+      (head) = ++__HEAD__)

 /*
  * Must be called after io_uring_for_each_cqe()

which should result in identical code as to just getting rid of passing in a head iterator to begin with.

Any would do IMHO. nit: I don't think you need to assign head.

axboe commented 1 month ago

Just trying to avoid needing to do another helper... We have to assign head or some compilers will (rightfully) complain that head is only ever being written, never read. Because I'd much rather just get rid of head entirely, but imho preferable to live with the existing version over having two versions of this.