Closed CPestka closed 1 month ago
I think this is fine to do, it'll iterate anything that was already available by the time of starting the loop.
It's changing behaviour in a nasty way, we can't have this version.
Oh. Yeah, that is nasty. Having two initializations in the for loop instead does not work either here -,- Guess this does not work with this macro. Do You think it is worth it to add another macro that does this or do you want to keep it to one?
Ah indeed, thanks Pavel. I'll revert it.
Do You think it is worth it to add another macro that does this or do you want to keep it to one?
One would be ideal, I don't think there's any issue with only loading it upfront rather than repeatedly in the loop. There was never any guarantee on how CQEs are found, it only really guarantees that CQEs that were available at the time of the call will be iterated. Eg the app does a wait_nr(N) and then iterates. And depending on ring setup mode, there's no way that events will show up DURING the loop, unless old school setup is done (eg signal notifications for task_work).
One would be ideal, I don't think there's any issue with only loading it upfront rather than repeatedly in the loop. There was never any guarantee on how CQEs are found
Agree, that should be fine, it depends on timings anyway and might happen that we get a new batch only after finishing "for each".
Oh. Yeah, that is nasty. Having two initializations in the for loop instead does not work either here -,- Guess this does not work with this macro. Do You think it is worth it to add another macro that does this or do you want to keep it to one?
Idea 1:
for (int cached_tail = ({ head = ring.head; load_acquire((ring.tail) }, ...)
Do we require compilers to support statement expressions in liburing?
Idea 2:
struct iter {
int head, tail;
};
for (struct iter it = {..., ...}, ...)
In which case you don't even need to pass head, so it might even be a separate helper or just a dummy parameter
I like idea 1, as it doesn't require a new helper for this. In terms of compiler support, that kind of construct has been used in the kernel for 20+ years, so I don't think we have to worry about that.
I like idea 1, as it doesn't require a new helper for this.
Not particularly a problem, you can always gracefully ignore an argument.
In terms of compiler support, that kind of construct has been used in the kernel for 20+ years, so I don't think we have to worry about that.
That's if we follow kernel requirements, we're not doing msvc, but I was still thinking about cases where the user might want some niche compiler. Maybe some heterogeneous computing like cuda, or when BPF was in discussion I was thinking making the header shareable with libbpf program (not sure if it supports it).
Not particularly a problem, you can always gracefully ignore an argument.
Sure, but then folks would need to opt-in.
I hear your point on more esoteric compilers. How about this, and please stay close to the eye wash station:
diff --git a/src/include/liburing.h b/src/include/liburing.h
index 3ad061d18dfb..512edd50d5cc 100644
--- a/src/include/liburing.h
+++ b/src/include/liburing.h
@@ -310,10 +310,11 @@ int __io_uring_get_cqe(struct io_uring *ring,
* io_uring_smp_load_acquire() enforces the order of tail \
* and CQE reads. \
*/ \
- for (head = *(ring)->cq.khead; \
- (cqe = (head != io_uring_smp_load_acquire((ring)->cq.ktail) ? \
- &(ring)->cq.cqes[io_uring_cqe_index(ring, head, (ring)->cq.ring_mask)] : NULL)); \
- head++) \
+ for (__u32 __HEAD__ = (head) = *(ring)->cq.khead, \
+ __tail__ = io_uring_smp_load_acquire((ring)->cq.ktail); \
+ (cqe = ((head) != __tail__ ? \
+ &(ring)->cq.cqes[io_uring_cqe_index(ring, __HEAD__, (ring)->cq.ring_mask)] : NULL)); \
+ (head) = ++__HEAD__)
/*
* Must be called after io_uring_for_each_cqe()
which should result in identical code as to just getting rid of passing in a head iterator to begin with.
Not particularly a problem, you can always gracefully ignore an argument.
Sure, but then folks would need to opt-in.
I don't understand, the user should never use the value of head
variable after the loop, so it's basically
#define for_each(ring, head, cqe) for_each_no_head(ring, cqe)
I hear your point on more esoteric compilers. How about this, and please stay close to the eye wash station:
Or we can do whatever option and see if anyone complaints. Though it reminds me someone tried to compile it as a strict C.
diff --git a/src/include/liburing.h b/src/include/liburing.h index 3ad061d18dfb..512edd50d5cc 100644 --- a/src/include/liburing.h +++ b/src/include/liburing.h @@ -310,10 +310,11 @@ int __io_uring_get_cqe(struct io_uring *ring, * io_uring_smp_load_acquire() enforces the order of tail \ * and CQE reads. \ */ \ - for (head = *(ring)->cq.khead; \ - (cqe = (head != io_uring_smp_load_acquire((ring)->cq.ktail) ? \ - &(ring)->cq.cqes[io_uring_cqe_index(ring, head, (ring)->cq.ring_mask)] : NULL)); \ - head++) \ + for (__u32 __HEAD__ = (head) = *(ring)->cq.khead, \ + __tail__ = io_uring_smp_load_acquire((ring)->cq.ktail); \ + (cqe = ((head) != __tail__ ? \ + &(ring)->cq.cqes[io_uring_cqe_index(ring, __HEAD__, (ring)->cq.ring_mask)] : NULL)); \ + (head) = ++__HEAD__) /* * Must be called after io_uring_for_each_cqe()
which should result in identical code as to just getting rid of passing in a head iterator to begin with.
Any would do IMHO. nit: I don't think you need to assign head
.
Just trying to avoid needing to do another helper... We have to assign head
or some compilers will (rightfully) complain that head
is only ever being written, never read. Because I'd much rather just get rid of head
entirely, but imho preferable to live with the existing version over having two versions of this.
I thought it was a bit odd that the macro repedatly does the atomic load. While when using it one can amortize the atomic store following the loop, I don't really see why one wouldn't want to do the same for the load, like is already being done in io_uring_peek_batch_cqe(). Maybe I'm missing smth.
This pulls the load out of the loop into a local. Not sure if this is acceptable. I gave it a long ugly name to avoid name collisions, but still. Maybe making a new macro would be better.
git request-pull output:
Click to show/hide pull request guidelines
## Pull Request Guidelines 1. To make everyone easily filter pull request from the email notification, use `[GIT PULL]` as a prefix in your PR title. ``` [GIT PULL] Your Pull Request Title ``` 2. Follow the commit message format rules below. 3. Follow the Linux kernel coding style (see: https://github.com/torvalds/linux/blob/master/Documentation/process/coding-style.rst). ### Commit message format rules: 1. The first line is title (don't be more than 72 chars if possible). 2. Then an empty line. 3. Then a description (may be omitted for truly trivial changes). 4. Then an empty line again (if it has a description). 5. Then a `Signed-off-by` tag with your real name and email. For example: ``` Signed-off-by: Foo BarBy submitting this pull request, I acknowledge that: