Closed MikuChan03 closed 3 months ago
I suspect this one:
https://git.kernel.dk/cgit/linux/commit/?h=io_uring-6.9&id=0a3737db8479b77f95f4bfda8e71b03c697eb56a
will fix it, it just went upstream today and should find its way into the stable kernels shortly.
@axboe Senpai, just so you know. After building the newest kernel from your linux-block repo, we don't get any cqe for reading from the timerfd at all. It's true that the mentioned patch makes the EAGAIN error disappear, but instead we get nothing at all.
Took a quick look, and this is actually a bug in timerfd. It doesn't handle nonblocking reads properly at all. For any type of read, it'll always attempt to wait, even though there's data to be read. We can work around this in io_uring, but I'll fix the timerfd side as well, as that is just not acceptable.
Here's the timerfd fix:
https://lore.kernel.org/linux-fsdevel/b4059ed0-5567-44e7-95f7-f7e4b227501c@kernel.dk/
And here are the io_uring fixes. These will make it work independently of the timerfd fix, but multishot mode cannot be supported while timerfd doesn't properly support nonblocking reads. Once the timerfd patch is in, multishot timerfd reads will work just fine:
axboe@m2max-kvm ~> ./timerfd
pipe operation
res 70 flags 3
timerfd operation
res 8 flags 10003
timerfd operation
res 8 flags 20003
timerfd operation
res 8 flags 30003
but before that and with just the below io_uring fixes, timerfd reading will be singleshot. In other words, it will never have IORING_CQE_F_MORE set, and will require re-arming every time.
https://lore.kernel.org/io-uring/20240401175306.1051122-1-axboe@kernel.dk/
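For anyone reading the flags values in the output above, here is a minimal sketch of how they can be decoded. It assumes the test program prints the flags in hex, which the output doesn't state. The bit values mirror IORING_CQE_F_BUFFER, IORING_CQE_F_MORE and IORING_CQE_BUFFER_SHIFT from the io_uring uapi header; the DEMO_ names, the struct and the helper are hypothetical and only redefine those values so the snippet builds without kernel headers.

```c
#include <stdint.h>

/* Local copies of the io_uring uapi values (assumption: kept in sync with
 * <linux/io_uring.h>); the DEMO_ prefix marks them as illustrative. */
#define DEMO_CQE_F_BUFFER     (1U << 0)  /* IORING_CQE_F_BUFFER: upper 16 bits hold a buffer ID */
#define DEMO_CQE_F_MORE       (1U << 1)  /* IORING_CQE_F_MORE: more completions will follow */
#define DEMO_CQE_BUFFER_SHIFT 16         /* IORING_CQE_BUFFER_SHIFT */

/* Hypothetical container for the decoded fields of cqe->flags. */
struct demo_cqe_info {
    int has_buffer;      /* a provided buffer was selected */
    int more;            /* the multishot request is still armed */
    unsigned buffer_id;  /* only meaningful when has_buffer is set */
};

/* Split a raw cqe->flags value into its parts. */
struct demo_cqe_info demo_decode_flags(uint32_t flags)
{
    struct demo_cqe_info info;

    info.has_buffer = (flags & DEMO_CQE_F_BUFFER) != 0;
    info.more = (flags & DEMO_CQE_F_MORE) != 0;
    info.buffer_id = info.has_buffer ? (flags >> DEMO_CQE_BUFFER_SHIFT) : 0;
    return info;
}
```

Under that hex assumption, 0x10003 decodes to buffer ID 1 with both the buffer and the more bit set. A completion without the more bit means the request has to be re-armed.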
Neglected to mention, multishot will work even with the limited timerfd support if you just create the timerfd with TFD_NONBLOCK. That'll work regardless of whether or not the timerfd patch is included in the kernel. So at least for now it can be a suitable work-around for dealing with timerfds in multishot mode.
Thank you so much! I'll say, since even a successful nonblocking read operation on a timerfd always sets EAGAIN and that throws off aio stuff, does that mean that as of today, no one has used timerfd in an aio context where the errno matters? That would be scawwy ._.
It'll work fine without multishot, which is a fairly new addition on the read side. Even without the timerfd fix for proper non-block reading, it won't block and it'll return the right result. It'll just be less efficient as it needs to go through io-wq, rather than being purely poll triggered. No results are being lost. FWIW, io_uring/liburing doesn't use errno at all, errors are returned directly. errno doesn't work properly for async IO.
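To illustrate the point about errors being returned directly: a completion result is either a non-negative byte count or a negated errno value, so it can be interpreted without ever touching errno. A small sketch, assuming res was copied out of cqe->res; the helper name demo_describe_res is hypothetical.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: interpret a completion result in the cqe->res
 * convention. Writes a short description into buf and returns the
 * positive errno code on failure, or 0 on success. */
int demo_describe_res(int res, char *buf, size_t len)
{
    if (res < 0) {
        /* The error code is carried in res itself; errno is not consulted. */
        snprintf(buf, len, "error: %s", strerror(-res));
        return -res;
    }
    snprintf(buf, len, "ok: %d bytes", res);
    return 0;
}
```

For the case reported in this issue, res would be -EAGAIN, and the helper returns EAGAIN.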
Marking as closed, fix is queued and will go upstream later this week and bubble back to stable. Hopefully for 6.10 I'll have the timerfd fix in as well.
Hello!
When using io_uring_prep_read_multishot to read from a timerfd, io_uring_wait_cqe waits until the timerfd elapses and then returns EAGAIN in cqe->res. Using io_uring_prep_read_multishot to read from a pipe works. Using io_uring_prep_read to read from a timerfd works. I'm using only the basic functions.
Thanks.