axboe / liburing

Library providing helpers for the Linux kernel io_uring support
MIT License
2.83k stars 402 forks source link

Surprising behavior of `IORING_OP_TIMEOUT` with duplicated user_data (timers). #1120

Closed vsolontsov-ll closed 2 weeks ago

vsolontsov-ll commented 6 months ago

Scheduling two timeouts with the same user_data and zero count is a bit strange. I'd expect to either reject the second (if the first one is not expired yet) or override, while it schedules both. Is this intended?

The IORING_OP_TIMEOUT_REMOVE identifies the timeouts by the user_data value, so this could be interpreted in a way that it should be unique. Should it be documented somehow? I assume the easiest way is to say it's implementation specific and may be changed in the future.

Meanwhile it would be great to have a straightforward way to override the timeout forcefully. The IORING_OP_TIMEOUT_REMOVE + IORING_TIMEOUT_UPDATE is racy in a sense that it may fail just because of the original timer has expired. So in my usecase I got to enqueue two operations, IORING_OP_TIMEOUT_REMOVE + IORING_OP_TIMEOUT.

isilence commented 6 months ago

io_uring is not tracking user_data, that's just a token returned back in CQEs, and it's entirely up to the user how it's used.

isilence commented 6 months ago

The IORING_OP_TIMEOUT_REMOVE identifies the timeouts by the user_data value, so this could be interpreted in a way that it should be unique. Should it be documented somehow? I assume the easiest way is to say it's implementation specific and may be changed in the future.

user_data is never required to be unique but often reasonably expected to be so. Just as in your case, you can queue up two such timeouts and IORING_OP_TIMEOUT_REMOVE would remove a random one of them. However, apps most likely would want to have two different user_data to distinguish between requests.

Meanwhile it would be great to have a straightforward way to override the timeout forcefully. The IORING_OP_TIMEOUT_REMOVE + IORING_TIMEOUT_UPDATE is racy in a sense that it may fail just because of the original timer has expired.

Which sounds absolutely fine to me, you'd need to rearm the timeout then.

So in my usecase I got to enqueue two operations, IORING_OP_TIMEOUT_REMOVE + IORING_OP_TIMEOUT.

Note, that if you need a periodic timeout, you can use the multishot mode. i.e. you submit a single request and it posts a CQE every N [u,n]sec until you cancel it.

vsolontsov-ll commented 6 months ago

Thanks for the prompt answer.

user_data is never required to be unique but often reasonably expected to be so. Just as in your case, you can queue up two such timeouts and IORING_OP_TIMEOUT_REMOVE would remove a random one of them.

You see, I'm raising this because after reading the man pages I had no clue what would be the behavior. I had two assumptions (override or reject) but experiment showed the third one :-). If this is the intended behavior, would be great to specify it (in man).

The IORING_OP_TIMEOUT_REMOVE + IORING_TIMEOUT_UPDATE

Which sounds absolutely fine to me, you'd need to rearm the timeout then.

In this particular use-case I know for sure that I want a timeout at T1 instead of T0 I armed before. So the possible async error would add some complexity. Would be nice to have a more direct way to express such a demand in code. Yet it's O.K. to cancel+add the timer.

Note, that if you need a periodic timeout, you can use the multishot mode.

Great news, thanks for mentioning! I do see it in the header but seems to be not in man yet.

isilence commented 5 months ago

user_data is never required to be unique but often reasonably expected to be so. Just as in your case, you can queue up two such timeouts and IORING_OP_TIMEOUT_REMOVE would remove a random one of them.

You see, I'm raising this because after reading the man pages I had no clue what would be the behavior. I had two assumptions (override or reject) but experiment showed the third one :-). If this is the intended behavior, would be great to specify it (in man).

That's how all requests work (I believe should be described in some core / non opcode specific man doc) and nothing suggests otherwise. Can be added though if there is a confusion.

The IORING_OP_TIMEOUT_REMOVE + IORING_TIMEOUT_UPDATE

Which sounds absolutely fine to me, you'd need to rearm the timeout then.

In this particular use-case I know for sure that I want a timeout at T1 instead of T0 I armed before. So the possible async error would add some complexity. Would be nice to have a more direct way to express such a demand in code. Yet it's O.K. to cancel+add the timer.

Note, that if you need a periodic timeout, you can use the multishot mode.

Great news, thanks for mentioning! I do see it in the header but seems to be not in man yet.

Yeah, man pages happen to lag behind, unfortunately

vsolontsov-ll commented 5 months ago

Can be added though if there is a confusion.

Yeah, would be great. Please let me know if you prefer me creating a separate request for the man update. This one can be closed unless you prefer to keep it for some tracking.

Thanks you very much!