ringbahn / iou

Rust interface to io_uring
Apache License 2.0
328 stars 22 forks source link

Always filter timeouts #24

Closed withoutboats closed 4 years ago

withoutboats commented 4 years ago

I think we should always filter timeouts generated by liburing and make the is_timeout method private. I'd do this as a part of #22.

I can't think of a reasonable way that users would want to respond to to these timeouts. It would be simpler to just always ignore them. They're not timeouts the users prepared and submitted, they're only those generated by liburing as a part of calling submit or wait with timeout methods.

Anyone have any thoughts?

withoutboats commented 4 years ago

The big issue with this is that timeouts can have error results, but this would just drop those errors instead of allowing the user to somehow respond to them. A way to solve that would be to change the API of (for example) peek to also include reporting a timeout-originating error as a possibility.

mxxo commented 4 years ago

I'm not super familiar with the with_timeout functions, but can't this be solved using partition on the cqe iterator?

withoutboats commented 4 years ago

You can't literally use partition because the CQE iterator doesn't implement Iterator, but the basic principle of what you're saying is true. With the system right now, a user can handle uring-raised timeouts with:

cq.peek_for_cqes().for_each(|cqe| {
    if cqe.is_timeout() { /* handle timeout */ }
    else { /* handle normal CQE */ }
})

22 also contains an adapter to easily ignore all uring raised timeouts:

cq.peek_for_cqes().filter_timeouts(true).for_each(|cqe| {
    // handle normal CQE
})

However, I see a few issues with this. The first is that is_timeout is pretty ambiguous - it specifically means "is it a timeout generated by liburing with its "with_timeout" methods?" not "is it a timeout?" (timeouts the user preps and submits themselves will not have LIBURING_UDATA_TIMEOUT as their user_data field).

The other is that its very easy to forget or not realize that you need to be handling liburing-raised timeouts specially, and then just assume all the user_data fields are something you yourself have set. For example, a common pattern will be to set a pointer to some waker data as the user_data of any SQEs your program submits, and then call a "wake" routine on that pointer. But if you didn't filter out timeouts, you'll end up calling it on values of LIBURING_UDATA_TIMEOUT (incidentally this is u64::MAX) and segfaulting your program.

I think a better API would be to provide some sort of adapter for defining the callback you will use for specifically handling the results from timeouts. That will receive an FnMut(io::Result<()>). If this API is not used, the results of timeouts are ignored by default.

As an example, here is how you might use the API in full as I've described:

cq.peek_for_cqes().handle_timeouts(|result| if let Err(err) = result {
     log!(err);
}).for_each(|cqe| {
    unsafe { wake(cqe.user_data() as *mut MyWaker, cqe.result()) }
})

// unsafe API for waking tasks after their IO completes:
unsafe fn wake(waker: *mut MyWaker, result: io::Result<usize>) { ... }
withoutboats commented 4 years ago

I've implemented the API as described above in #22.

mxxo commented 4 years ago

Thank you for explaining that, I didn't understand the subtleties of this issue. This looks like a good way forward.