The first problem is that the code assumed that, if io_getevents() with a NULL timeout returned 0, there were no completed or in-progress I/O requests for the given io_context. The documentation does not guarantee this and, with O_DIRECT, it is not true. This caused a problem with per-thread io_contexts: when a thread was shutting down, it used io_getevents() to wait for all of its in-progress I/Os to complete. Because of the unexpected behavior of io_getevents() with O_DIRECT, the thread erroneously concluded that it had no more in-progress I/Os and exited. That left no thread to handle the completions of those I/Os, which is necessary for things such as clearing the writeback flag in the cache. As a result, later threads trying to access those pages blocked indefinitely.
The second problem is that, with per-thread io_contexts, a thread could issue I/O requests and then never interact with splinter again, leaving the completions of those requests waiting forever to be handled. Again, this could leave pages in the writeback state, blocking other threads indefinitely.
The first fix in this PR is to explicitly track the number of in-progress I/Os on each io_context, and to wait for them all to complete when a thread is shutting down. This accommodates the behavior of io_getevents() with O_DIRECT.
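The counting approach can be illustrated with a small sketch. This is not the code in this PR: `io_tracker`, `fake_device`, and the `reap` callback are hypothetical names, and the callback stands in for a real io_getevents() call so the example runs without a kernel AIO context. The point it shows is that shutdown waits on the explicit counter rather than trusting a 0 return from the reap call.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-io_context bookkeeping; names are illustrative only. */
typedef struct io_tracker {
   size_t in_flight;        /* I/Os submitted but not yet reaped */
   int (*reap)(void *arg);  /* stand-in for io_getevents(): returns the   */
   void *reap_arg;          /* number of completions, 0 if none, <0 error */
} io_tracker;

/* Record n newly submitted I/Os. */
static void
io_tracker_submitted(io_tracker *t, size_t n)
{
   t->in_flight += n;
}

/*
 * Wait until every submitted I/O has been reaped. A return of 0 from
 * reap() is NOT treated as "nothing left"; only the counter decides.
 * A real implementation would block in the reap call instead of spinning.
 */
static int
io_tracker_drain(io_tracker *t)
{
   while (t->in_flight > 0) {
      int n = t->reap(t->reap_arg);
      if (n < 0) {
         return n; /* propagate the error to the caller */
      }
      assert((size_t)n <= t->in_flight);
      t->in_flight -= (size_t)n;
   }
   return 0;
}

/* Fake device that reports 0 on every other call while I/Os are pending. */
typedef struct fake_device {
   size_t   pending;
   unsigned calls;
} fake_device;

static int
fake_reap(void *arg)
{
   fake_device *d = arg;
   d->calls++;
   if (d->pending == 0 || d->calls % 2 == 1) {
      return 0; /* spurious "no events" while work is still pending */
   }
   d->pending--;
   return 1;
}

/* Returns the number of I/Os left unhandled after draining. */
size_t
demo_drain(size_t n)
{
   fake_device dev = {.pending = n, .calls = 0};
   io_tracker  t   = {.in_flight = 0, .reap = fake_reap, .reap_arg = &dev};
   io_tracker_submitted(&t, n);
   if (io_tracker_drain(&t) != 0) {
      return (size_t)-1;
   }
   return t.in_flight + dev.pending;
}
```

The sketch is single-threaded. If the counter lived on a context shared by several threads, it would presumably need to be atomic or otherwise synchronized.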
The second fix is to switch to per-process io_contexts, i.e., all the threads in a process share a single io_context. Thus, as long as one thread in the process continues to interact with splinter, I/O completions will get handled.
Note that the second fix still leaves open a problem in the multi-process context: if all the threads in one process stop interacting with splinter, then the completions for their I/Os will never get handled, which can block threads in other processes. The ideal solution would be to have a single io_context for all threads across all processes, but that's not possible. The feasible solution (not implemented in this PR) would be to have a thread in each process that just loops, handling I/O completion events.
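A rough sketch of what the loop of such a completion-handling thread might look like follows. Since this is not implemented in the PR, everything here is an assumption: `completion_reaper`, `wait_and_handle`, and `fake_queue` are hypothetical names, and the callback stands in for a blocking io_getevents() call with a short timeout followed by running the completion callbacks. To keep the example self-contained, the demo calls the loop directly instead of starting a thread.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical state shared between the reaper and the rest of the process. */
typedef struct completion_reaper {
   atomic_bool   stop;     /* set to true to request shutdown */
   atomic_size_t handled;  /* completions processed so far */
   /* Stand-in for "block briefly in io_getevents(), then run the
    * completion callbacks"; returns the number handled, 0 on timeout. */
   int (*wait_and_handle)(void *ctx);
   void *ctx;
} completion_reaper;

/*
 * Loop body of the dedicated thread. The signature matches the start
 * routine expected by pthread_create(), so a process could hand this
 * function to a thread created at startup.
 */
static void *
completion_reaper_main(void *arg)
{
   completion_reaper *r = arg;
   assert(r->wait_and_handle != NULL);
   while (!atomic_load(&r->stop)) {
      int n = r->wait_and_handle(r->ctx);
      if (n > 0) {
         atomic_fetch_add(&r->handled, (size_t)n);
      }
   }
   return NULL;
}

/* Fake completion queue that requests shutdown once it runs dry. */
typedef struct fake_queue {
   size_t             pending;
   completion_reaper *owner;
} fake_queue;

static int
fake_wait_and_handle(void *ctx)
{
   fake_queue *q = ctx;
   if (q->pending == 0) {
      atomic_store(&q->owner->stop, true); /* simulate process shutdown */
      return 0;
   }
   q->pending--;
   return 1;
}

/* Runs the loop synchronously; returns how many completions it handled. */
size_t
demo_reaper(size_t n)
{
   completion_reaper r;
   fake_queue        q = {.pending = n, .owner = &r};
   atomic_init(&r.stop, false);
   atomic_init(&r.handled, 0);
   r.wait_and_handle = fake_wait_and_handle;
   r.ctx             = &q;
   completion_reaper_main(&r);
   return atomic_load(&r.handled);
}
```

Because the loop never depends on application threads calling back into splinter, completions would keep being handled even if every other thread in the process went idle.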
Fixes issue #620