alzix closed this 5 months ago
This looks like a good find. Furthermore, it looks like nni_aio_abort also suffers from the same flaw.
I want to look at this in more detail later today before I move forward.
I think I've convinced myself that this is precisely the right fix, and we just need to add the same change to the implementation of nni_aio_abort.
i'm on it
So there are other callers.
Basically we also need the same logic in nni_aio_cancel, and I think nni_aio_fini. It looks like it was missed in all the paths where we tear down or abort an aio.
It does not solve the Windows issue though :( On Windows the problem seems to be different.
It looks like a TOCTOU issue in nni_sock_shutdown.
It is stuck on

    // We have to wait for pipes to be removed.
    while (!nni_list_empty(&sock->s_pipes)) {
    =>      nni_cv_wait(&sock->s_cv);
    }

while s_pipes is already empty.
the issue predates the fix - so it is better to address it in another PR
I'm merging this... the hang waiting for pipes to be empty feels like a missed cv_wake somewhere. I'll look for it later. I'm out of time for today.
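For anyone following along, here is a minimal sketch of the wait/wake pairing that the "missed cv_wake" theory is about. It uses plain pthreads rather than NNG's nni_cv wrappers, and every name in it (demo_sock, demo_pipe_remove, demo_sock_wait_empty, demo_remover) is hypothetical, not taken from the NNG source. The point is only that whoever removes the last pipe has to wake the waiter while holding the same lock that guards the list. Otherwise the waiter can sleep forever on a list that is already empty.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

// Hypothetical stand-in for a socket that tracks its attached pipes.
typedef struct {
    pthread_mutex_t mtx;
    pthread_cond_t  cv;
    int             npipes; // models the length of s_pipes
} demo_sock;

void demo_sock_init(demo_sock *s, int npipes) {
    assert(npipes >= 0);
    pthread_mutex_init(&s->mtx, NULL);
    pthread_cond_init(&s->cv, NULL);
    s->npipes = npipes;
}

// Remove one pipe and return how many remain. The wake happens under the
// same lock that guards the count, so a waiter cannot miss it.
int demo_pipe_remove(demo_sock *s) {
    pthread_mutex_lock(&s->mtx);
    if (s->npipes > 0) {
        s->npipes--;
    }
    if (s->npipes == 0) {
        pthread_cond_broadcast(&s->cv); // omitting this wake would cause the hang
    }
    int left = s->npipes;
    pthread_mutex_unlock(&s->mtx);
    return left;
}

// Block until no pipes remain; returns the final count (0).
int demo_sock_wait_empty(demo_sock *s) {
    pthread_mutex_lock(&s->mtx);
    while (s->npipes != 0) { // re-check the predicate after every wakeup
        pthread_cond_wait(&s->cv, &s->mtx);
    }
    int left = s->npipes;
    pthread_mutex_unlock(&s->mtx);
    return left;
}

// Thread entry point: keep removing pipes until none are left.
void *demo_remover(void *arg) {
    demo_sock *s = arg;
    while (demo_pipe_remove(s) > 0) {
    }
    return NULL;
}
```

If the broadcast in demo_pipe_remove is dropped, or issued on a path that does not hold the lock, a thread in demo_sock_wait_empty can block even though npipes has already reached zero, which matches the symptom described above.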
created a new ticket for this #1827
when an aio has no a_cancel_fn and the task is in task_prep state abort it on nni_aio_stop call

fixes #1813 Deadlock during nng_close() - multi platform
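
The rule in the commit message can be illustrated with a toy model. This is not NNG's implementation: the struct, the field names (cancel_fn, prepped, stopped, result), the DEMO_ECANCELED code, and all demo_* functions are hypothetical and only mirror the idea. When an aio is stopped and no cancel handler is registered, nothing else will ever complete a prepared task, so the stop path has to abort it itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_ECANCELED 1 // hypothetical error code, not an NNG constant

typedef struct demo_aio demo_aio;
typedef void (*demo_cancel_fn)(demo_aio *, int);

struct demo_aio {
    demo_cancel_fn cancel_fn; // NULL if no cancel handler was registered
    bool           prepped;   // task prepared, completion still pending
    bool           stopped;   // no new operations may start
    int            result;    // 0 until the operation finishes
};

// Finish the operation with the given result; a waiter may now return.
void demo_aio_finish(demo_aio *aio, int rv) {
    aio->result  = rv;
    aio->prepped = false;
}

// Example cancel handler that an operation might register.
void demo_cancel_handler(demo_aio *aio, int rv) {
    demo_aio_finish(aio, rv);
}

// True while a waiter would still block on this aio.
bool demo_aio_busy(const demo_aio *aio) {
    return aio->prepped;
}

// Stop the aio. With a handler, delegate the cancellation to it. Without
// one, abort a prepared task directly so that nobody waits on it forever.
void demo_aio_stop(demo_aio *aio) {
    assert(aio != NULL);
    aio->stopped = true;
    if (aio->cancel_fn != NULL) {
        aio->cancel_fn(aio, DEMO_ECANCELED);
    } else if (aio->prepped) {
        demo_aio_finish(aio, DEMO_ECANCELED);
    }
}
```

In this model, leaving out the else branch reproduces the deadlock shape: demo_aio_busy stays true after demo_aio_stop, so a close path that waits for the aio to go idle would never return.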