Closed stevenengler closed 3 years ago
There is a complication here because there are some sockets can still be read after they are closed, for example, if they still have data to read sitting in the buffer.
I just found that if an fd is first closed, and then epoll_ctl(EPOLL_CTL_DEL)
is called on that fd, Shadow thinks that the fd is invalid and will not tell epoll to remove the fd from the watching table.
I just found that if an fd is first closed, and then
epoll_ctl(EPOLL_CTL_DEL)
is called on that fd, Shadow thinks that the fd is invalid and will not tell epoll to remove the fd from the watching table.
I think that's the standard Linux behaviour. You need to EPOLL_CTL_DEL
before closing the fd.
https://idea.popcount.org/2017-03-20-epoll-is-fundamentally-broken-22/
Cool. In my current debug session, it looks like Tor (or one of its dependencies) is doing things in the incorrect order (close()
and then EPOLL_CTL_DEL
).
Cool. In my current debug session, it looks like Tor (or one of its dependencies) is doing things in the incorrect order (
close()
and thenEPOLL_CTL_DEL
).
That might be this:
https://github.com/libevent/libevent/blob/852af060af7c7e439f0db97829ec74bedeba64d5/epoll.c#L375-L386
Looks like we can just add an if statement after the child is removed from the ready table here, to then also check if the child status is CLOSED to then also remove it from the waiting table too.
We probably also want to create a test case.
@robgjansen There's a comment in Shadow that refers to this issue:
Do you remember why this was added (in #1169)? This issue is closed now, so maybe we can remove this block? Why is this only for legacy descriptors, and what's special about EPOLL_CTL_DEL
? I don't think it should be possible to have a file handle (an fd) to a closed file object?
It looks like the && op != EPOLL_CTL_DEL
part was added because it was causing errors in libevent/tor, but that should no longer be a problem anymore now that #1101 is fixed. So I think we should be able to remove the && op != EPOLL_CTL_DEL
part.
I'm not sure if we can delete the entire if {}
block. Don't we still want to call _syscallhandler_validateLegacyFile
on the child fd? Or do you think that block is dead code?
I'm not sure if we can delete the entire
if {}
block. Don't we still want to call_syscallhandler_validateLegacyFile
on the child fd? Or do you think that block is dead code?
I thought it would be dead code, but it looks like TCP sockets set the CLOSED state themselves even if there are still file handles. So I don't think we can get rid of this block until we rewrite the TCP code.
When a descriptor/file that an epoll object is watching closes, the epoll object doesn't remove it from its
watching
table. If the descriptor isn't removed usingEPOLL_CTL_DEL
, then the epoll object will never unref that descriptor/file and it will remain in thewatching
table until the epoll descriptor itself is closed.I think when
_epoll_descriptorStatusChanged()
runs, it should check if the descriptor/file has been closed (theEpollWatch
flagEWF_CLOSED
is set) and if so remove it from thewatching
table.