Closed mkataja closed 8 years ago
This sounds like an error in the BSD sockets implementation. What is the error that happens down the line? How do other socket-based libraries deal with this situation?
I'm closing this for now, but feel free to revive the discussion.
I could finally reproduce this. I'm running commands in threads, and it seems I managed to create a race condition by calling jump_server on SingleServerIRCBot that way (I simply hadn't paid attention to whether SingleServerIRCBot is thread-aware or not). I suspect the socket might be in the process of closing (due to jump_server) in a worker thread while the main thread enters process_once and sees that socket while it's in a bad state.
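If the race described above is the cause, one way to avoid it is to never touch the connection from worker threads and instead hand the call over to the thread that runs process_once. A minimal sketch, assuming the main loop can be edited; `pending`, `request_from_worker` and `run_pending` are hypothetical names of mine, not part of the irc library:

```python
import queue

# Calls requested by worker threads, to be executed on the main thread.
pending = queue.Queue()


def request_from_worker(func, *args):
    """Called from any thread: schedule func(*args) instead of running it."""
    pending.put((func, args))


def run_pending():
    """Called from the main thread only: run every queued call.

    Returns the number of calls that were executed.
    """
    ran = 0
    while True:
        try:
            func, args = pending.get_nowait()
        except queue.Empty:
            return ran
        func(*args)
        ran += 1
```

A worker would then call something like request_from_worker(bot.jump_server), and the main loop would call run_pending() right before each process_once, so the socket is only ever closed and polled from one thread.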
Unfortunately I can't tell if this is the full story since I don't have logs from when I originally encountered this issue.
The sockets property sometimes returns sockets that have fileno < 0, which causes an error down the line. Since this doesn't happen all that often, I've not been able to debug the issue and its cause. I originally found the issue more than a year ago, so it's possible it has been patched since, but I didn't find anything relevant in the changelogs.
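For reference, a negative fileno is what a Python socket reports once it has been closed. The snippet below shows that, and shows one error it can cause later, assuming the "error down the line" comes from passing the socket to select.select (an assumption on my part; the original traceback isn't available):

```python
import select
import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.close()

# A closed socket object reports -1 as its file descriptor.
closed_fileno = s.fileno()

# select refuses negative file descriptors and raises ValueError.
try:
    select.select([s], [], [], 0)
    select_error = None
except ValueError as exc:
    select_error = exc
```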
While I've successfully monkey-patched this in my client by only returning sockets with a non-negative fileno, the root cause obviously lies somewhere else (closed connections not getting cleaned up for some reason?).
My trivial patch for reference:
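The patch itself isn't included in this text. What follows is only a sketch of the approach described above (keeping sockets with a non-negative fileno), not the original code; `usable_sockets` is a hypothetical helper name, and how it gets wired into the sockets property is left open:

```python
import socket


def usable_sockets(sockets):
    """Return only the sockets whose file descriptor is still valid."""
    return [s for s in sockets if s.fileno() >= 0]


# Small demonstration with one open and one closed socket.
open_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
closed_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
closed_sock.close()

remaining = usable_sockets([open_sock, closed_sock])
open_sock.close()
```

This only hides the symptom: a socket can still be closed by another thread between the filter and the call that uses it.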